For over three years now, I’ve been getting increasingly involved with research projects that deal with the online far right in one way or another. One of the most interesting ways that I’ve developed as a researcher during this time has been having to think through my commitments to research ethics in greater detail. Because my research typically focuses on public social media data, I am rarely required to obtain informed consent from those whom I study.
I hadn’t realized so many academics were working with data brokers. It’s kind of scary! The EFF makes some good points here about so-called “data for good,” and rightly brings up that ethics review boards should be thinking about this sort of thing.
link to ‘Bad Data “For Good”: How Data Brokers Try to Hide in Academic Research | Electronic Frontier Foundation’
I got word that a recent article of mine has now been published in an issue of Learning, Media, and Technology. It has actually been available online first for the past ten months, but since I haven’t been good about blogging about recent publications, I figured this was as good a chance as any to write a post about it. This piece is called “Lifting the Veil on TeachersPayTeachers.com: An Investigation of Educational Marketplace Offerings and Downloads” and is a collaboration with Catharyn Shelton, Matt Koehler, and Jeff Carpenter.
What is the most soothing form of digital data collection, and why is it forum scraping?
Just because you can topic model something doesn’t mean it actually tells us anything (and please don’t ever describe computational text analysis as “objective”).
One of those afternoons where I’m auditing someone’s analysis code, but it’s an analysis of 4M rows of data, so I’m also doing spurts of grading while I wait for code to execute.
35 GB of data is a lot to begin with, but when it’s 35 GB of CSVs? That’s when it starts to really register.
I got a reminder today that I do the kind of research where something as hilariously unintuitive as telling a program to treat long numbers as “words made up of 0-9” is actually a critical step in making sure you get the right results.
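For anyone wondering why that matters, here is a minimal sketch in Python. It assumes the "long numbers" are something like post IDs, which is my guess for illustration and not something the note above spells out. The two 19-digit IDs, the column names, and the `read_ids` helper are all made up. The idea is that a tool which guesses "this column is numeric" and stores it as a 64-bit float can quietly turn two different IDs into the same value, while reading the column as text keeps every digit.

```python
import csv
import io

# Hypothetical data: both IDs are invented for this example and differ
# only in their final digit.
RAW = (
    "post_id,likes\n"
    "1152921504606846977,3\n"
    "1152921504606846978,5\n"
)


def read_ids(text, as_text):
    """Read the post_id column either as text or as 64-bit floats."""
    rows = csv.DictReader(io.StringIO(text))
    if as_text:
        # Treat each ID as a "word made up of 0-9": no digits are lost.
        return [row["post_id"] for row in rows]
    # Mimic a tool that infers a numeric column and stores it as a float,
    # which has only about 15-17 significant decimal digits of precision.
    return [float(row["post_id"]) for row in rows]


ids_as_numbers = read_ids(RAW, as_text=False)
ids_as_text = read_ids(RAW, as_text=True)

# The two distinct IDs collapse into the same float...
print(ids_as_numbers[0] == ids_as_numbers[1])  # True
# ...but stay distinct when kept as strings.
print(ids_as_text[0] == ids_as_text[1])  # False
```

Once IDs collide like this, joins, deduplication, and counts can all go wrong without any error message, which is why the string treatment is worth the awkwardness.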