Below are posts associated with the “academic labor” tag.
why I think labor, not copyright, is the foundational problem with AI scrapers
This morning on Bluesky, I saw some posts about a class action lawsuit against Anthropic for their use of pirated, copyrighted materials in training their generative AI models. One of the sources of these copyrighted materials was the LibGen database, which I took a peek at nearly six months ago to confirm what I was already sure was true: that my scientific writing was also collected as training material by companies like Anthropic or Meta. I don’t love that big tech companies are profiting off of my work in this way, and I’m sympathetic to the authors who are taking legal action against Anthropic. However, as I’ve written repeatedly over the past few years (you can find some of those thoughts, and others, by scrolling through here), I don’t know that copyright is the right way of responding to this kind of abuse.
thoughts on academic labor, digital labor, intellectual property, and generative AI
Thanks to this article from The Atlantic that I saw on Bluesky, I’ve been able to confirm something that I’ve long assumed to be the case: that my creative and scholarly work is being used to train generative AI tools. More specifically, I used the searchable database embedded in the article to search for myself and find that at least eight of my articles (plus two corrections) are available in the LibGen pirate library—which means that they were almost certainly used by Meta to train their Llama LLM.
🔗 linkblog: 'More academic publishers are doing AI deals'
I keep thinking about the similarity of exploitation of academic labor by publishers to the exploitation of everyone’s labor by AI companies, and stories like this just make it more clear.