digital labor and generative AI: what Stack Overflow CEO Prashanth Chandrasekar gets wrong
This morning, while getting ready for the day, I spent some time catching up on podcasts, including Nilay Patel’s interview of Stack Overflow CEO Prashanth Chandrasekar on a recent episode of Decoder (a podcast I’ve spent a lot more time listening to since it went ad-free for subscribers). I ditched the Stack Exchange network a year and a half ago over digital labor concerns—I was literally being prevented from deleting my own content from the site, which is bonkers—and I’m honestly not sure why I bookmarked the interview a few days ago. I think it was more than a hate listen, though: For all of my own feelings about generative AI, I make an effort to be open-minded, and I was interested in the headline for the interview: “Stack Overflow users don’t trust AI. They’re using it anyway.”
One exchange between Patel and Chandrasekar stood out to me, though, and not necessarily in a good way. I was pleased that Patel pushed Chandrasekar on the question of digital labor, but the CEO’s response really rubbed me the wrong way. Here’s what Patel had to say:
If I am somebody in your 1 percent who spends a lot of time on Stack Overflow helping other people. The reason I answer questions for free on your platform, which you monetize in lots of ways, is because I can directly see that my effort helps other people grow and that I’m helping other people solve problems. That is one very self-contained dynamic. The last time you were on the show, our entire conversation was about that dynamic and how you got people to participate in that dynamic and the value of it.
Then suddenly, there’s a very clear economic benefit to the company that owns the database because it’s selling my effort to OpenAI, which is happening across the board. It’s going to do these data licensing deals with all these AI providers, they’re going to train on the answers that I have painstakingly entered into this database to help other people, and now the next generation of software engineers is going to get auto-complete that’s based on my work and I’ve gotten nothing. I’ve heard that from lots and lots of people. I’ve heard that in our own community, and I think I have felt that as various media companies have made these deals.
How do you respond to that? Because it feels like you were providing a database that you had to monetize in some ways, but the interaction people had was the value, and now there’s another kind of economic value that is maybe overshadowing, recasting, or re-characterizing the interaction that people have.
Here, then, is how Chandrasekar responds; it’s an answer smooth enough that it almost carries weight if you don’t think about it any further:
There are a couple of points there. One is about this company’s original DNA and why people came together to do this thing. When I joined the company, I asked a question like, “What’s people’s incentive to spend time doing this?” I asked the founders, specifically [co-founder] Joel Spolsky, about this. His point was that the software development community is very altruistic. People just want to help each other out because people understand how frustrating… I used to write code many years ago. I recently picked it back up with some of the code-generation tools, which is interesting to compare and contrast. I just remember how frustrating it was if you got stuck on something. Stack was a huge boon when it was created to unlock this. It was truly out of that. That was the reason.
Even before ChatGPT, we also asked the question, “Should we incentivize users by paying them? Should we give them a monetary benefit?” That wasn’t a high ask by a user base. We went and researched people. People were not in for the money. Plus, it complicates things because how do you judge the payment for a particular JavaScript question relative to a particular Python question? It goes down a rabbit hole, which is untenable. So that’s one. What was the original reason people got together; it was about the mission.
Here’s the thing, though: Not wanting to get paid yourself is not blanket permission for other people to profit off of your work. I feel this most pressingly as an academic (which is why I’m writing this on my workblog instead of one of my other subblogs). I don’t necessarily feel the need to get royalties every time someone accesses, reads, or cites one of my scholarly publications. I need a paycheck, sure, but I get a paycheck from the University of Kentucky for being a researcher, and I think the model of being paid by public tax dollars (and a crapton of student tuition, but that’s another problem for another post) to generate knowledge and release it freely into the public domain is a good one! (I also acknowledge that this answer might be more complicated for an independent researcher or a scholar in a more tenuous professional position).
Here’s the other thing, though: If the ideal of my salary coming from public tax dollars to do this work doesn’t quite hold up, the idea that I release the knowledge I generate into the public domain is straight-up laughable. I would be happy—ecstatic, even—to release my publications into the world and never get paid for them, but that’s not how it works. Instead, academic publishing companies claim my labor and the labor of peer reviewers, require me to sign over copyright (or at least some kind of exclusivity claim), and then proceed to charge my own university for access to my work—often at a high profit margin. This strikes me as unacceptable—and that’s before we get to the question of generative AI companies pirating my work from those publishers to profit from my labor in their own ways.
So, contrary to what Chandrasekar argues, there’s a difference between wanting to get paid for one’s work and objecting to others’ profiting off of that work. That one does not need to be paid for one’s labor does not mean that one consents to others’ profiting off of that same labor. If I were better read on the gift economy, I’m sure I could draw some insightful parallels there, but let me instead use an analogy that I think gets at what I’m trying to say. Let’s imagine that after some kind of natural disaster, Aïcha is in a position of relative privilege and is giving away water bottles and canned goods for free, no questions asked, at a table she’s set up in a public park. Bertrand comes up to Aïcha, claims an armful of water and food, thanks her, and walks away. Bertrand then sets up his own table at a different public park and proceeds to sell the goods he’s received from Aïcha to passersby.
When Aïcha discovers this and gets upset at Bertrand, the problem is not necessarily that she did not get paid for the water and food she distributed. She was happy to give the goods away, and she might even feel that it isn’t right for her to accept money from Bertrand (or anyone else) for what she is doing. The problem is that Bertrand has violated the social contract Aïcha believed everyone was operating under—he sought to profit off of others’ generosity. When explained in these terms, I don’t think Chandrasekar’s answer actually addresses Patel’s question (though perhaps Patel could have asked a slightly different question, too).
There’s more I could write about this interview and which parts of which answers I objected to, but this point strikes me as parallel to the “copyright vs. digital labor” issue that I’ve been trying to tease out in recent posts, and I thought it was worth making. My objection (and, presumably, the objection of many Stack Overflow users) to my labor being used to train generative AI is not that I am not being paid for it—it is that others are finding ways to get paid off of it. Chandrasekar’s defense of the second by way of pointing out the first strikes me as an incomplete answer that does not actually address the problem.
- Stack Overflow
- Stack Exchange
- The Verge
- Nilay Patel
- Decoder
- digital labor
- AI
- generative AI
- Prashanth Chandrasekar
- publishing
- academic publishing
- gift economy
- copyright
- intellectual property
similar posts:
what is the correct monkey paw threshold?
I have lots of concerns about LLM training, but I think it’s better to think of the issue in terms of digital labor, not copyright. My blog is licensed for reuse, but that doesn’t mean it’s any less exploitative for someone to scrape it all to develop software that will make them rich off my work.
why I think labor, not copyright, is the foundational problem with AI scrapers
I don’t think copyright is the best argument against generative AI (strengthening copyright law will benefit big companies more than small creators), but “can’t make an AI omelette without breaking a few copyrighted eggs 🤷🤷🤷” is still a depressingly cynical national policy.
🔗 linkblog: OpenAI's viral Studio Ghibli moment highlights AI copyright concerns | TechCrunch