I’ve said it before, and I’ll say it again: Nothing reminds me as much of teaching French as does teaching programming. It takes a lot of the same metacognition to learn both, and it’s really hard to teach that metacognition.

new edition of my remixed data science textbook

I’m happy to share that the Fall 2023 edition of my remixed Introduction to Data Science textbook is now available on my website. This book adapts material from the “ModernDive” Statistical Inference via Data Science course, Catherine D’Ignazio and Lauren Klein’s excellent Data Feminism, a number of other Creative Commons-licensed works, and some of my own contributions to put together a no-cost, openly-licensed textbook for my data science students. I put together the first edition of this book for last Fall’s version of this course, but the first run-through taught me a lot, and I’m very happy about this edition (though I do have a small laundry list of errors to fix, and I’d like to eventually get into some fiddlier bits like removing social media icons from the header).

🔗 linkblog: my thoughts on 'A jargon-free explanation of how AI large language models work | Ars Technica'

Haven’t read this yet, but I’m bookmarking for my classes. link to ‘A jargon-free explanation of how AI large language models work | Ars Technica’

🔗 linkblog: my thoughts on 'Pluralistic: The surprising truth about data-driven dictatorships (26 July 2023) – Pluralistic: Daily links from Cory Doctorow'

Interesting stuff from Doctorow. If I can, I want to work it into my data science textbook for next semester. link to ‘Pluralistic: The surprising truth about data-driven dictatorships (26 July 2023) – Pluralistic: Daily links from Cory Doctorow’

draft syllabus statement on code, plagiarism, and generative AI

I’m spending a chunk of today starting on revisions to my Intro to Data Science course for my unit’s LIS and ICT graduate programs. I’d expected to spend most of the time shuffling around the content and assessment for particular weeks, but I quickly realized that I was going to need to update what I had to say in the syllabus about plagiarism and academic offenses. Last year’s offering of the course involved a case of potential plagiarism, so I wanted to include more explicit instruction that encourages students to borrow code while making it clear that there are right and wrong ways of doing so.

Generally, I discourage my intro to data science students from tackling questions they can’t answer at their level of programming, but sometimes I get so interested in the question that I end up writing the code for them so I can see what they do with it.

🔗 linkblog: my thoughts on 'Too much trust in machine translation could have deadly consequences.'

This article provides good examples of how the efficacy and efficiency of a given technology are often less important than deeper questions of reliance and roles. link to ‘Too much trust in machine translation could have deadly consequences.’

ClassDojo and 'data as oil'

The new semester at the University of Kentucky starts on Monday, and I am flailing to try to get my data science course ready to go—including putting together an open, alternative textbook for my students. I’ve been borrowing heavily from Catherine D’Ignazio and Lauren Klein’s Data Feminism for my textbook: It’s a fantastic resource, and I’m hoping my students take a lot from it. Of course, my kid’s semester has already started, and I’ve already blogged a bunch about my frustrations with her new school’s use of ClassDojo this year.

Really leaning into ethics and justice elements of data science in my fall class, and I’m wondering how much pushback I’m going to get. I’ve taught about racism, sexism, and colonization in games in another class with very few complaints, but this feels different somehow.

why 'open access' isn't enough

I just barely microblogged something about what I want to say here, but over the past hour, it’s been nagging at me more and more, and I want to write some more about it. I was introduced to academia through educational technology, and I was introduced to educational technology through a class at BYU taught by David Wiley. This class was not about educational technology, but David’s passion for Web 2.

This summer, I’m remixing an alternative textbook for my Fall intro to data science class, and I’m pleasantly surprised by how helpful Creative Commons-licensed journal articles are proving. Shows that “open access” is only part of the license’s benefits.

One of my data science students just did a t-test to demonstrate that evil-aligned monsters in D&D 5e tend to have lower Armor Class than good-aligned monsters. This course demands a lot of effort, but moments like this make it worth it.

Teaching R for the first time, and many students are first-time programmers. I’m reminded of teaching French in terms of how easy it is to take for granted things that aren’t obvious to beginners.

Unsatisfied with the Intro to Data Science textbook I’ve inherited. Fortunately, an earlier version is Creative Commons-licensed, as are some other fantastic resources. Guess who’s going to remix himself a new textbook for next Fall!

I know I’m going to make plenty of mistakes teaching Intro to Data Science for the first time, but one thing I’m already proud of is teaching my students to use tags to format code and output in their Canvas posts.