Ten simple rules for teaching data science
A new paper from Tiffany Timbers (UBC) and Mine Çetinkaya-Rundel (Duke), and lessons I learned from my time with Software Carpentry
Way back in 2013 I joined Software Caprentry as an instructor. SWC teaches scientists foundational data science skills. Meta: SWC also teaches scientists how to teach other scientists foundational data science skills. I carried the teaching philosophy I learned from SWC forward in several graduate courses and workshops I used to teach, which eventually made its way into a short textbook I wrote for this series.
I’ve thought a lot about what “teaching data science” actually means when your learners are scientists. Not future software engineers, not aspiring statisticians, but working biologists, clinicians, and data-adjacent researchers who mostly want to answer a concrete question and get back to their bench or their patients.
It’s for these reasons I enjoyed the new preprint from Tiffany Timbers (UBC) and Mine Çetinkaya-Rundel (Duke): “Ten simple rules for teaching data science.” It’s a compact statement of many things I’ve learned the hard way over the years, plus a few framing moves that are easy to forget when you’re deep in the weeds of course logistics.
Timbers, T. A., & Çetinkaya-Rundel, M. (2026). Ten simple rules for teaching data science (arXiv:2602.02874). arXiv. https://doi.org/10.48550/arXiv.2602.02874
Teach data science by doing data analysis
Use participatory live coding
Give tons of practice and timely feedback
Use tractable or toy data examples
Use real and rich, but accessible data sets
Provide cultural and historical context
Build a safe, inclusive, and welcoming community
Use checklists to focus and facilitate peer learning
Teach students to work collaboratively
Have students do projects
If you’ve taught a Carpentries workshop or any kind of hands-on data science course or workshop, you know the vibe: live code, narrate your thinking, make mistakes on purpose (or at least don’t hide them), and normalize debugging as part of the craft. The paper explicitly advocates participatory live coding, and the rationale matches what I liked to do in my own courses. Students don’t just need “the right code,” they need to watch an expert steer through the messy middle. Watching me bang my head against the wall finding the missing parenthesis or navigate some dependency hell in real time is often more valuable than seeing a polished slide deck run flawlessly.
Another Carpentries lesson I still carry is the importance of doing something real on day one. Timbers and Çetinkaya-Rundel’s first rule is teaching data science by doing data analysis right away, not after weeks of prep/background material. Give learners a small end-to-end win early, and then use their curiosity and mild discomfort to pull them toward types, functions, data structures, and workflows. In practice, it changes the classroom dynamic. Instead of “trust me, you’ll need this later,” it becomes “you just hit a snag, and here’s the tool that solves it.”
On practice and feedback, the paper argues that we typically underestimate how much repetition students need. Reading data from a file once is not enough. Students should encounter variations of the same task across worksheets, labs, and quizzes, paired with immediate automated feedback through tools like learnr or NBgrader.

One rule the authors say they would promote to first position: building psychological safety. Students who fear looking dumb will not ask questions. The Carpentries’ code of conduct, which I became familiar with through instructor training, provides an excellent template.
It’s a good paper, and it’s a quick read. Check it out.
Timbers, T. A., & Çetinkaya-Rundel, M. (2026). Ten simple rules for teaching data science (arXiv:2602.02874). arXiv. https://doi.org/10.48550/arXiv.2602.02874

Not only for teaching - these rules apply to data science teams of all sorts (many of these are part of the culture of my old team in pharma research)