I wrote my first public blog post in 2009. I started Getting Genetics Done to share what I was learning, from the end of my PhD/postdoc through my first few years as faculty. The earliest posts ranged from simple how-tos, such as how to write and run a Perl script, to bigger topics like why it’s usually a bad idea to categorize continuous variables in a linear model. From my days as an educator and as a Software Carpentry instructor, I learned what anyone who’s done any teaching knows: teaching something well requires a deep understanding of the topic, and often reveals how much you don’t know about what you’re teaching. Writing a blog, like teaching, is great at revealing the gaps in your knowledge in a way that the regular grind does not. And it’s a great way to learn.
As I progressed in my career I wrote less and less often. It would be easy for me to chalk this up to a lack of time (not untrue), or Twitter having made me a lazy writer (also not untrue). However, I think the biggest thing stopping me from writing more was the ego barrier. I’m a professor… I’m the director of the bioinformatics core… I’m the Principal Investigator… I shouldn’t be this senior in my career and still not know how to do a left join on a subquery in sqlite or whatever. And writing about it, no less! My last post on GGD was back in 2017 (and is, ironically, a woefully out of date guide to staying current in bioinformatics and genomics). Perfectionism leads to procrastination leads to paralysis. After you leave something for long enough it becomes nearly impossible to pick it back up again. I didn’t do any writing/blogging for 7 years until I started this newsletter a few months ago.
I recently heard something on this topic that really resonated with me. Simon Willison, creator of Datasette and co-creator of Django, is a veteran software developer with 25 years of experience, and is now internet famous for his thoughtful and regular blog posts and tweets on LLMs, genAI in programming, Python packaging, SQLite, data journalism, and many other topics. He was recently a guest on the Software Misadventures Podcast (Apple, Spotify, Overcast, and video version on YouTube). This is such a great conversation and I’d highly recommend listening to the whole episode. Really, bookmark it now; it’s 100% worth your time.
Early in the podcast he talks about blogging as an accountability mechanism. Then, starting around the 9:24 mark, Simon talks about why he writes regular TILs (“Today I Learned…”). As he describes it, TILs are the most low-stakes form of writing that he does. Rather than requiring you to create something new (without AI slop), a TIL has only one requirement: did you learn something today (or recently)? He uses the hypothetical example of being an elite software developer for 25 years and being proud of only just learning how to write a for loop in bash. Here’s the ~1 minute audio clip, with an approximate transcription of the relevant section below.
I've got like 25 years of software engineering experience. I feel like it's important to outwardly demonstrate that when you've got 25 years of experience, it's still worth celebrating learning for loops in bash, right? There's that pattern people get into where they don't want to admit that they only just learned how to do something. It's sort of a shame that I didn't know how to do for loops in bash. I like using my reputation to broadcast out, [to] be proud of that. Right. You figured out for loops in bash. Fantastic. There's a million other things still to learn about everything involving computers, right? It’s no biggie that you didn’t know that already.
Someone else I admire who does this well is Ming (Tommy) Tang. Tommy has a long list of posts on his blog going back years and scores of videos on his Chatomics YouTube channel. His posts and videos cover RNA-seq analysis and single cell analysis, and range from specific in-the-weeds topics like the OOP system in R or processing GEO data with Salmon to high-level topics like building a Bioinformatics CV (I strongly second the advice of putting your GitHub link front and center) or a video dedicated to books he found useful for learning bioinformatics (I also second his recommendation of Vince Buffalo’s Bioinformatics Data Skills). Tommy also has a book (From Cell Line to Command Line: How you can transform from a wet biologist to a computational biologist) that you can read about on his website, and he co-hosts the Single Cell World podcast.
I’ve been writing code in bioinformatics for 20 years, and I still find myself searching Stack Overflow or asking an LLM for help with basic syntax on things as simple as an if/else block in Bash or a groupTuple in Nextflow. Perhaps when I actually take the 10 minutes to grok the difference between class methods and instance methods in Python I’ll drop the ego and celebrate it here, decades of experience notwithstanding. I’m hoping to take a little inspiration from Simon, Tommy, and others, and use this space for learning in public a little more often. Even if it’s as elementary as a for loop in Bash.