Ok, so why *aren't* Jupyter notebooks a solution?! I was gonna suggest that too...

aluren · on Nov 3, 2018

They are so unwieldy. JSON formatting means the ipynb format is a huge mess to handle by simple means, so you have to open a browser window to do anything. Loading, editing, formatting, saving is just... ugh. You can't do proper diffs (though some tools are trying to alleviate that). Loading, stopping and restarting kernels is excruciatingly slow. Because cell execution isn't necessarily done in order, it becomes very, very easy to lose yourself and feel like you're back in the '70s experiencing GOTOs and the joys of spaghetti code. And that's from my point of view, with a technical background. Imagine explaining notebooks to a biologist to whom writing any line of code is still something new and daunting. Imagine their reaction when they happen to run one cell out of order and alter the entire subsequent workflow, try to grok some of the magic %commands, or get some cryptic error message due to some config file not having the proper rights because they installed a library with 'pip install --user' and not 'sudo pip install'. It's not realistic to think that an entire community of people who aren't technically minded, many of whom actually loathe or fear anything looking like code, is going to adopt a tool that even technically minded people struggle to use the way it's intended.
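
To make the JSON and execution-order complaints concrete, here is a minimal sketch using only the Python standard library. It assumes the version 4 notebook layout (a top-level "cells" list whose code cells carry "source" and "execution_count"); the helper names and the tiny demo notebook are made up for illustration, and this is not meant as a replacement for the dedicated diff tools.

```python
import json


def code_cells(nb):
    """Yield (execution_count, source) for each code cell of a notebook dict."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines.
        if isinstance(src, list):
            src = "".join(src)
        yield cell.get("execution_count"), src


def to_script(nb):
    """Flatten the code cells into plain text that ordinary diff tools can handle."""
    return "\n\n".join(src.rstrip() for _, src in code_cells(nb)) + "\n"


def ran_in_order(nb):
    """True if the recorded execution counts increase from top to bottom."""
    counts = [count for count, _ in code_cells(nb) if count is not None]
    return counts == sorted(counts)


# Hypothetical notebook: the second code cell was executed before the first.
demo_text = json.dumps({
    "nbformat": 4,
    "cells": [
        {"cell_type": "markdown", "source": ["# title"]},
        {"cell_type": "code", "execution_count": 2,
         "source": ["x = 1\n"], "outputs": []},
        {"cell_type": "code", "execution_count": 1,
         "source": "print(x)", "outputs": []},
    ],
})

demo = json.loads(demo_text)
script = to_script(demo)
in_order = ran_in_order(demo)
```

For a real file you would swap the demo for `json.load(open(path))`. Even then, this only tells you the cells were run out of order, not what state the kernel was actually in, which is sort of the point.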

On top of that, many of the steps needed to reproduce a pipeline typically have to load enormous datasets. Terabytes of simulated protein structures, hundreds of gigabytes of sequencing reads, phylogenetic trees, alignment files, what have you. Once you somehow acquire that dataset, you need the appropriate tools, many of which need to be specifically compiled for your platform, then you need to run them on a powerful machine if you don't want the pipeline to take months to complete, etc. (And that's if you didn't use any proprietary software or any GUI-based application with no command line interface.) You can't exactly load all of that into GitHub+MyBinder and call it a day. You can't ask a community of people who don't like coding to learn about Docker containers either.

Nevertheless, we (at our lab) do use notebooks when we can because we know they're fashionable. We can only present parts of our results due to the aforementioned constraints, but notebooks still look pretty and people like them, so we write short demos using them.