Spot on! Had these same thoughts trying to write my own mini data processing workflow. I think every scientist faces it at some point if they do any more complex experiments. Most, like my previous advisor, solved it largely through simple Excel sheets and discipline. Solving it with code becomes about as comprehensible as an ad hoc build system.
Out of curiosity, does your UI allow any form of "simulation", particularly random fuzzing or Monte Carlo trials? Having the ability to simulate how a complex rule set plays out can help weed out errors before starting real experiments.
P.S. I admire the work y'all are doing at Benchling! Bio-sciences are surprisingly anachronistic when it comes to performing research.
> Out of curiosity, does your UI allow any form of "simulation", particularly random fuzzing or Monte Carlo trials?
No, at the moment we don't really have enough structured knowledge of the underlying steps to be able to do predictions/simulations like that. We're taking existing scientific workflows and adding structure to them, but there are still plenty of details that are unstructured and performed manually by scientists, and there's a lot of variability across use cases.
Monte Carlo trials certainly seem interesting, though, and hopefully something we can explore more as the procedures get more structured. A similar idea that's a bit closer in reach is to do an analysis on past variations of the same experiment to predict how a newly designed experiment will go. Possibly we could do a simulation based on historical data like that, although modeling it correctly certainly sounds tricky.
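To make the historical-data idea a bit more concrete, here is a rough sketch of what such a simulation might look like, not a description of anything we have built. The step names and outcome records are made up for illustration, and it assumes each step succeeds or fails independently of the others, which real experiments almost certainly violate.

```python
import random

# Hypothetical past outcomes per step (True = the step succeeded).
# In practice these would come from structured historical records.
HISTORY = {
    "transfection": [True, True, False, True, True, True, False, True],
    "culture": [True, True, True, True, False, True, True, True],
    "assay": [True, False, True, True, True, True, True, False],
}


def simulate_run(steps, history, rng):
    """Simulate one run by drawing each step's outcome from its past outcomes.

    Assumes steps are independent; the run succeeds only if every step does.
    """
    return all(rng.choice(history[step]) for step in steps)


def estimate_success_rate(steps, history, trials=10_000, seed=0):
    """Monte Carlo estimate of how often a planned sequence of steps completes."""
    rng = random.Random(seed)
    successes = sum(simulate_run(steps, history, rng) for _ in range(trials))
    return successes / trials


if __name__ == "__main__":
    plan = ["transfection", "culture", "assay"]
    print(f"Estimated success rate: {estimate_success_rate(plan, HISTORY):.3f}")
```

The tricky modeling part is exactly what this sketch skips: dependencies between steps, and how much a new design resembles the past runs it is being compared against.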
> We're taking existing scientific workflows and adding structure to them, but there are still plenty of details that are unstructured and performed manually by scientists, and there's a lot of variability across use cases.
That makes sense. It’s difficult to start adding structure to biological sciences. And to the field’s credit, I don’t think it’s for lack of ability or effort, but because of the true complexity of dealing with biological systems. In contrast, even quantum mechanics is seemingly simple!
> Possibly we could do a simulation based on historical data like that, although modeling it correctly certainly sounds tricky.
True, seems like you will have a very valuable historical dataset though. I’ve been pondering this type of problem for a few years since my (abortive) attempt at a PhD in material sciences. For my project I made some decent progress using various Bayesian and bootstrapping models to work with uncertainty in my models and tissue samples. Bayesian approaches, especially combined with MCMC-type analysis, can yield a lot of fruit when dealing with biological systems, with minimal need to understand the underlying models. But as you mention, modeling the parameters correctly is challenging, and there’s a lacuna in the research on the topic. At the least, a system like yours could indicate things like the conditional probability of failure given certain sequence sets or combinations of experimental procedures (e.g. cell line A tends to do better when cultured in X media and tested at Y +/- Z hours). The article seems to indicate you’d have most of the data described in a format that’d be amenable to such data mining.
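As a minimal sketch of the conditional-failure idea: the records, cell line labels, and media names below are invented for illustration, and I'm using a plain percentile bootstrap rather than a full Bayesian/MCMC model, just to show how an uncertainty range could sit alongside the point estimate.

```python
import random

# Hypothetical historical records: (cell_line, media, failed)
RECORDS = [
    ("A", "X", False), ("A", "X", False), ("A", "X", True), ("A", "X", False),
    ("A", "W", True), ("A", "W", True), ("A", "W", False), ("A", "W", True),
    ("B", "X", True), ("B", "X", False),
]


def outcomes_for(records, cell_line, media):
    """Failure flags for every record matching the given condition."""
    return [failed for line, m, failed in records if line == cell_line and m == media]


def failure_rate(records, cell_line, media):
    """Empirical P(failure | cell_line, media), or None if there is no data."""
    outcomes = outcomes_for(records, cell_line, media)
    if not outcomes:
        return None
    return sum(outcomes) / len(outcomes)


def bootstrap_interval(records, cell_line, media, resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the conditional failure rate."""
    outcomes = outcomes_for(records, cell_line, media)
    if not outcomes:
        return None
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample the matching outcomes with replacement and recompute the rate.
    rates = sorted(sum(rng.choices(outcomes, k=n)) / n for _ in range(resamples))
    lower = rates[int((alpha / 2) * resamples)]
    upper = rates[int((1 - alpha / 2) * resamples) - 1]
    return lower, upper


if __name__ == "__main__":
    for media in ("X", "W"):
        rate = failure_rate(RECORDS, "A", media)
        interval = bootstrap_interval(RECORDS, "A", media)
        print(f"cell line A in {media}: failure rate {rate:.2f}, 95% interval {interval}")
```

With only a handful of runs per condition the interval will be very wide, which is arguably the useful part: it shows how little a few experiments actually tell you.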