Wednesday, July 27, 2022

Multicolor Plate Reader Fluorescence Calibration

Just out in OUP Synthetic Biology, "Multicolor Plate Reader Fluorescence Calibration" extends our prior work on calibrating green fluorescence and cell count to calibrate red and blue fluorescence as well. The results are no surprise (if we can use a green dye, we ought to be able to use other dyes too), but it's valuable to have specific recommendations for dyes to use and to have an interlab study validate that yes, they really do perform as well as the others. 

So everybody out there listening, please start using sulforhodamine-101 to calibrate your red fluorescence and Cascade Blue to calibrate your blue fluorescence! Everybody who uses your data will thank you for providing equivalent molecule/cell estimates rather than irreproducible arbitrary or relative units.
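For anyone wondering what that conversion looks like in practice, here is a minimal sketch. It assumes you have a dilution series of a calibrant dye with known amounts plus a blank well, and it fits a single conversion factor by least squares through the origin. The function names and all of the numbers are made up for illustration; this is not the exact protocol or analysis from the paper.

```python
def fit_units_per_au(known_amounts, readings, blank):
    """Least-squares slope through the origin: calibrant amount per arbitrary unit.

    Assumes the blank-subtracted readings are proportional to the amount of dye.
    """
    net = [r - blank for r in readings]
    return sum(a * n for a, n in zip(known_amounts, net)) / sum(n * n for n in net)


def to_calibrated(reading, blank, units_per_au):
    """Convert one raw plate reader value into calibrant-equivalent units."""
    return (reading - blank) * units_per_au


# Hypothetical dilution series (illustrative values, not real data).
amounts = [1.0e12, 5.0e11, 2.5e11, 1.25e11]   # dye molecules per well (assumed)
readings = [2050.0, 1050.0, 550.0, 300.0]     # raw arbitrary units
blank = 50.0                                  # reading from a dye-free well

factor = fit_units_per_au(amounts, readings, blank)
sample_estimate = to_calibrated(850.0, blank, factor)
print(f"{factor:.3g} molecules per a.u.; sample is about {sample_estimate:.3g} molecules")
```

The point is just that once you have the factor, every reading from that instrument and channel can be reported in units somebody else can compare against.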

Red and blue fluorescence calibrants were just as precise as the prior green and cell-count calibrants 

The paper also reports on some of the travails we ran into making the study work: some of the fluorescent proteins we wanted to try out didn't work in our hands, and there were miscellaneous other problems: a promoter sequence got messed up, some things wouldn't synthesize, one of the plasmids seemed problematic, and timing problems meant not all labs could run all constructs.

Problems like that are frustrating, but ultimately I'm happier reporting them than burying them. Remember: if you read a synthetic biology study with lab work and it doesn't talk about failures, it just means they either aren't aware of them or else they've pruned them from the narrative!  Calibration methods like these help us see better when things go wrong and understand what's happened.

Thursday, July 14, 2022

Studying Pathogens Degrades BLAST-based Pathogen Identification

Using the BLAST algorithm to search the NCBI databases is the typical way one goes about identifying a DNA sequence, so it's been the typical way biosecurity systems decide if something is potentially a dangerous pathogen or toxin too. Problem is, that's not what BLAST and those databases were designed for, and we've observed that they aren't working as well for that purpose as they used to, as we report in our new preprint: "Studying Pathogens Degrades BLAST-based Pathogen Identification"

Specifically, we've found an inherent problem that is growing in seriousness due to a non-obvious emergent dynamic. Now that sequencing and bioengineering tools are getting much more accessible, lots of sequences are being studied by modifying them with "tool" sequences like purification tags, fluorescent proteins, stabilizing sequences, etc. Those sequences get (appropriately) classified based on what's being studied, and now you've got chimeric material that includes both the subject of study and the bioengineering tool. Then when you run BLAST on a sequence with that tool, you start finding that tools are classified as what they're used to study.

Example of BLAST classification failure: using a purification tag to study an Ebola protein means that now a fluorescent protein plus a purification tag gets mis-identified as Ebola.
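Here is a toy sketch of that dynamic. It is not BLAST: it just counts shared 8-mers and returns the label of the best-scoring database record, which is enough to show how a label can "leak" onto a tool sequence. The tag is a 6xHis-style run of histidine codons. The "pathogen", "harmless", and "reporter" sequences and all of the labels are made up for illustration.

```python
def kmers(seq, k=8):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_label(query, database, k=8):
    """Label of the record sharing the most k-mers with the query, or None."""
    q = kmers(query, k)
    scored = [(len(q & kmers(seq, k)), label) for label, seq in database]
    score, label = max(scored)
    return label if score > 0 else None


# Made-up sequences for illustration only.
TAG = "CATCACCATCACCATCAC"                       # 6xHis-style purification tag
PATHOGEN = "ATGGATTCTCGTCCTCAGAAAGTCTGGATGACG"   # stand-in for a pathogen gene
HARMLESS = "ATGAGCCTGTTAGGCAAATTCGCGGATCCGTAA"   # stand-in for a benign gene
REPORTER = "ATGGTTTCCAAGGGCGAGGAACTGTTTACC"      # stand-in for a benign reporter

# The chimeric record is labeled by what was being studied, tag and all.
database = [
    ("pathogen X (made-up)", PATHOGEN + TAG),
    ("harmless enzyme (made-up)", HARMLESS),
]

print(best_label(REPORTER, database))        # no match at all
print(best_label(REPORTER + TAG, database))  # the tag drags in the pathogen label
```

The reporter on its own matches nothing, but add the same tag that was used to study the pathogen and the best hit is now the pathogen record.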

This doesn't seem to be much of a problem for most uses of BLAST against NCBI, but it's poisonous for making biosecurity decisions, since it can cause benign sequences to be classified as dangerous or vice versa. Moreover, the effect gets stronger the more problematic a pathogen is (since more sequences are recorded) and the more useful a tool is (since more chimeric material is produced), meaning that the problem is most likely to occur in the most important cases. For example, over the last two years, quite a lot of stuff has started coming back as COVID-19, since everybody in the world is studying COVID-19 with all of the tools that they can get their hands on.

This is a serious problem, and it's not likely to get better, since NCBI and BLAST aren't doing the wrong thing: they're just getting less suitable to use as a short-cut for doing something that they were never designed to do. 

So how do we fix it? Switch to tools that are actually designed for pathogen identification. We've got one (FAST-NA Scanner), and a whole bunch of other folks worked on the same problem in the FunGCAT program. The solutions are there, we just have to help folks switch to them.

Wednesday, July 13, 2022

pySBOL3: SBOL3 for Python Programmers

Our Python library for the SBOL3 standard now has an official citable publication in ACS Synthetic Biology, called "pySBOL3: SBOL3 for Python Programmers." 

The article is a good short read, but for any Python programmers out there, I recommend just jumping straight in with the tutorial instead. Happy hacking, everyone!

Tuesday, July 05, 2022

Functional Synthetic Biology

Synthetic biology isn't about sequences. Don't agree? Tell me what this is without looking it up: atgcgtaaaggagaagaacttttcactggagttgtcccaattcttgttga

Tell you what, I'll give you a hint, make it easy. It's a coding sequence translating to MRKGEELFTGVVPILV. Everybody knows this one, right?
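If you'd rather check the hint than take my word for it, a few lines of Python will do the translation. This is a minimal sketch using the standard genetic code packed into a lookup table; the function name is just something I made up for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


mystery = "atgcgtaaaggagaagaacttttcactggagttgtcccaattcttgttga"
print(translate(mystery))  # MRKGEELFTGVVPILV
```

Fifty bases gives sixteen full codons plus two leftover bases, which is why the hint is sixteen amino acids long.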

How about this instead?

That's right. That mystery sequence up top is the first 50 bases of BBa_E0040, the widely used iGEM part with a coding sequence for GFPmut3. Now that one, a great many folks working in synthetic biology know, have used in their work, and maybe even have strong opinions about.

Notice that this is a description of biological function: the important thing is that the coding sequence makes a protein that emits a lot of green light when you hit it with a blue laser. There's a sequence in there somewhere but that's not what gets put on the whiteboard or what gets discussed.

Don't get me wrong, sequences are important. But right now we're living with a mis-match in synthetic biology, where most of our discussions about design are about function, but nearly all of our tooling is heavily focused on sequences (e.g., GenBank format), with any information about function tacked on as an afterthought or else confined to specialized databases that each pose their own sui generis integration problem. 

We need a new focus on functional synthetic biology, and that's one of the things we've been working on in the iGEM Engineering Committee. We're trying to change how we do synthetic biology, so that we can pull together the work that lots of people have been doing on calibration, insulation, characterization, context effects, modeling, assembly, etc., in one place and make at least a small class of synthetic biology engineering really simple and predictable.

We aren't there yet, but we've gotten to the point where we think we've figured out some of the important shifts in thinking, representation, and tooling that need to happen in order to make functional synthetic biology possible. If you're interested in this too, I encourage you to read more in our newly available pre-print on Functional Synthetic Biology.

Thursday, May 05, 2022

AI for Synthetic Biology

Several of my colleagues have been organizing a series of "AI for SynBio" workshops over the last few years. I've been to some and they have been both stimulating and enjoyable. Now they have an article out in Communications of the ACM, along with a nice short video in which Aaron Adler introduces this increasingly important cross-disciplinary interaction for folks who aren't familiar with one or both of the subjects.

Friday, April 22, 2022

Talking measurement and standards with "The Living Revolution"

Yesterday I had an enjoyable conversation with Luke Roche and Sara Knurowska, who do a podcast called "The Living Revolution." They'd read some of my work on measurement, which led inevitably to a wide-ranging discussion including fundamental principles in engineering and science, when to standardize (or not), SBOL, etc.

Check out the podcast here (if it works in your browser), or on Spotify or Apple Podcasts.

Friday, January 07, 2022

Two years of soap

Back in pre-pandemic times, I used to travel quite a lot, and like many other frequent travelers, I slowly accumulated a pile of little bars of complimentary soap from hotel rooms.  As a result, I hadn't actually purchased soap for myself for years. Today, however, I opened my last little leftover travel soap. A curious milestone and statistic: it appears that I'd had just under two years of soap in my little pile.

One of my daughter's stuffed animals traveling with me on my last pre-pandemic trip.

Monday, October 11, 2021

Meeting Measurement Precision Requirements for Effective Engineering of Genetic Regulatory Networks

We've got a new preprint up today, "Meeting Measurement Precision Requirements for Effective Engineering of Genetic Regulatory Networks", that is an unusual mixture of theoretical analysis and interlaboratory study. 

The work started out as an investigation of the replicability of flow cytometry measurements. Flow cytometry, as readers of this blog may know, is one of my favorite biological measurement tools, since it lets us obtain measurements from large numbers of individual cells. I've been involved in a number of projects that have put it to good use in engineering biological devices, and the calibration methods available let us put real, biologically-sensible units on the measurements. But just how good are these measurements and how reproducible?  That's what we set out to study with a consortium of collaborators and about two dozen flow cytometers.

Then we went to go write it up, and a rabbit hole opened beneath our feet, sucking us down into an unexpected set of theoretical questions. We had a number (~1.5-fold precision), but was that a good number? In fact, how do we even decide what a good number is? What do you even need to do good engineering? 

Maybe we should have just called that "future work" and published what we had. But we didn't. We followed that rabbit hole down and the manuscript went into limbo. But when it came out of limbo, the manuscript was standing on its head and had an answer. What started as an investigation of flow cytometry became an investigation of the general requirements for effective biological engineering, with the work on flow cytometry becoming one verified answer for how to meet those requirements.

Basically, you want to be on the left side of the red line.

We ended up with a (highly abstract, conservative) formula for estimating how well one needs to know values in order to engineer gene regulation. And for most state of the art work, it means you need to have a measurement precision somewhere in the range of 1.2-fold to 2.0-fold, with calibrated flow cytometry right smack in the middle.
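To make "fold precision" a bit more concrete: one common way to express multiplicative spread is the geometric standard deviation, i.e., the exponential of the standard deviation of the log-transformed values. The sketch below uses that as an assumption (it is not necessarily the exact statistic used in the preprint, and it is not the formula from the paper) and checks the result against the 1.2-fold to 2.0-fold range mentioned above. The function names and the replicate values are hypothetical.

```python
import math
import statistics


def fold_precision(values):
    """Geometric standard deviation of positive values, as an 'X-fold' spread."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.stdev(logs))


def in_target_range(fold, low=1.2, high=2.0):
    """True if the fold precision is at least as tight as the range above requires."""
    return fold <= high, fold <= low


# Hypothetical replicate measurements of the same sample (illustrative only).
replicates = [950.0, 1400.0, 1100.0, 800.0, 1250.0]
fold = fold_precision(replicates)
meets_loose, meets_tight = in_target_range(fold)
print(f"{fold:.2f}-fold; meets 2.0-fold: {meets_loose}; meets 1.2-fold: {meets_tight}")
```

A value of 1.0 would mean perfectly identical replicates, and bigger numbers mean blurrier measurements.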

I'm happy with these dual results, and I think they should be useful to help us move another couple of steps towards a world of reliable and predictable biological engineering.

Thursday, July 15, 2021

Predictable signal amplification with recombinases

New paper out today: "Quantitative characterization of recombinase-based digitizer circuits enables predictable amplification of biological signals." If we ever want to be able to make reliable controllers in cells, we need to have well-separated control signals. Many of the biological sensors and other inputs that we work with, however, are really blurry, so we need devices that can clean them up. This paper demonstrates how this can be done in mammalian cells with a circuit that cleans up a poorly separated signal by nearly 3 decibels!

Blurry input (left) is predicted to be separated well by our recombinase device (middle), and that prediction is realized experimentally (right).

This work, part of the NSF Living Computing Project, involved collaboration across several labs and a lot of work to connect the devices, analytics, and models. Making this work meant really getting down into what we wanted not just biologically but computationally, in terms of the signal properties of the device. The models and metrics guided adjustments in device design that ultimately fed back into a better performing system. I'm personally very happy with the result as an example of getting really serious about the engineering approach to biological systems.