Sunday, October 30, 2016

Ladies and Gentlemen, we have our ruler

Dear readers, as promised earlier this week, it is my distinct pleasure to share with you today the headline results of this year's iGEM interlab study. These are preliminary results that we are still writing up for publication, but I feel they are too exciting to keep quiet about, and so I am shouting them from the rooftops today.

In synthetic biology, one of the best tools for studying the behavior of cells is fluorescence. Unfortunately we haven't had a good, accessible way of quantifying fluorescence, so you couldn't compare results from one lab to another, or even necessarily from one experiment to another within the same lab.  In short: we've needed a good ruler for measuring fluorescence.

The goal of the iGEM interlab studies has been to understand where the problems in measurement are coming from and then to use that knowledge to produce a good ruler. In the 2014 and 2015 interlab studies, we figured out that the big source of the problems didn't seem to be the biology, but how people were using their instruments and comparing their data. That was actually good news, because we had some ideas for how we might be able to fix that, and this year we tried them out. We gave every team two simple non-living calibration samples to compare their biological samples to, and hoped that this would tighten up the numbers some.

The results we got were beyond my wildest dreams.

Here's what we saw for the precision of measuring fluorescence with plate readers. We compared the standard deviation of the arbitrary unit measurements from 2015 with the calibrated measurements from 2016 before and after using the positive and negative controls for quantitative filtering of problematic tests, and got these standard deviations:

Smaller numbers are better, so you can see that we got a big improvement in precision from calibrating, and another big improvement from using the fact that we have real numbers to quantitatively exclude obvious protocol failures (e.g., 1000 uM FITC/OD fluorescence from a negative control).
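To make the control-based filtering idea concrete, here is a minimal sketch in Python. The replicate values, team names, and threshold are all made up for illustration; the actual interlab analysis used its own data and exclusion criteria.

```python
# Sketch of quantitative filtering using a negative control (hypothetical
# numbers and threshold, not the actual interlab data or criteria).
# A negative control that reads implausibly high flags a likely protocol
# failure, so the whole replicate is excluded before computing statistics.

NEG_CONTROL_MAX = 1e3  # hypothetical plausibility threshold

replicates = [
    {"team": "A", "neg_control": 12.0,  "sample": 4.2e4},
    {"team": "B", "neg_control": 8.5,   "sample": 3.9e4},
    {"team": "C", "neg_control": 2.7e5, "sample": 5.1e4},  # glowing negative: fail
]

valid = [r for r in replicates if r["neg_control"] < NEG_CONTROL_MAX]
print([r["team"] for r in valid])  # → ['A', 'B']
```

The key point is that this test is only possible with calibrated units: in arbitrary units, there is no absolute threshold against which a "glowing" negative control looks obviously wrong.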

But to really wrap your head around how big an improvement this is, you have to remember that the thing we are measuring is geometric standard deviation, for which the units are multiplicative: times multiplied or divided.  In general, we would consider the normal range of values to expect from a measurement to be within two standard deviations up or down from the true value---i.e., 95% of the time, a measurement should lie within that range. With a geometric standard deviation of 35, two standard deviations up is 35*35 = 1,225.  That's more than a thousand.  Going down is another factor of more than a thousand, meaning that all told, we would expect measurements to wander over a range of a factor of a million or so.  Obviously, that's rubbish.  It's hard to do anything if you expect your measurements to be wobbling around by a factor of a million.
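That arithmetic fits in a couple of lines. This is just a back-of-envelope illustration (the function name is my own invention, not anything from the interlab analysis):

```python
# Back-of-envelope: how wide is a 95% (±2 SD) range under a geometric SD?
# For geometric standard deviation g, two SDs up multiplies the true value
# by g**2 and two SDs down divides by g**2, so the full range spans g**4.

def range_span(geo_sd, n_sds=2):
    """Ratio between the top and bottom of a ±n_sds geometric range."""
    return geo_sd ** (2 * n_sds)

print(range_span(35))   # → 1500625 (a spread of about 1.5 million-fold)
print(range_span(1.5))  # → 5.0625  (a tight measurement: about 5-fold)
```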

This year's measurements were more than 100,000 times more precise.

OK, you say, that's all well and good, but since the units were arbitrary before, nobody ever claimed that these numbers should be the same in the first place.  Many people who measure fluorescence and don't calibrate their measurements try to deal with this problem by normalizing to a positive control. The idea is then that you can say: "This is 2.7 times the control" and somebody else can hopefully measure the same control and find the same ratio.  Indeed, in last year's interlab we got pretty good results for comparing ratios of strong promoters, but we got terrible results for comparing the weak ones.

Here's the thing, though: remember that only about half of our improvement came from using the same units, while the rest came from being able to identify failures by the strange behavior of their controls.  We would thus expect to see significant improvement in precision this year versus last year, and indeed that is what we saw:

On average, normalized measurements were 70 times more precise.
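The ratio normalization described above can be sketched in a few lines. The arbitrary-unit readings here are hypothetical, chosen only to show why the ratio is comparable across labs even when the raw units are not:

```python
# Sketch of normalizing to a positive control (hypothetical arbitrary-unit
# readings).  Each lab's raw numbers are in its own arbitrary units, but
# the ratio to a shared positive control is, in principle, comparable.

lab_1 = {"positive_control": 1000.0, "test_device": 2700.0}
lab_2 = {"positive_control": 180.0,  "test_device": 486.0}

ratio_1 = lab_1["test_device"] / lab_1["positive_control"]
ratio_2 = lab_2["test_device"] / lab_2["positive_control"]

print(ratio_1, ratio_2)  # both 2.7: "this is 2.7 times the control"
```

The weakness, as noted above, is that nothing in this calculation can tell you when the positive control itself has gone wrong.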

The flow cytometry results are not quite as dramatic, and not as statistically strong since many fewer teams took flow cytometry data, but the basic result is the same: orders of magnitude improvement in precision of both individual measurements and of normalized measurements.

There's a lot more to do, to go from initial results to routine and effective usage, but I believe the core results are clear:
  1. We've got a workable ruler for fluorescence.  It's not perfect, but it's orders of magnitude better than current practices.
  2. If undergraduates and high school students from all around the world can use these methods, there's no reason they can't be adopted in every biology laboratory that measures fluorescence.

Sunday, October 23, 2016

Never work without a net: why units matter

I've been working on measurement and units in synthetic biology for more than five years now, so it would seem that I should have a pretty clear understanding of the landscape. As you, dear reader, may recall, for a long time I've been arguing in favor of getting independently calibrated units into our work in synthetic biology, and working on ways to make this generally accessible. Over the past 24 hours, however, I've come across something that has blown my mind.

The arguments for having units that you can compare across different experiments, devices, and laboratories have been pretty strong and clear, since yes, of course, we want to be able to compare our work.  Many people, however, believe that it's good enough to have relative units, where you measure in arbitrary units and then normalize your data by a known genetic construct control.  I have not been comfortable with this, because you have no way to know if something goes wrong that affects your control as well.

My arguments there, however, have always felt relatively weak, because: a) how do I know if this actually happens often enough to be a real concern? and b) it sounds like I'm accusing people of doing sloppy lab work, which would be doubly unfair since most scientists I know are quite careful and since I don't do any lab work at all.  So while I've had a clear argument against relative units that is persuasive to those who are already basically in agreement, I haven't really had a leg to stand on, scientifically speaking, in my concerns about relative units, and have sort of relegated them to the level of a secondary concern.

But in the data from this year's iGEM interlab, we have hard evidence that relative units are not enough, because having a way to catch the results of those little mistakes really matters. I can't give you any more details for a week, not until after we officially unveil the results next week at this year's iGEM Jamboree, but it's a big deal.  Like, orders of magnitude big deal.

My world is rocked.  It's obvious in retrospect, and I've even made the argument before. There's a difference, however, between making an argument and having data staring you in the face that says that the argument is far more important than you ever had actually realized.

Basically, if you're not using units, then you're working without a net. With units, you have a chance to apply a bit of experience and common sense and realize that something is going wrong with your numbers. You might not know how or why, but usually that's not actually important, because usually it's something small and stupid (dropped a minus sign, got left and right mixed up, grabbed the wrong bottle, etc.), and the best way to fix your mistake is just to do it again, because you probably won't make the same stupid mistake twice in a row. This applies to pretty much everything in life involving numbers and measurements, not just to biological research.  Using properly calibrated units gives you a second chance to notice your mistake, and makes all the difference between embarrassment and disaster.

How big a difference, exactly? Well, I'll just go grab a cake to celebrate and I'll see you in about a week.

Friday, October 21, 2016

Why do we measure biological computations?

If you're making a biological computer, what do you need to know about its parts? That's one of the questions I'm working on as I lead the effort to organize measurement in the Living Computing Project.  One of the things that's coolest for me about this project, funded by the National Science Foundation, is that it's being funded by the computer science folks there, and so we get to really focus on questions about the fundamentals of computing with biology.  And so I've been asking this question: what is it that I actually need to know, if I want to build a computer using the DNA of a living cell?

It is very easy, in every science, not just biology, to get seduced into performing the experiments that are easy to perform and gathering the data that is easy to record from your instruments. Unfortunately, however, the numbers that you obtain this way often turn out to not be the numbers that you really need, the ones that can actually give you insight into your system and let you build upon it.

Fundamentally, measurement is a matter not of numbers but of communication. Measurements are only meaningful when they are consumed by something that makes use of those measurements.  In many scientific experiments, the only thing that you are trying to communicate is your judgement that a particular hypothesis appears to be reasonably sound (look at all those modifying adjectives!), and there's lots of different ways to do that.  When we want to build something, however, we need the parts that we are building it out of to communicate signals and numbers that enable us to understand what will happen when we use them together.  It's like the way that labelling something an 8mm nut communicates that it will mesh with the threads of an 8mm screw, even though that is only one of its many dimensions, and the way the shape of a USB plug tells you everything you need to know about whether its electrical and computational characteristics will be compatible with a given socket. This communication doesn't need to be perfect, just good enough to let us decide whether to use the parts this way or that way, not to mention whether our project is even reasonable to consider.

So I've been looking at how we are building things, the way people talk about them, how they sketch them and what they struggle with, and I've been writing down my ideas, bit by bit, of a general workflow for building biological computations.  Not just analog or digital, not just about chemical messages or memory, but about specifying how we want to manipulate information, in whatever form and with whatever tools.  What I've got right now is very simple, but I find that it is causing me to ask apparently simple questions to which we do not know the answer, like "How do you compare the complexity of a computation to the capabilities of a library of biological regulatory devices?"---and when that sort of thing happens, my experience is that the answers may be scientifically exciting.

Monday, October 17, 2016

Bringing more AI into synthetic biology

Much of my work in synthetic biology has been founded on the importation of knowledge and methods from artificial intelligence. I want to encourage others from that background to get into the act too, since I think it will be beneficial to all involved---as long as there is sufficient listening.

People often think about artificial intelligence as being about stuff like robots and foul-mouthed chat-bots, but it's much wider and deeper than that.  For example, much early work on programming languages was considered an AI problem of "automatic programming."  In fact, one of the common complaints of AI researchers is that as soon as AI has solved a problem, it gets classified as "not really artificial intelligence" simply because the solution is now understood. 

So what are the skills and capabilities of artificial intelligence, that it can bring to other fields? My colleagues Fusun Yaman and Aaron Adler started this discussion in earnest with a talk at AAAI a couple of years ago: "How can AI help Synthetic Biology?", following this up with a paper on "Managing Bioengineering Complexity with AI Techniques" and a workshop last year on "AI for Synthetic Biology" at IJCAI, one of the main conferences in the field. It turns out that, building on core areas like knowledge representation, machine learning, planning and reasoning, robotics, etc., there is, in fact, a great wealth of possibilities for AI applications in synthetic biology, from data integration to protocol automation, from laboratory management to modeling, and many more.

The main challenges are, more than anything else, friction at the interface between fields and getting people to listen well enough to understand which problems are useful to solve (so the AI practitioners aren't too naive about biological realities) and what types of things AI can realistically contribute (so the biologists don't view it as either magic or "just data processing").  My experience has been that it's heavy going to get connected (as is generally the case for interdisciplinary research), but that the opportunities are great, and I encourage my fellow practitioners to come and get involved.

Tuesday, October 11, 2016

How to Shoot Good Pictures from a Plane

Today's topic, dear readers, is how to shoot interesting and decent pictures from airplanes. As those of you who read this blog regularly may know, I travel fairly frequently for work (although I am trying to cut down).  I also enjoy some dabbling with photography, and so one of my frequent subjects of photography is travel, particularly from up in the air while in an airplane.

One of Harriet's stuffed animals contemplates the view while traveling with me.

I have long held that if I ever stop enjoying the view from the air, then I'll know my soul is truly dead. So far, not dead (though I've had a couple of close scrapes, still).  Part of what keeps me enjoying these views is the fact that you can see so many strange and unexpected things below, if you look carefully.  Complex stories and geometry form in the ordinary landscape, even something as "flat" as the cornfields of Iowa, and there are many beautiful and mysterious things hidden in the interstices of the world.

Seeing something interesting, however, is a long way from being able to effectively capture it in your camera and to convey that same feeling of interest to others. Our eyes and brains are very good at compensating for distortions and patching around obscurations that pop out and destroy the view when captured in a photograph.  Here, then, are my tips for capturing interesting images from an airplane:
  • Sit forward of the airplane wing: Obviously, you don't want to be sitting over the airplane wing, as it will blot out most of your view.  You also don't want to sit behind the wings, however, because the hot exhaust from the engine creates large areas of rippling visual distortion.  On big planes, sitting in front of the wing may cost you a little extra if you aren't a frequent flyer, but on small regional flights you can generally pick any seat.
  • Be mindful of the window: Shooting through a window makes your life much more difficult: you need to deal with reflections of yourself, the camera, and the cabin; the rounded window frame clashes with rectangular images; and the window glass near its border often distorts the image significantly.  I find that these problems can often be remedied by moving myself and the camera with respect to the window.  For example, with reflections, sometimes I can get out of the way, while other times I move to uniformly shadow the area that I am shooting through.  The contortions needed, however, are sometimes quite significant.
  • Takeoff and landing are key: Cameras are allowed during takeoff and landing, as they fall into the same category of "personal electronics" as music players and phones.  These are some of the best times to shoot, since you are closer to the ground and have more interesting angles on the infrastructure that you pass.
  • Haze can be helped with post-processing: when you are high up, there is an inherent haze from the amount of atmosphere between you and subjects on the ground.  This can be helped, to some degree, by post-processing; programs like Adobe Lightroom have specific mechanisms to help with haze.  They're no panacea, but they can certainly bring the image you get from your camera closer to what your eye was feeling.
  • Always, always, always have your camera out: Wonderful images appear without warning and vanish in a heartbeat, especially at takeoff and landing, and you can't exactly ask to stop or go back to find the angle that you want again.
  • Keep track of your location: It helps to know where you are, in order to be able to better interpret what you are seeing. When there is a seat-back entertainment system, this is pretty easy, since they generally have a map built in as one of the functions; without it, have a map (even the crappy one from the airplane magazine can help a lot) and keep track of the time since takeoff in order to be able to at least roughly estimate where you are.
  • Don't forget to look close to home: Despite flying into and out of it many dozens of times, I'm still finding interesting things within just a few minutes of the Eastern Iowa Airport.
  • Be prepared for possible disappointment: Despite all preparations, sometimes it's just hopeless.  Your window may be heavily scratched, smeared, fogged, or iced. There may be fog nearly down to the ground and nothing but utterly bland and boring clouds from above.  When this happens, there's nothing you can do, any more than you can about a bad sky when you're on the ground, so simply cultivate what tranquility you can.
And so, my dear readers: go out, and share your visions!  Here are a few of my own personal favorites (all also already posted on my photo blog):
Snow shadows, Eastern Iowa
Peace-sign neighborhood, outside of Chicago
Boston cargo docks

Thursday, October 06, 2016

To PC or Not To PC

As I'm sure a lot of scientists do, I get a lot of requests to join program committees and review papers.  It's always a mixed blessing: on the one hand, I like to help out and it exposes me to a lot of interesting ideas. On the other hand, there are never enough hours in the day. So: how does one decide when to triage?

My reviewing commitments (not counting as an organizer) show a clear need for careful triage.

For myself, I tend to come down on the side of saying yes.  Maybe it means I'm stretched a bit thinner, but I frequently find the investment of time to be worth my while, for some combination of the following reasons:
  • If I want some event in my field to exist, I'd better be willing to contribute to it.  Even if it's going to exist anyway, being willing to serve helps make sure that the things I'm interested in have a fair and interested evaluation.
  • Reviewing papers exposes me to a wider swath of the literature.  I always need to read more, but there are always so many other things competing for my time that it often slips. Reviewing a paper, I have to pay real attention, too, which forces me to engage with material that I might otherwise have just skimmed over.
  • Reviewing papers introduces me to new communities.  There are places and people that I now keep track of regularly who I was first introduced to when they invited me to review or when I was invited to review their work.
  • Reviewing papers challenges me.  If I don't like a talk at a conference, I can just blow it off and tune into my email.  If I don't like a paper, I'd better be prepared to explain myself in a way I'm comfortable defending.  This forces me to understand my own views much more deeply than when I'm living in the echo chamber of my own research group and collaborators.
  • Reviewing papers makes me a better writer. Reviewing exposes me to both a lot of good writing and a lot of bad writing.  From the good papers, I can learn other ways to present that are effective but that are different from my own style.  As for the bad papers: it's always easier to see the flaws in another's work than in one's own.  From the really bad ones, I learn nothing, but I also see a lot of good work presented badly, and a lot of good ideas that have been badly developed, and I can learn from these mistakes.
So if it's a conference I actually go to, I'll definitely say yes; if it's a place I'm generally aware of and interested in, I'll almost certainly say yes; and if it's got people I recognize and an interesting subject, then probably.  Though I must say, sometimes I feel I desperately need a way to express myself as fully and vehemently about papers as my daughter did when she was an infant...

Baby Harriet helps with reviewing, back in early 2013