Wednesday, October 01, 2014

A triumph of citizen science at iGEM

I recently had one of those rare moments of true elation, a scientific triumph that actually brought tears to my eyes. It's my triumph, but it also belongs to so many more people than just me: it was the biggest experiment that I have ever run in my entire life.

Let me back up a bit and explain.  Every year for the past 10 years, there has been a big event called iGEM - a genetic engineering "jamboree" in which undergraduates spend the summer doing synthetic biology projects, then get together in the fall to show each other what they've done, compare results, and compete for who has done the most awesome project and presentation.  It started small, but grew quite quickly, and this year there are on the order of 250 teams participating from all around the world.

Last year, I started talking with the iGEM organizers about the possibility of starting up a new track in the competition, focused on the problems of measurement.  I've never really been interested in measurement as a subject myself, but I've found that I am getting more and more invested, simply from needing good data in order to accomplish the things I want to do scientifically.  Putting together a measurement track at iGEM sounded interesting, and what drew me in further was the idea that we might ask the teams in the measurement track to participate in an interlab study.  In other words, have a bunch of different labs do the same experiment, so that we could compare data and see how reliable the numbers actually are.  So I put together a nice diverse team of colleagues interested in the idea: Traci Haddock (Boston University) is a close collaborator on precision characterization and genetic engineering, Jim Hollenhorst (Agilent) is a long-time veteran of instrument development, and Marc Salit and Sarah Munro (NIST) are specialists in measurements and standards.  Together we worked with the folks at iGEM HQ, especially Kim de Mora and Meagan Lizarazo, to put the track together.  And then we sat and hoped that people would register.

I was worried we'd have only a couple of teams sign up, and that we'd end in an embarrassing fizzle, but Kim and Meagan kept telling me not to underestimate the iGEM students, and so I kept my fingers crossed.  Almost nobody registered and almost nobody registered... and then the deadline for teams to pick tracks came and suddenly we had 11 teams dedicating their summers to the study and improvement of the science and engineering of measurement.  And the interlab study... well, the best way to show who signed up is to show the map I put together:
[Map: teams participating in the iGEM 2014 Interlab Study (an interactive version is available online)]
That's 45 teams who signed up, just because they believed that it was important to understand how well we can compare our measurements.  Teams not just from America and Europe, but from all over the world: from Mexico and Colombia and Brazil and Turkey and Kazakhstan and China and Indonesia and on and on.  Students who decided to take time away from their main projects because they thought that this would be fun and important.  Who listened to us and understood when we explained the value of repeatability in science, and the importance of building our knowledge on a solid foundation.

So that was the first time that I was blown away.  But we didn't know what would happen when we got the data.  As this was the first year that we did any sort of interlab study, we didn't know what the teams would be capable of, what equipment they would have, what cells they would be working with, or what knowledge and guidance their supervisors would be able to provide.  So we asked for just one simple thing: to measure three genetic constructs under simple, standard conditions.  We hoped that some teams would be able to use some of the more sophisticated techniques that I have been applying, but knew that they might not be able to.  But we had no idea what quality of data we might actually get.  iGEM is somewhat polarizing in the world of synthetic biology: there are a lot of professors who think that it is wonderful, and who build research programs that are intimately tied to things their students do in iGEM.  Others, including some very high-profile researchers, think it's a waste of time and that they can't trust data produced by undergraduates, and they won't have anything to do with it.  Me?  I hadn't committed to a position yet: iGEM certainly looked really neat, and I thought the participating students all got an excellent educational experience (which is enough to recommend it right there!), but I'd never been engaged deeply enough to know how sound the science that came out of it was.  Here, as the data came in, I might finally begin to learn.

Saturday afternoon, with data from two-thirds of the teams in hand, Traci and I sat down in a corner of the meeting we were attending to find out what we had wrought.  I opened up an Excel spreadsheet, and she parsed through the worksheets the teams had sent in to read me out their vital numbers.  The columns filled, and it was clear that some of the teams had gotten reasonable data while others had run into problems: failed cloning or contamination or instrument problems or who knows what.  No surprises there; it would have been surprising if nobody had had trouble.  And when we had enough entries in the table, I selected a couple of columns and told Excel to make a plot.

Disappointment scattered itself all over the graph.  Where we would hope to see a nice straight line of points, there was a virtual cloud of incoherence.  Every single graph we tried looked like that: a massive mess of scattered hash.  Looking into the methods sections of the teams' reports, we could see a number of differences, both large and small, and so contented ourselves with the idea that we'd at least managed to quantify just how bad the effect of lab-to-lab differences in the way people work with cells and instruments really is.  Not a bad result, especially for an experiment whose basic aim was to establish a baseline from which to work.

But then... then on the plane back from Boston, I moved the data from Excel to Matlab, where I could really analyze the numbers, and I plotted them again.  And it all lined up.  I had selected the data points to use by a simple rubric for the likely validity of the data: are the "high", "medium", and "low" expression promoters in the right order, and showing at least some significant difference?  That helped a lot, but even when I included everything, it was nothing like the mess we'd seen before.  I looked again, and discovered that nobody on Earth should ever use Excel to analyze any sort of data.  Those plots that looked so horrible?  That was because Excel was quietly using row numbers for the X axis of the graph rather than the actual values it claimed to be plotting there (the classic trap of a line chart treating its X column as category labels, where an XY scatter chart would have used the values).  If you replace one of your two variables with arbitrary junk, well, of course it's going to be a mess.  But the real data... the real data was so nice.
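
For the curious, the rubric itself is trivial to express in code.  Here is a minimal sketch in Matlab of the kind of filter-and-plot step I mean; it is only an illustration, not our actual analysis script: the variable names and the 2x separation threshold are invented for the example, and it assumes the measurements sit in an N-by-3 matrix with one row per team.

    % Sketch of the validity rubric (hypothetical names, arbitrary threshold).
    % 'data' is assumed to be an N-by-3 matrix, one row per team, with
    % columns for the high, medium, and low expression promoters.
    high   = data(:,1);
    medium = data(:,2);
    low    = data(:,3);

    % Keep entries where the promoters land in the expected order and the
    % strongest shows at least some real separation from the weakest.
    valid = (high > medium) & (medium > low) & (high > 2*low);

    % Plot value against value: the broken Excel charts were silently
    % plotting values against row number instead.
    plot(high(valid), medium(valid), 'o');
    xlabel('High-expression promoter');
    ylabel('Medium-expression promoter');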

And that's the thing about science, really, that's sometimes so hard to wrap my head around.  It works.  The universe really does run on rules, and no matter how strange or complicated or hard to understand a thing may be, the substance of reality has no mystical component.  If you can just find the right lever and the right place to stand, everything makes sense, and the proof is right there: these points of data make a beautiful line.  It doesn't matter whether we want the result we get or not, and there is no moral implication in the knowledge: what we do about the universe once we understand it is where our humanity comes in.  But even life itself has rules that we can understand and work with, and this experiment, so simple yet at the same time so large and complicated, is another step towards doing that.

What exactly were those results, you ask? Well, I'm sorry, but you're going to have to wait.  We've promised all the teams that we're going to announce the results of the study at the iGEM jamboree at the beginning of November, and nobody gets a peek before then.  All I can say for now is: it was worth it, and I am humbled by the remarkable enthusiasm and dedication of the young men and women from all around the world who made it possible.