Monday, December 28, 2015

Why aren't research grants centralized?

A recent question on Academia StackExchange asked something that looked simple to me at first, but turned out to be much deeper and more subtle than I had expected as I thought about it more.  The question is, in essence: "Why aren't research grants centralized?"

In other words, why do countries generally have messy and complicated research funding systems like we do in the United States, where there are a bazillion different independent agencies and mechanisms for funding research, each with its own peculiar mechanisms and application rules? Wouldn't it make more sense to have some sort of unified science system where all of the science people can go and ask to get their science things funded?  

I enjoyed thinking about and exploring this question, and so I wish to share my answer with you here as well:

First, let us consider why there are many organizations that fund research, rather than a single research-funding organization. This is a matter of evolutionary organizational structure. In most countries, research has a non-trivial budget and applies to many different concerns of government. That means there has to be some (probably largely hierarchical) structure for organizing it. Now, let's consider two prototypical organizational structures for government-funded research. First, we might have a general research agency, which contains subdivisions addressing the research needs of various other governmental tasks: 

Alternatively, each government department might have its own research agency: 

Almost everywhere, we see organizations more like the second structure than the first---there might well be some countries in the world where research is so small or so controlled that it is organized in the first way, but if so, I am not aware of them. Why might that be?

Consider what happens if you are a leader in the department of agriculture, and you want to expand your agency's research work. Unless strong regulation prevents you from doing so, it's much easier to create or expand a research organization within the agriculture department than it is to get an independent research department to do it for you. A research sub-department within agriculture is also more likely to serve the peculiar needs, time scales, and market structure of agriculture. It's also easier and more rewarding to go to government leadership and fight to get resources for your own organization, where you can explain exactly how you plan to utilize them, than to fight to give them to somebody else.

Since both government structure and research needs evolve over time, we may thus expect research organizations to multiply, both across the government as a whole and also within individual sub-organizations. They are, in fact, occasionally reorganized and combined with the goal of making them simpler and more efficient to interact with, just as other government agencies are, but that will typically not reduce the number down to one, just to a smaller "many." Moreover, we've only discussed government funding, not industry funding or funding by foundations and NGOs, which all have their own separate needs and desires and further complicate the funding landscape.

Now, to the second aspect of the question: why is there no central database for applications? Sometimes there is, at least partially. For example, in the United States all government requests for proposals go through FedBizOpps. Most research solicitations can thus be found there (though not all, due to the diversity of mechanisms), along with requests for things like security guards for the US Embassy in Costa Rica. As you might guess, however, the sheer breadth means this often isn't a terribly efficient method of searching.

Likewise, every agency has different sorts of information it's looking for in research proposals. Again, taking the US as an example, the NSF really wants to know how its funds will support graduate student and postdoc education, since that's a key part of its mandate. AFRL, on the other hand, usually doesn't care much about supporting students, and has a mandate instead focusing on how its funds will affect current military concerns. As a result, a "universal" proposal would likely be quite cumbersome even if the bureaucracies were somehow reconciled.

Bottom line: "research" is too complex and pervasive a set of needs to readily stay contained within a single unified organization.

Sunday, December 06, 2015

A Publication Sea Change?

Now, in the closing of the year, is a time to start taking stock of my scientific progress over the last twelve months.  What in my professional life is going well, what things need care and focus, what things have changed on me a bit at a time without me noticing all of the accumulation?

As I've been looking back over my publications of the last year, I've noticed something that is unexpected: apparently, 2015 is the year my publications moved to journals.  During graduate school, I barely published in journals at all, being both a less mature writer and in the deepest depths of computer science workshop/conference culture.  By the time I graduated and came to BBN, my work was starting to mature and I began putting extended versions of my conference papers in journals, to the tune of about two journal papers per year, still only a fraction of my scholastic output.

This year is different: this year I have twelve journal articles stamped with an official publication date of 2015 and three more that have appeared in "online early" editions.

In large part, this reflects the growth of the synthetic biology side of my research.  Synthetic biologists typically publish in journals rather than conferences, and so what might have been conference publications in computer science go to journals instead in synthetic biology.  I've been working seriously in synthetic biology for several years now, but my collaborations have been growing and maturing, and some of those articles reflect projects multiple years in the works that have had long and hard roads to publication.  There are also some that in practice were published online last year, but have only this year been officially assigned to a theoretical paper issue that no one really reads that way any more.

Five of my publications, however, are from the aggregate programming / spatial computing side of my world, and that also reflects a major increase in activity.  Here, though, the road to publication is often very much longer indeed.  I have noticed that computer science journals are often much more comfortable with lengthy times in review and revision than biology journals are.  One of my articles that's just come out, for example, was first submitted in February of 2014; another article was submitted in March of 2014 and should appear in mid-2016.  I think that this may be because of the conference culture in computer science: a journal can afford to be quite slow and dozy in review because the editors assume that everyone has already got access to the prior version of the paper, and the journal issue will simply be the extended remix.  I do not know if that's the case, but my experience has certainly shown a stark difference in the urgency that attends each culture's publications.

In fact, then, my surge of journal publications does not actually reflect a surge of writing in this year, but rather a more gradual increase over the last few years, first picking up on my computer science side, then rising on the biology side as well.  The slower computer science and faster biology waves then happen to coincide in 2015, creating this prominent spike in journal publications.

In fact, the rate at which I am writing publications does not appear to have changed all that much.  The actual number of publications that I am an author on, initiated in each of these last few years, is:

  • 2012: 17 publications initiated
  • 2013: 17 publications initiated
  • 2014: 16 publications initiated
  • 2015: 20 publications initiated

The quality and intensity of those publications has risen, though, as has the degree of collaboration, which also no doubt leads to more publications per unit effort on my part.

So, what does this all mean?  In short: this means that I seem to be saying things that others are interested in scientifically, and working with more people to say more things more clearly, and overall I think that's good.

Saturday, December 05, 2015

SBOL Visual

We have just published what I believe is a very important paper, "SBOL Visual: A Graphical Language for Genetic Designs."  You can read it online for free, from PLOS Biology.  This is the culmination of a long and slow process (as most standards work tends to be) of looking at the different ways that people make diagrams explaining genetic designs and trying to boil it all down into a simple common language for communicating.

Diagram languages for communicating designs are practically universal, in any area of human endeavor where we need to talk about making complicated things.  Whether you're building electronic circuits or houses, writing software or maintaining a sewer system, sewing from patterns or folding origami, there are standard ways of drawing diagrams in order to communicate ideas and minimize confusion.  So of course we need them for engineering biological organisms as well, and thus, SBOL Visual.

The basic idea is quite simple, and can be captured in a simple image of genetic constructs organized along a DNA or RNA sequence "backbone":

The current set of icons covers a lot of the constructs that people engineer, though by no means all.  If you've got another thing to put on a diagram, you can use any icon you want, as long as it doesn't conflict with an existing SBOL Visual icon. To let SBOL Visual expand and become more universal, however, there's an open community process for adding more icons, with a number of icons slowly working their way through the process.
Current SBOL Visual icons
Finally, although we provide "standard" icons, there is actually a great deal of flexibility in how you can style them, which makes it easy to use these icons in anything from scribbling on a whiteboard to computerized design software to figures in scientific publications.
All of these diagrams follow the SBOL Visual standard.
In some ways, this is a very simple thing.  It is, however, extremely important to get these simple things right, in order to reduce the amount of friction, frustration, and mistakes we make when we work together and communicate about the things we do.  SBOL Visual is an important step in getting that "less wrong" in the engineering of biology, and I'm glad that it's officially become published now as well.

Tuesday, November 24, 2015

Paying down organizational debt

When I first learned about the concept of technical debt (from a post on the excellent Coding Horror blog), it was like a light going on in a dark room where I had been tripping on things for years. Technical debt (also known as coding debt, when considering software) is a way of thinking about the cost of shortcuts.  When we are putting together a project and need to get stuff to work now, we often choose an approach that's faster and easier to implement, but which we know isn't really the right way to do it.  That choice creates a piece of technical debt---an approximation, inelegance, or minor incorrectness.  When we do something else in the future, it may be harder because of the technical debt that we have created, either because we have to work around the poor implementation, because we have to fix the poor implementation, or simply because the poor implementation is confusing.

Technical debt is not necessarily bad, any more than financial debt is necessarily bad.  Taking on technical debt is a way of accomplishing something that you might not have been able to accomplish if you had to do everything "right" the first time.  It's also a way of deferring difficult decisions until we better understand which path is correct, in order to avoid doing something we think is "right" but that later turns out to have been wrong.  Like financial debt, however, technical debt can accumulate and can also lead to taking on additional technical debt, creating all sorts of havoc.  What's important is to keep track of your technical debt and to try to make wise decisions about when to allow it to accumulate and when to pay it off.

Lately, I've been realizing that this same idea can be extended more generally to the organization of one's effort across different projects.  As a scientist who leads my own investigative ventures, I have quite a number of different projects that are "live" at any given point, ranging in scope from an hour or two of effort here and there (e.g., service as an associate journal editor) to complex multi-year ventures (e.g., development of the aggregate programming stack).  When I make choices in managing these projects and my time between them, I often take on organizational debt.  This accumulates in lots of different ways, such as note cards accumulating on my desk, file directories growing large, open browser tabs, email messages on my "need to reply" list, etc.

Just like taking on technical debt in writing software, there are pluses and minuses in taking on organizational debt. If I spend lots of time trying to be hyper-organized so that I avoid taking on organizational debt, then I will be much slower at actually accomplishing the things I'm trying to organize.  If my organizational debt causes me to overlook a deadline or to end up in a last-minute crisis trying to get things done, however, then my work and my life suffer in various ways. Some of these costs are quite significant and spill out of work into costs on the rest of my life, such as losing time with family and friends, lack of sleep, getting sick, and gaining weight.

My struggle of late, then, has been to recognize that paying down organizational debt is a real and legitimate part of my job, just as paying down technical debt is a real and legitimate task in executing particular projects in my job.  I wouldn't say that I've found a clear method of managing my organizational debt yet, but, as they say, the first step to solving a problem is to recognize it clearly.  I have, however, made some steps forward that have helped immensely.

For example, I am now both publishing papers frequently and traveling frequently.  Both of those have major time lags involved.  For example, in publishing a paper I first submit a manuscript, then get reviews, send a revision, repeat until accepted or rejected, wait for publication, post and publicize; along the way I also need to obtain release permissions internally and sometimes also from funders.  This process can take years, and if I've got a bunch of papers in flight it's easy to lose track of a deadline and create a failure or crisis where none was needed.  Travel is similar: registration, booking flights, hotels, cars, sending in pre-expenses, actually taking the trip, waiting for expenses to register in the reimbursement system, submitting requests for reimbursement, and actually getting reimbursed often spans many months.  To address these problems, I've created a spreadsheet for each task which lets me have a "dashboard" view of what's going on overall with regards to that area of responsibility.  For example, here is part of the 2015 sheet from my publications spreadsheet:
Publications not yet available have their names tastefully redacted.
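The "dashboard" idea is simple enough to sketch in a few lines of code. The following is only an illustration of the kind of flagging logic such a spreadsheet provides; the stage names, record fields, and 90-day staleness threshold are my own assumptions for the example, not the actual format of the spreadsheet above:

```python
from datetime import date, timedelta

# Sketch of a publication-pipeline dashboard: flag papers whose status
# hasn't advanced recently, so nothing silently stalls in review.
STALE_AFTER = timedelta(days=90)  # illustrative threshold

def stale_items(papers, today):
    """Return titles of papers whose stage hasn't changed in STALE_AFTER days."""
    return [p["title"] for p in papers
            if p["stage"] not in ("published", "rejected")
            and today - p["stage_date"] > STALE_AFTER]

papers = [
    {"title": "Paper A", "stage": "in review",    "stage_date": date(2015, 3, 1)},
    {"title": "Paper B", "stage": "published",    "stage_date": date(2015, 1, 15)},
    {"title": "Paper C", "stage": "revision due", "stage_date": date(2015, 11, 20)},
]

print(stale_items(papers, date(2015, 12, 1)))  # ['Paper A']
```

A real spreadsheet does the same thing with conditional formatting, of course; the point is simply that a single pass over the "in flight" list surfaces exactly the items that need a nudge.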
It's not a panacea.  Nothing is, for me, when it comes to technical or organizational debt, because I'm not willing to pay the additional cost of not taking on debt.  Moreover, I'm sure that my approaches will have to change periodically, as my career continues to evolve.  Solutions like these help, however, and recognizing the importance of (at least sometimes) tracking these moving targets is, I think, a useful step forward towards more improvements in my quality of life, both at work and also at home.

Sunday, November 15, 2015

Signs in the Snow

When I fly for business reasons, I always try to get a window seat.  I have always delighted in the view from the airplane window, at all the shifted perspectives one obtains when looking at the world from on high.  As long as I still love these sights, I feel, I know that I have not become too jaded, and my sometimes-strained soul has not yet died (I speak in the metaphorical sense, of course). When I fly with family, it's different now: Harriet almost always claims the window, and despite my longing to displace her, I would rather share the joy than steal it.  When I am alone, however, even on my shortest and busiest flights, I will track the sights at least occasionally, and I often shutterbug my way through takeoff and landing, so happy that our minor electronics are once again officially allowed to be active at those times.

One of the things that always fascinates me most is the way the landscape radically transforms with the seasons.  Moreover, despite what one might think, I find that it is winter that most brings out the texture of the land.  When snow is on the ground, its topography is highlighted, leaping out in dark lines on every vertical and slope.  With those thoughts in mind, I present to you, dear readers, an album of interesting forms I've seen, which I think of by the title "Signs in the Snow":

Album: Signs in the Snow

Thursday, November 12, 2015

Explaining tornadoes to a preschooler

I've had a bit of a nervous evening tonight, thanks to one of Iowa's signature weather events.  I knew there was going to be rain today, but was unprepared for what to do when my phone unexpectedly buzzed with a county emergency alert as I drove to pick Harriet up from school.  I looked down and saw a tornado warning and a small piece of my brain flipped out: "What am I supposed to do if I'm in a car?" "Should I try to pick Harriet up from school or not?" "What happens if I can't pick her up on time?" 

At the next light, as the rain suddenly picked up and began to pound furiously against my windshield, I scrabbled to search on my phone and was rewarded with the official word from NOAA: "There is no safe option when caught in a tornado in a car, just slightly less-dangerous ones."  I scanned channels on the radio and found that while most were doing business as usual, the local country music station had switched over to nothing but live storm reporting.  I cranked the volume to be able to hear it over the rain and inched forward into the pitch black of dark storm and early nightfall, barely able to see the cars packed all around me, trying to figure out whether I was doing the right thing or the wrong thing.

Since moving to Iowa two years ago, I've had two previous encounters with tornado alerts. The first time, the sirens went off while all three of us were at home.  I was actually on a Skype call with colleagues back in Boston at the time, working on a paper, and I think that they were more freaked out than I as we relocated down into the basement for the duration.  The second time, I was in the air, returning from Boston along with Harriet and Ananya, and the plane made three attempts to land in heavy, free-falling turbulence before giving up and depositing us in Peoria, Illinois.  That was a very, very long day.

Today, I splashed through flooding streets and eventually found a big hand-written sign on the school door telling parents to come find the after-school program down in the basement.  Parents, children, and teachers were all packed in together, and the slightly nervous buzz amongst the adults helped stimulate the children to a high degree of frantic energy, not really understanding why today was different but knowing that it was so.

After 20 minutes or so, the storms had passed and we began to disperse, but the time had left its imprint on our evening, especially since Harriet wanted to dawdle and hang out at the school and I was very eager to get out of the weather and home, not knowing how long the gap might last.  All the way home, Harriet and I talked about thunderstorms and tornadoes, and I tried to tread a careful line between ensuring that Harriet understood that tornado watches are serious business but also not making her frightened.  Partway there, she had an excellent inspiration and asked me to find a song that would make tornadoes and thunderstorms go away, and I doggerelled up the following:
Tornado, tornado, don't come around our house!
Tornado, tornado, just go away you louse!
Real peace of mind for Harriet, though, didn't come until we were sitting down at the dinner table and got the internet involved.  First, we went searching for other tornado songs, finding some excellent broken-heart-country kinda stuff and a couple of wacky amateur things that Harriet liked even better. As she got into it and began to relax, I offered to show her how we were kept safe from tornadoes, and so we ended up looking at the NOAA radar:
Tonight's line of thunderstorms, observed from home after they passed us by.
The colors help, that's for certain, and I think she really got it, since she started telling me things like "We want to be in the blue and green, not the red."  I showed her pictures of Doppler radar stations like the one in Davenport and talked about how the people there had really good computers and were always watching to make sure they knew when tornadoes might appear, so they could tell us if trouble was coming 15 minutes ahead of time, and my phone would buzz and let us know if we needed to go to the basement.  We navigated all around the region, looking at different radars (Harriet expressed much sympathy for Chicago, which was going to get hit next), and looked at satellite images from space as well as pictures of weather satellites.

And then Harriet asked to look at the weather radar for Madagascar, which she's been into lately, and we found there wasn't one, or at least not one whose images are posted online.  That made me really think about the infrastructure involved in having weather safety systems like we have, and the powerful role of government and civil society, not just in keeping me physically safe, but also in providing me with peace of mind, enough even to pass it honestly to my three year old daughter.  I did not have to tell any comforting lies tonight, or hope and pray for safety: I know down deep in my heart that men and women are doing simple professional work to establish this remarkable network of protection across the land.  My eyes teared up a little, as they always do when I contemplate the non-glamorous wonders of civic society, and I know why I believe in the importance of good government.

Saturday, November 07, 2015

Jack Splat!

Last night the Iowa Children's Museum presented us with a wonderful surprise. We'd made a plan to go there for a Friday evening outing after school, and we walked unsuspecting into the remarkable event known as Jack Splat! In the big "main street" open space at the heart of the museum, where normally one could access the music room and the post office, tarps were laid down and a cleanup crew stood below. Above on the balcony lurked a great swarm of over-aged pumpkins, a week past Halloween and ready to meet their end.

Our timing was perfect: Isaac Newton was just explaining to a rambunctious audience how his first law of motion meant that a falling pumpkin, once in motion, would keep moving until it encountered an opposing force---"The ground!" cried the children, and the pumpkin flew down to meet its fate.
Isaac Newton (resplendent in toilet-paper-roll wig) and lab assistant preparing to launch a pumpkin.
The physics lessons continued, accompanied by redolent meaty splats of demonstration. Before each pumpkin flew, the throwers read out its name and the name of its donor, as well as the specified method of execution (e.g., roll from the ledge face first, backflip up in the air).  The children cheered and chanted (though one little boy near us was quite upset, and asked why the people hated pumpkins so much), and the rain of gourds continued for nearly half an hour, quite challenging the cleanup crew to keep up with all the mess.
Getting ready for the next bombardment.
I much enjoyed the unexpected show, and was reminded of a similar but much smaller scale yearly event arranged by the undergraduates at MIT.  A good and rather cathartic end to a Friday evening.

Friday, November 06, 2015

The Power of Ignorance

When my friend and colleague called herself clueless this afternoon, it triggered something in my mind.  We've been working on a paper together, and since I'd taken the first pass I'd put everything in LaTeX, hoping to avoid having to figure out how to manage citations and such in Microsoft Word.  I hate writing in Word, and as a computer scientist I usually get to avoid it, opting instead for the nitpicking and precision control that LaTeX offers me. When I am collaborating with biologists, however, LaTeX might as well be Martian or Haskell to many of them, and we often default back to Word.  Selfishly, I'd avoided that, and just assumed I'd end up taking feedback notes or snippets of text and incorporating them myself.

But I had aimed too low, and here my friend surprised me.  Rather than complain or take the easy route, she asked me to put the document into Overleaf, an online LaTeX interface that she'd just learned about, and she did her editing there, including using the LaTeX todo notes package that I've been using for tracking commentary in the document.
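For readers who haven't encountered it, the todonotes package works roughly like this (a minimal sketch; the note text here is invented for illustration):

```latex
\documentclass{article}
\usepackage{todonotes} % margin notes for tracking commentary in a draft

\begin{document}
Aggregate programming simplifies distributed design.%
\todo{Is this claim too strong for the intro?}
\todo[inline]{Add the missing citation in this section.}

\listoftodos % optional: a summary list of all open notes
\end{document}
```

Notes appear in the margin (or inline) of the compiled PDF, which makes them easy to spot during revision and easy to strip out before submission.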

And then she called herself a "clueless biologist," as she asked me questions about this fearsomely complex new technology she's been voluntarily educating herself on.

Those two words say something that is very important and also sad about the way that science often operates, and I think about our larger society as well.  My friend is largely ignorant about LaTeX, in the sense that she lacks knowledge, but "clueless" is a rather negative view of ignorance.  Adding "biologist" lumps her into a category, othering her and tying that to this "clueless" expectation---a stereotype that I'm quite familiar with.  I don't like it, though, because that phrase sets up an expectation of an "us vs. them," Men are From Mars/Women are from Venus, Computer Scientists are from LISP / Biologists are from S. cerevisiae sort of oppositional dichotomy.  That makes us all smaller, I feel, because it divides us and suggests our minds are alien to one another, and therefore that we ought not to attempt to learn from each other so much.

Instead, I think that we should be celebrating and embracing the power of our ignorance.

By this, I do not mean that we should avoid knowledge.  Knowledge is wonderful and empowering, and recognizing one's ignorance is a first step to doing something interesting involving others who are not ignorant in that same area. The Renaissance Man is dead---quite dead---and Joy's law is how we spend our lives, in a world where there are so many interesting things to know about, some important and some just fun for somebody.  We are all more ignorant than we can know, and not just in a snotty Socrates one-upmanship way of looking at it.

One of the hardest things I had to learn while I was a graduate student was how to say, "I don't understand" and "I don't know."  I learned it from Hal Abelson, the professor who always asked me the hardest simple questions I have ever heard.  Hal taught me that saying "I don't understand" did not have to be an admission of weakness.  It could simply be the truth, and then what happens next depends on why you don't know.  When I was talking to Hal, it was usually the case that Hal saying "I don't understand" was an indicator of some fundamental flawed or overlooked assumption in the thing that I was trying to explain to him.  I learned a lot about my own research from Hal's admissions of ignorance, and I also learned to stop being afraid of lacking knowledge.

I'm still learning that. It's easy in our competitive world to fear that admitting ignorance is the first step to losing out to other people who are better at putting up a front, and I still struggle with that. But fearing ignorance is almost as bad as being proud of it, and I prefer to avoid it when I'm not too panicked to consider the bigger picture that I'm living in.

The fear of ignorance is competition, but the power of ignorance is partnership and teamwork.  I'd much rather live in the second world, and I hope that I can be sufficiently wise to help encourage it for both myself and my compatriots.

Monday, November 02, 2015

Whole lotta ruttin' going on

That's what the highway sign said this morning:
It brought a smile to my face even as I was duly warned of the frightful danger that I face on the roads of Iowa.  This is one of the most intense states in the nation, when it comes to collision risk: currently third, with a 1 in 68 chance of hitting a deer each year.  Frankly, this number really blows my mind, especially coming from Massachusetts where the odds are an order of magnitude lower, and Boston in particular where you might as well not even bother thinking about the possibility.  I've had a recent close call of my own already, a couple weeks back on a date with my wife, when a deer darted across in front of us and I had to jam on my brakes to avoid an accident.

In Iowa, the distinction between countryside and city is much sharper and closer than in New England, and it seems to me that it's a virtually ideal environment for deer.  Deer flourish on the edges of forests and in sparse woodlands, and those are things that Iowa has in great amount.  Drive through the countryside, and you see something that I had never really known before I moved out here into the Midwest: an entirely rural-industrial environment.

Growing up in New England, and with my father's stories of growing up in Colorado, I'm used to the idea of "rural" meaning lands where little or no people live.  Walk through the woods of Maine or New Hampshire or even Massachusetts, and you will first of all find that you have a very hard time walking through those dense pine woods at all, and second that they often extend in all directions for many miles, a fearsome wilderness broken only by the old stone walls of centuries-abandoned farms. The woods of Iowa, by contrast, are linear affairs, in which one would have a rather difficult time getting lost at all.  They curve along contours of land, following streams and rivers, in those narrow places where the land is too steep to be productively arable.  Elsewhere, the land is mostly farms, broken by an apparently arbitrary fractal dispersion of cities, towns, and factories.  Suburbs don't exist the way I knew them in New England, where the city slowly peters out into nothingness over the course of many miles: Iowa City stops about a mile east of our house on a straight-edged line, an instant transition from extremely dense developments to fields of corn and soy (rotating on a 3-year cycle).  Even in the most rural areas I've been, you're never more than a couple of miles from a dense aggregation of a few hundred people clustered together in a tight little town.

Biodiversity is low, in an environment like this, but species that do well on the boundaries with humanity, like deer and rabbits, flourish and expand.  All the Halloween pumpkins in our neighborhood are attacked and eaten by marauding squirrels.  And this transplanted specimen still feels for roots, sorting out my place in this rich Midwestern soil.

Monday, October 26, 2015

An unknowing inheritance: BBN's stop and go history in genetic engineering

When I joined BBN back in 2008, one of the new things I brought with me to the company was my research in synthetic biology.  I was a starry-eyed and naive recent Ph.D. graduate, and it was one of my little exploratory sidelines, which would not expand into a full-scale line of research for another two years, blooming as my AI work slowly withered away into neglect.  Nobody at BBN was even thinking about synthetic biology at the time, and nobody I was working with had any institutional memory of such research being done at BBN before, and so I assumed that must be so, and indeed for most intents and purposes it was so.

In fact, however, BBN has been a significant player in the work of genetic engineering at least twice before, as I now know.  One of those episodes I learned about several years ago, and it is not the brightest chapter for the community.  The other, however, I only learned about a short time ago, as I prepared to give a talk on the new SBOL 2.0 standard for encoding genetic designs, and it makes me both proud of my institution's history and amazed that it has somehow dropped out of the company's memory.

The nearer and less proud episode was BioSPICE, and it haunts my every step as a non-lab-centric researcher.  As best I understand the project (I was still a grad student chasing strong AI), in the early and heady days of the word "synthetic biology," a bunch of the leading researchers in the field made a run at the big goal of predictable simulation and engineering of organisms, thinking at that point that they already had a sufficient critical mass of good tools and knowledge to take something like a straight-up electrical-engineering-style approach to the problem.  Thus, BioSPICE, a big DARPA-funded project to try to build the equivalent of the SPICE tool for simulation and engineering of electronics, which started with much fanfare in the early 2000s and much more quietly folded up shop a few years later.  I had known about some of the academic side (quite distantly at the time), but only later came to learn that BBN had been significantly involved in some way---I'm still not quite sure how, since the few stories I've gathered don't seem to correspond to what searching online turns up.  Years later, when I would open my mouth to talk about the promise of model-driven design, it was often BioSPICE that dogged my heels, and fueled cries of, "We know that doesn't work, just remember BioSPICE!"  I have fought that history hard, all the way to the last few years when we've been finally able to start producing evidence that we really can predict biological circuits from their component parts.

The other episode of BBN's involvement with genetic engineering is much older, long before my time.  Back in 1982, when I was only four years old and the genetic engineering revolution not so much older, BBN began development of GenBank, probably the most important repository of biological information in the world.  What is it, and why is it so important?  GenBank stores genetic sequence information: it's where pretty much all the important scientific information about genes and genomes gets stored, one way or another.  BBN, with subcontracting help from another apparently unlikely collaborator, Los Alamos National Laboratory, put it together and ran it for its first few years of existence, as it became established and started gathering information.  Eventually, as it became less a research project in and of itself and its contents became more and more important, it moved to curation by the NIH, who still manage it to this day, as an exponentially growing resource made publicly available to all of humanity.  Perhaps it's not quite as big a deal to work on as the internet or email, but pretty close, in my book.

Somehow, though, we seem to have almost entirely forgotten this history as a company.  It's not trumpeted on the list of accomplishments on our front page, nor bragged about in the "history of BBN" materials that people pass around. The only name I've been able to find so far associated with the project at BBN is Howard Bilofsky, who apparently spent 17 years at BBN before leaving in 1990 for a long, distinguished, and apparently ongoing career in the biotech industry.  Someday, I would love to look him up and learn a bit more about the hidden corners of our corporate history.

GenBank and BioSPICE, triumph and failure.  And now a third wave of biology at BBN, with me, trying to navigate these waters once again, as best my limited scientific sight can guide me.

Friday, October 23, 2015

Is academia really just a huge competition?

Another question that really made me think was posed last night on the Academia site of StackExchange, and once again I'd like to share my answer with you, my dear readers.

The question was simple in its essence, yet deep and rather challenging:
Is academia really just a huge competition?
I started writing an answer several times before I finally found a direction where I could really believe what I was saying.  The result was this statement, which I think reflects some difficult passages of my own over the years, back and forth along the tension between cooperation and competition:
You've asked a question that is both very important and very difficult, as well as one that is likely to draw different answers from different people depending on their own experiences in academia. 
This is because there are both competitive and cooperative aspects to academia. Different people take different strategies with respect to the balance between these two, and that affects their communities as well, so that the mixture of competition and cooperation that you encounter will also radically differ between different academic communities. 
Some of the key factors for inducing cooperation are: 
  • Science is hard.
  • Working together, people can accomplish things that they cannot possibly accomplish alone.
  • Cooperation in a team gives you an advantage when competing with other teams.
  • Many people enjoy working together in teams, and this is just as true for science as it is for any other human endeavor.
  • Scientific discovery feels awesome and it can be really fun to share that feeling with other people.
Some of the key factors for inducing competition are: 
  • Inherent conflict of ideas: when theories compete, people often become polarized and begin competing based on the "team" they support intellectually.
  • Limited resources: you've got a good idea, but a lot of other people have good ideas too, and there is not enough funding to support all of them fully: some people will not get what they want. Likewise, the Hubble space telescope can only point at one thing at a time, and there are a lot more things people want to point at than time to point at them.
  • Explicit competition set up by external agencies. For example, DARPA will sometimes make scientists in the same program compete with one another, and the loser gets their funding cut off.
  • Many people are just plain competitive, and want to "win" over other people in various different ways, and this is just as true for science as it is for any other human endeavor.
Bottom line: just like everything else, academia can be a competition, and everyone faces some aspects of a competition. But it's not just a competition, and I feel sad for anyone who experiences it in that manner. 

Sunday, October 18, 2015

Racism, fond memories, and toddler education

As I was reading Harriet her bedtime stories tonight, I was struck once again by a thing that greatly pains me.  Many of my fondest childhood memories are laced with rather awful racism that I simply failed to be aware of.  Case in point, tonight one of the books we read was To Think that I Saw it on Mulberry Street. This book is a simple and delightful Dr. Seuss tale of a child's fantasies of what he saw while walking home, building from a simple horse and wagon to a fantastical parade.  And there, on the second to last page, is this:
"A Chinese man who eats with sticks" --Dr. Seuss
Apparently, Dr. Seuss thought that Chinese-Americans were just as unusual a freak-show as a man with a 10-foot beard, a magician pulling piles of rabbits from a hat, and two giraffes and an elephant towing a brass band down the street.  And so we get this image, on which I can count at least six blatant pieces of racism.  Worse yet, this is apparently the post-1978 revised edition in which the racism is toned way down: he's a "Chinese man" rather than "Chinaman" and he's no longer wearing a pigtail and painted bright yellow.

OK, I know that Dr. Seuss is well known to have done some awfully racist things over the years (e.g., this cartoon condemning Japanese-Americans during World War II).  I know this.  But it burns me up that I had no idea that this monstrosity was living inside a favorite childhood book.  In other words, it's not that Dr. Seuss was making racist drawings, but that I didn't remember the racism at all. We bought this book (well, I bought this book) for Harriet quite early on, on the strength of my fond memories, and I was shocked when I got to this point.  I also noticed that the police were Irish, and I was a little bit dubious about the Rajah riding the elephant.  Not being familiar enough with the subject matter, I wasn't sure if the Rajah was racist or just archaic (like a knight in shining armor or a lady in a wimple), so I asked my wife, who is South Asian.  Her answer? "Totally racist."

This leaves me with two dilemmas that I struggle with.  First, what does this say about me, to not have known I had such racism in my education?  Clearly there's at least a bit of "fish don't have a word for water" going on.  I did not have this racism called out to me, and thus I didn't realize that it was anything to notice.  It's there in many other things I loved as well, like If I Ran the Zoo (another Seuss), The Jungle Book, and Tintin (oh my goodness, Tintin).  I loved these things and, if I am honest with myself, still do.  My favorite Jungle Book story of all time is "Kaa's hunting," and now I cannot read its descriptions of the Bandar-Log monkeys without wondering if they are allegorical for Kipling's views of India.  Tintin in America is practically hallucinogenic in its kaleidoscope of stereotypes and disrespect for, well, everything, and I still would read it again if I had a copy here in front of me.

And that leads me to the second struggle: do I share these things with Harriet or do I censor them? Mostly, there's an obvious third path that avoids the issue: there are so many good things out there, that I can simply choose to select the ones that I find less problematic.  But what about the ones I find out afterward, like in Mulberry Street? Tonight, I didn't read the line.  I broke the rhyme and went straight to the big magician doing tricks.  Other times, I read it through.  Sometimes, I point things out to her and critique them ("this picture is being mean"), and sometimes I do not.  Mostly, I am uncomfortable and simply shift my strategies back and forth.  I find some of the advice out there about liking problematic media to be useful, but it's not the end of the story and I still have not found peace.

Saturday, October 17, 2015

SBOL 2.0, governance, and Jake's self-perception

This past summer, one of the most significant scientific milestones I've been involved with is the publication of the SBOL 2.0 standard for representation of biological designs.  What it's all about is being able to better describe and exchange information about the genetic constructs and similar such systems that people are trying to build.  Perhaps the best way to describe it is with this diagram I prepared for a talk, comparing SBOL 2.0 to previous standards:

FASTA is about as bare-bones as it comes: pretty much just listing out the DNA sequence that you want.  GenBank lets you annotate that sequence with descriptive information about what the different parts mean, and SBOL 1.0 lets you describe the structure of a design hierarchically in terms of annotated sequences that get combined together as "parts" to make bigger designs.  SBOL 2.0 lets you talk about function as well, describing the way that these parts interact with one another to create the overall behavior of a design.
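To make that ladder concrete, here's a toy illustration of its bottom rung (the record name and sequence below are invented, not from any real design): a FASTA record is nothing but a ">" header line plus raw sequence, with no room for annotation at all, which is why parsing it takes only a few lines.

```python
# A made-up FASTA record: just a name and a bare DNA sequence.
fasta = """>my_promoter_gfp_construct
TTGACGGCTAGCTCAGTCCTAGGTACAGTGCTAGCATGCGTAAAGGAGAAGAACTTTTCA
"""

def parse_fasta(text):
    """Minimal FASTA parser: returns {record_name: sequence}."""
    records, name, seq = {}, None, []
    for line in text.strip().splitlines():
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(seq)
            name, seq = line[1:].strip(), []
        else:
            seq.append(line.strip())
    if name is not None:
        records[name] = "".join(seq)
    return records

records = parse_fasta(fasta)
print(list(records))  # ['my_promoter_gfp_construct']
```

Everything that GenBank, SBOL 1.0, and SBOL 2.0 add---annotations, hierarchical parts, functional interactions---has to live in richer structures layered on top of sequences this plain.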

Conceptually, it's fairly simple, but in practice it took several years to work out and the arguments are not yet over.  The document that we produced is more than 80 pages long, and we're still tinkering with bits and pieces as we try to understand all of the consequences of what we've built.

SBOL is heavy on my brain right now because for the past week, I've been at the COMBINE meeting, where the communities for SBOL and a number of other biological standards meet up to try to improve their systems, work on interoperability, etc.

This is still not something I ever thought I would be doing with my life.  Even now, in my prejudicial mind, standards design is still something done by grey little people who care passionately about trivial and boring things.  I struggle with this, because I look at my work in this area and simultaneously feel that it is highly important and mind-numbingly stultifying to anybody who isn't actually in the room arguing passionately about the potential long-term consequences of adding a single arrow to a diagram.

A case in point: one of the things that I'm most proud of this week was the updated governance document I drafted, and my mediation of discussion on this document, which helped tune it to become widely accepted; the updated version now appears well on its way to official approval by a formal community vote.  So, apparently I am proud of work I've done on adjusting the methods for making decisions regarding an experimental standard for interchange of information about biological designs that will allow faster prototyping of improved systems for biomedicine, biomanufacturing, etc.  That's at least five levels of separation from anything that really affects the larger world. Looked at in that light, this is clearly the very definition of obscurity.  And yet, let me spell it out in another way...
  • Good governance, which gets openness, power, and decision-making right, is critically important for the health of a community, and a number of little warning signs have indicated that the SBOL community needed to adjust its governance to match the way the group has developed and grown.
  • If the SBOL community governs itself effectively, then it will make better decisions that are more likely to lead to a useful and effective standard.
  • If the SBOL standard works well, it will make it a lot easier for people to develop good biological engineering tools.
  • Those biological engineering tools will make it a lot easier to safely and predictably engineer with and for living organisms.
  • Used responsibly, those capabilities can help make all of humanity healthier and safer, as well as improving our ability to manage our environmental impact on a global scale.

This nail I've driven in is very small and unimportant, almost certainly, and yet it matters.  It matters a lot, and not at all, all at the same time.  And I suppose that's just the way the world works, on a planet with seven billion interconnected and increasingly technologically powerful individuals.  Our civilization is remarkably strange and obscure in its operation, and I'm glad when I find satisfaction in the parts I play.

Thursday, October 08, 2015

Publication delays ARE aimed at manipulating impact factor!

A few months ago, I wrote a post with a question: Are publication delays aimed at manipulating impact factor?

Today, I have an answer to that question: yes.

A recently published article, "Editors' JIF-boosting stratagems – Which are appropriate and which not?" (h/t RetractionWatch) investigates strategies that journals have been using to boost their impact factor and explicitly calls out what it calls the "online queue stratagem."  The article is paywalled, so let me summarize here.  In addition to reviewing some of the better-known and clearly unethical practices used by some journals (e.g., forcing citations on authors, citation cartels), the paper carefully dissects the effects of having a long "online early" period of publication, finding four main effects:

  • Papers accumulate citations before "official" publication (multiplying by ~1.5 to 2)
  • Citation rates typically peak 3-4 years after publication, so shifting the time selects for a better citation date (adding another ~50%)
  • Queue order can be manipulated to publish the papers picking up the most citations earlier, (adding another ~30%)
  • Calendar-year boundaries mean that papers in early months count more than papers in later months, so strategic organization of early-month issues can further boost citations (adding another ~30%).

All of this adds up to around 5-fold potential distortion in impact factor.  Since the dynamic range of most journals is only around 0.5 to 10 anyway and even the very highest impact factor journals top out at ~50, this renders that most precious number completely useless.
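For the arithmetically inclined, here's that five-fold figure spelled out (a back-of-envelope sketch using the paper's rough multipliers, which are estimates, not exact values):

```python
# Compounding the four queue-related effects described above.
# Each factor is the article's rough estimate of the citation multiplier.
early_online_citations = 2.0   # ~1.5-2x: citations accumulate before "official" publication
peak_year_shift        = 1.5   # ~+50%: shifting publication into the 3-4 year citation peak
queue_reordering       = 1.3   # ~+30%: publishing citation-magnets earlier in the queue
calendar_year_boundary = 1.3   # ~+30%: front-loading early-month issues

total_distortion = (early_online_citations * peak_year_shift
                    * queue_reordering * calendar_year_boundary)
print(round(total_distortion, 2))  # 5.07, i.e. roughly 5-fold
```

Even with the conservative 1.5x first factor, the product is still around 3.8x, which is why a non-manipulating journal can't simply ignore what its competitors are doing.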

Now, it's possible that many journals aren't deliberately and strategically manipulating their queues, meaning they'll only get about a 2x boost in impact factor from queuing.  So what?  It still means that impact factor is going to be highly distorted and basically only good for distinguishing journals into three categories: "glamour journal", "normal journal", and "ignored journal" (less than about 0.3).

Ironically, the article itself is dated February 2016.

That's it: it's clearly time to adopt the wise strategy of my favorite satire journal, the Proceedings of the Natural Institute of Science.  Their current impact factor? "Leadership"

Wednesday, October 07, 2015

Scale-free distribution of payoffs in science

One of the things I've been enjoying these days has been answering questions on the Academia site on StackExchange.  This question-and-answer site is part of the vast network of Q&A sites that have flowered out of the wildly successful StackOverflow, which is pretty much the best source for coding help on the internet.  The model is that people ask questions about the topic, e.g., academia, and other folks turn up and provide answers, and then you get or lose Fake Internet Points depending on whether the crowd thinks it's a good answer.  It's surprisingly effective and also, for me at least, pretty enjoyable and kinda addictive.

Anyway, I answered one this morning that made me think a lot, and I thought that I might share my thoughts here as well.  The question was simple, fundamental, and ill-posed: "What is the distribution of payoffs in research?"  Basically, the person is wondering whether every experiment is a roughly equivalent step forward, or whether some are much more valuable than others, and if so whether there's some sort of power-law relationship between topic, funding, and value of result.

This is ill-posed, because the whole notion of "payoff" is extremely vague and probably the wrong question to ask, but it really made me think.  My response, which I'd like to share with you, was this:

There's a vast amount of ill-definition and uncertainty wrapped up in your question... and yet despite that, the answer is almost certainly yes, there is a power-law distribution.
I'm going out on a limb a bit here, because I'm not building on any published analysis that I'm aware of. However, a little analysis of limit cases and fundamental principles can take us a long way here. Let us start with two simple and relatively uncontroversial statements:
  1. Better experimental design leads to better results. It seems self-evident that if you make a bad choice in designing an experiment, it's not going to get you the interesting results you want. At the micro-scale, some choices are clearly better than others, and some are clearly worse.
  2. Sub-fields appear, expand, shrink, and die. As I write this, CRISPR research is hot, and a lot of people are finding interesting results there, and accordingly that field is rapidly expanding. Nobody is doing research on the luminiferous aether because it's been discredited as an idea. Nobody is trying to prove that it's possible to generate machine code from high-level specifications because Grace Hopper did that in the 1950s, when she invented the compiler, thereby initiating what is now a fairly mature and stable research area.
So clearly, no matter how one defines "payoff," any sane definition will see a highly uneven distribution of payoffs both at the micro-scale of individual experiments and at the fairly macro level of sub-fields.
Finally, we need to recognize that "significance" is a matter not only of objective value, but also of communication through human social networks. This means that the same result may have wildly different impacts depending on the methods and circumstances of its communication. The history of multiple discoveries in science is ample evidence of this fact; one nice illustrative example is the way in which Barbara McClintock's work on gene regulation was largely ignored until its later rediscovery by Jacob & Monod.
So, we have variation and we have interaction with human social networks, which tend to be rife with heavy-tailed distributions. All of this says to me that it would be remarkable if there were not some sort of power-law distribution for pretty much any plausible definition of impact, significance, and investment. For these same reasons, I think it would also be surprising if one can make any more than weak predictions using this information (e.g., "luminiferous aether research is unlikely to be productive", "CRISPR is pretty hot right now").
And the devil, of course, is in the details...
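As a little postscript here on the blog (my own toy illustration, not part of the StackExchange answer): if payoffs really are heavy-tailed, a tiny fraction of results should capture a wildly disproportionate share of the total. A quick simulation with a Pareto distribution shows the effect:

```python
import random

random.seed(42)

# Draw 10,000 hypothetical "research payoffs" from a Pareto distribution
# (shape parameter chosen arbitrarily for illustration).
alpha = 1.5
payoffs = sorted((random.paretovariate(alpha) for _ in range(10_000)),
                 reverse=True)

# How much of the total payoff do the top 1% of results capture?
top_1_percent_share = sum(payoffs[:100]) / sum(payoffs)
print(f"top 1% of results capture {top_1_percent_share:.0%} of total payoff")
```

With a tail this heavy, the top 1% typically account for somewhere around a fifth of everything, and in an unlucky (or lucky) draw, far more. Nothing remotely like that happens with, say, a normal distribution.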

Saturday, October 03, 2015

Tribute to driving in Germany

I'm sitting in the Frankfurt airport right now, having just finished driving two hours on the Autobahn up from Schloss Dagstuhl, in Wadern near the French border.  As an American, I'm used to fairly titanic road networks, but driving on the Autobahn feels different to me: my impression is that while Americans use our roads, Germans really love their roads.

Out in the gently winding hills and valleys of Western Germany, forests and fields flash past traffic freely flowing at 100 miles per hour.  The gentle curves are well designed to encourage speed, and many people take good advantage of it.  Yes, it's true, on much of the Autobahn system there is simply no official speed limit (though you'll still get pulled over if the police think you're driving dangerously), and even where there is a limit it usually restricts you only down to 130 kph (a little over 80 mph).

In my little bitty economy rental car, I cruised along comfortably at 110 mph or so in 6th gear (don't even try asking for automatic in Germany), its happy German engineering not making the least complaint about the speed.  At home, my faithful Toyota starts to get very loud and quite unhappy by the time that I reach 85.  Even so, happy-looking people in bigger cars rocketed smoothly past me at significantly higher speed, bound for who knows where at the highest speed available.  And all I know is that my head has got this song on repeat, and I invite you to join along with me and sing:

Why I love iGEM

Last weekend was the annual iGEM jamboree---that is, the International Genetically Engineered Machine (iGEM) Competition. I poured about 60 hours of my time into the contest over the course of 3.5 days, and by the end I was exhausted, both physically and mentally, but feeling absolutely elated and on top of the world, raring to go for another one in 2016.

What had I seen, and why was I so excited?  Well, iGEM is a magnificent and unique event, a gathering of students from every continent, from high school on up, all driven by a passion for biological engineering and simply overflowing with creativity.  Each team spends the summer working together on a project that they create, and in the fall they come together to have a big party, where everybody gives talks on what they've done and the best few are recognized in front of everybody for their superlative accomplishments.  There's lots of silly things, lots of over-ambitious ideas that don't get too far, and lots of nice little steps and learning by the students.

And in the middle of it all, some damned good science gets done as well.

Last year, I co-founded a new track at iGEM focused on measurement.  Yes, we're back to that again: my obsession with terribly unsexy rulers.  We had some very good teams last year, and this year again there were a bunch of excellent projects in the measurement track.  And this year, one of those projects stood out head and shoulders above all the rest.

The team from William & Mary, a small but long-standing and excellent public college in Virginia, chose to focus on an important but subtle problem: quantification of noise in gene expression.  Building on recent work in the area, they dug into the problem and ended up with a simple and easy-to-use kit for measuring this noise, then applied it to quantify noise for a few of the most widely used biological components in the iGEM parts registry.  Very deep and very geeky, but it matters a lot.  If we want to have safe and reliable genetic engineering, we need to be able to predict what will happen when we modify an organism, and this strikes right at the heart of that problem by measuring predictability.

But that wasn't all: they also worked with their county school system to develop a curriculum for synthetic biology.  It's magnificent, and you can get a copy for free online.  Inside this 80-page document, you can find 24 age-appropriate activities, from "DNA twizzlers" for 1st graders to "Monster Genetics" a couple years later (fire-breath is a dominant trait, but cyclopses are recessive), building all the way to adult-level work in high school like PCR amplification of DNA and bioethics analysis.  The interactions with teachers really show, as the lessons are not only pretty but also give clear goals, a materials list, and expected cost per student (usually a whole class can be supplied with just a few dollars of groceries or arts & crafts supplies).  Even more remarkably, teachers have already begun enthusiastically adopting it, both throughout their county and in other states and nations.

The William & Mary team gave clear, understated presentations that simply let their work shine through, and the whole community recognized it, ultimately first giving them a chance to present as a finalist in front of all the thousands at the convention center, and finally awarding them the competition's top prize (along with a bunch of others as well).  This simple yet deep set of work comes from a team whose school doesn't even break the top 100 in US News' ranking for biology, and shows the power of careful and thoughtful work in science.  The Washington Post may have been too confused to even mention them, but their university is quite elated, and its staff took the time to understand and write a clear and accessible article about their project.

To me, all of this is a vindication not just of the work I've put in organizing and promoting measurement at iGEM, but of the entire scientific process.  Good things can come from unexpected places, and sharp minds thinking careful thoughts can be recognized and receive the recognition they deserve.  Yes, there are problems in the scientific world---quite many, in fact---but this is why we must fight to preserve and promote the scientific ideals, and to keep making that world more diverse, more inclusive, and more able to recognize and promote the potential to improve our world and make a difference.  This is why I love iGEM, why I'm proud to be involved and for what part I've had in helping to enable this, and why I'll be back again for more in 2016.

William & Mary, iGEM 2015 winners, with the Measurement Track committee

Wednesday, September 30, 2015

Aggregate Programming!

It's finally out: our IEEE Computer article, "Aggregate Programming for the Internet of Things" (free preprint) is up and online where all can get it for free.  Don't be fooled by the name: this isn't really just about our increasingly networked possessions ("the Internet of Things"), it's a much more general paper.  In fact, this is the first place we've really put all the pieces of our last few years' distributed systems research together, into a generally accessible article that clearly introduces a better framework for building distributed systems.  Please allow me to introduce the aggregate programming stack:

Aggregate programming stack, with examples from a crowd safety application.
In computer networks, the OSI model is a "stack" of abstraction layers that separate different aspects of computer communication.  The browser you are reading this on, for example, is probably obtaining it via HTTP at the Application Layer, routed to you via TCP/IP at the Transport and Network Layers, respectively, with the last link sent to you over something like 802.11 or Ethernet handling the Data Link and Physical layers.

Our aggregate programming model takes a similarly layered approach to the problems of designing networked systems, breaking these often extremely complex problems into five layers.  From bottom to top, these are:

  1. Device: this is the collection of actual electronic devices that comprise the system, with their various built-in sensors, actuators, ability to communicate with one another, etc.
  2. Field Calculus: this layer abstracts the devices into a simple (but universal) virtual model that can freely be mapped between an "aggregate" perspective in which the whole system acts like a single unified device, and a "local" perspective of individual device interactions that implement this model.
  3. Resilient Coordination: this layer consists of "building block" algorithms (implemented in field calculus) that provide guarantees that systems will be safe, resilient, and adaptive in various ways.
  4. Developer APIs: useful patterns and combinations of building blocks are then named and collected into application programming interfaces (APIs) that are easier to think about and program with.
  5. Application: Finally, distributed systems can be much more easily constructed, using the APIs just like one would any other single-machine library.
Constructing this stack factors the problems of distributed systems development into separable components, each much simpler than trying to tackle the whole complicated mess at once.  If you just want to build applications, you just need to learn the Developer API layer and work with that, just like web programmers learn about HTML and Javascript.  If you want to work on resilient algorithms, on the other hand, you get involved with the plumbing at the Resilient Coordination layer, and if you want to use the stack on a new device, you implement the Device-layer interface that the Field Calculus layer requires.
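To make the "aggregate vs. local" idea concrete, here's a toy sketch of one classic resilient-coordination building block, a distance gradient, written in plain Python rather than field calculus or our Protelis implementation (the device names, network, and function names here are all my own invention for illustration). Each device repeatedly estimates its distance to the nearest source using only its neighbors' estimates, and the aggregate-level field "distance to source" emerges from purely local interactions:

```python
INF = float("inf")

def gradient_step(estimates, neighbors, sources):
    """One synchronous round: sources report 0, every other device takes
    the minimum over (neighbor's estimate + link distance)."""
    new = {}
    for device, nbrs in neighbors.items():
        if device in sources:
            new[device] = 0.0
        else:
            new[device] = min((estimates[n] + dist for n, dist in nbrs),
                              default=INF)
    return new

# Four devices in a line, A - B - C - D, each link 1 unit; A is the source.
neighbors = {
    "A": [("B", 1)],
    "B": [("A", 1), ("C", 1)],
    "C": [("B", 1), ("D", 1)],
    "D": [("C", 1)],
}
estimates = {d: INF for d in neighbors}
for _ in range(5):  # iterate until the field stabilizes
    estimates = gradient_step(estimates, neighbors, {"A"})

print(estimates)  # {'A': 0.0, 'B': 1.0, 'C': 2.0, 'D': 3.0}
```

The self-stabilizing character is the point: if a link breaks or the source moves, re-running the same local rule from the current state repairs the field, which is the kind of guarantee the Resilient Coordination layer packages up for the layers above.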

I'm very proud of this work, and think it's got a potential to really change the way that people deal with complex computer networks.  For the programmers amongst you, dear readers, I suggest you check out both this paper and our (still somewhat rough) implementation of field calculus in Protelis.

Saturday, September 19, 2015

A Golden Boston Sunset

Dear readers,

Some days, it's just good to be alive, and a thing comes out of nowhere unexpectedly to remind you of that fact.  Today, as my flight was gliding down into its final descent into Boston, the air was almost perfectly clear and the sun was just at that magic moment in its descent where everything begins to be golden and shadows stretch out just enough to give the third dimension of everything an extra bit of special emphasis.  As I gloried in the texture of the light, my camera came out and I snapped away---not blocking myself from an enjoyment of this sight, but finding that the aim to capture gave me extra focus and appreciation for the details.

My dear readers, I wish to share this joy with you, in the form of a few of the best moments of imagery I captured.  May this lighten your day as it has lightened mine.

Monday, September 14, 2015

A Tale of two CRISPRs

Last year, I had my first "glamour journal" publication, as second author of a Nature Methods paper on a new family of CRISPR-based synthetic regulatory devices.  Actually, I had two "glamour" publications---the other was a Nature Biotech paper on the SBOL language for communicating biological designs with 32 authors, the biggest collaborative publication I've been involved in to date.  That's a tale for another post, however---this one's all about CRISPR, CRISPR, CRISPR.

For those who haven't encountered the wonderful hype-storm around CRISPR, the acronym expands to the highly non-mellifluous "clustered regularly interspaced short palindromic repeats," which tells you virtually nothing about why it's cool.  The reason it's cool is that one of the things this awkward acronym refers to is a protein ("Cas9") that docks with fairly arbitrary "guide RNA" fragments in order to go act on DNA that matches those sequences.

Core CRISPR mechanism: Cas9 protein binds to gRNA, which targets the protein to a matching DNA sequence

Protein design is really hard, but DNA and RNA design has become reasonably straightforward, so CRISPR is an awesome mechanism: it lets us target a (fairly) predictable protein effect to pretty much any piece of DNA that we want.  People have used it for editing DNA, which has previously been done with lots of other mechanisms, but gets much easier with CRISPR (hence the recent controversies you may have seen in the news around human genetic engineering---the changes we can make aren't any different, they're just a lot cheaper, which is itself a meaningful difference of another sort).
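The targeting idea itself is simple enough to sketch as a toy program (a pure string-matching illustration, not real biology tooling---actual Cas9 targeting also involves PAM sites and tolerates some mismatches, which this deliberately ignores): a guide sequence directs an action to every matching site in a DNA string.

```python
def find_target_sites(dna, guide):
    """Return the start positions where the guide matches the DNA exactly."""
    return [i for i in range(len(dna) - len(guide) + 1)
            if dna[i:i + len(guide)] == guide]


# A made-up DNA string with two copies of the made-up guide "GCTAGC":
dna = "TTGACGGCTAGCTCAGTCCTAGGTACAGTGCTAGC"
print(find_target_sites(dna, "GCTAGC"))  # -> [6, 29]
```

Retargeting the system is just swapping the guide string---which is exactly why designing new guides is so much easier than designing new proteins.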

Our paper last year showed for the first time how to use the CRISPR mechanisms to make potentially large numbers of strong biological logic gates.  This is important because one of the big things that's been holding synthetic biology back is the difficulty of building reliable computation and control systems inside of cells.  We've known for a long time that biological computing is possible, but there's only been a handful of decent computational devices, and no good ways of making more.  Now, within the last few years, several different families have emerged, including TALE proteins, homolog mining, invertases, and now, with our paper, CRISPR repressors.  Our CRISPR repressors are nice because they can potentially easily generate thousands of high-performance devices and implement all sorts of complex computations, something that nobody currently has a clear approach for with any of the other families.

Diagram of one of our CRISPR repressors: a modified Cas9 protein (blue box) acts as "power supply" for an inverter logic gate implemented by having the gRNA (orange box) regulate a synthetic promoter (blue arrow). The important things to know are 1) the orange box and blue arrow are easy to design and we can make lots of them that don't interfere with one another, and 2) the blue box can potentially power lots of these gates at the same time.
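At the logic level, the device in the diagram can be abstracted as an inverter, and inverters composed through a shared promoter behave like a NOR gate.  Here is a toy Boolean model of that abstraction (hypothetical and purely illustrative---it ignores all the analog messiness of real gene expression):

```python
def inverter(input_present):
    """One device: gRNA present -> promoter repressed -> output off."""
    return not input_present


def nor(*inputs):
    """Several repressing gRNAs targeting one promoter: any input silences it."""
    return not any(inputs)


# NOR is functionally complete, so in principle chains of such devices can
# implement any Boolean function; e.g., an AND built from three NORs:
def and_gate(a, b):
    return nor(nor(a, a), nor(b, b))


print(and_gate(True, True), and_gate(True, False))  # -> True False
```

The "thousands of devices" claim maps onto this sketch as thousands of distinct, non-interfering gRNA/promoter pairs, each playing the role of one `inverter`.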

So I was (and still am) very excited about this publication for two reasons: first because I think it's a big step forward scientifically, and second because it's in a big-name venue that lots of people are likely to pay attention to and where it's more likely to have a big impact on scientific practice.

Just last week, I had my second paper in Nature Methods, led by my same awesome collaborator, Samira Kiani, and following on the subject: this time, our paper shows how to use CRISPR devices to both compute and edit genes in the same circuit.  My reaction to this publication, however, has been much more mixed.  Don't get me wrong: I'm really happy to be published in a high-ranked journal again, and I really enjoy working with Samira (soon to upgrade from Dr. Kiani to Professor Kiani!), whom I find an insightful and diligent collaborator and whose skills I think complement my own quite nicely.  Maybe it's just that I can't be so deliriously excited about getting published in a journal a second time?  I'm also not as excited about these results: it's a nice twist on previous results and a useful new capability, but in return we lose some of the device efficacy. Overall, though, I just don't feel like this paper is a game-changer in the way that our first paper might prove to be.

Still, it matters, and it's a step forward for all of us.  Soon, we will meet, celebrate this success with a toast and a fine dinner, and plan our next venture toward transformation of the world and toward posterity.

Sunday, September 06, 2015

Perhaps my least interesting publication ever

Just recently, I was listed as first author (out of five) on what is perhaps the least interesting scientific publication in my history as a researcher.  This includes even semi-embarrassing old rants from when I was a young and arrogant graduate student---those at least give some sort of plausibly interesting perspective on what I was thinking about at the time.  Not to say this document isn't important: I think it was definitely worth the time and effort, and is useful.  That doesn't necessarily mean anybody will derive any particular joy or pleasure from encountering it.

So, what is this deadly dull publication that I've for some strange reason decided to advertise so loudly on the Internet?  Its formal name is: BBF RFC 107: Copyright and Licensing of BBF RFCs. This takes a little bit of explanation, so bear with me and please try not to fall asleep too quickly: one of the ways that people in the synthetic biology community share their work is by posting open "Request For Comment" documents (RFCs)---essentially draft standards, following the main model used for developing the Internet.  These are cataloged by the BioBricks Foundation, hence BBF RFCs.  The first of these, BBF RFC 0 (yes, there were computer scientists involved, and we like to count starting with zero), sets out the process for how to submit a new RFC.  A few months ago, I noticed that the original handling of copyright had gotten out of date with respect to some current practices in accessing scientific documents online and current preferences for open standards development.  I raised these issues with the BBF RFC maintainers, and we figured out a legal "patch" for BBF RFC 0. The end result of all this is a 1.5-page document that makes two small changes in how new BBF RFC documents are handled:

  • The document is actually marked with a modern open copyright license, and
  • The authors share copyright with the BioBricks Foundation, rather than transferring it.

Now, unfortunately, the parts of BBF RFC 0 that we didn't replace weren't followed correctly in setting forth this RFC, which has caused some trouble with another BBF RFC that I'm involved in, but that story's even less interesting, and I'm sure it will all get sorted out eventually.

In case your eyes have well and truly glazed over, let me sum that all up more simply: I noticed a little thing about copyrighting certain scientific documents that needed tweaking.  By a quirk of process, doing so had the side effect of creating an archival scientific publication.

So, was it worth it?  Absolutely: it didn't take much time, and copyright is one of those things that it's often worth paying close attention to, because if you screw it up as a community, you can accidentally wind up poisoning all sorts of things down the line, if nasty people decide to try to take advantage of loopholes or cautious organizations get blocked from doing things by technicalities.  I'm just quite amused that this ends up in my list of publications as outwardly indistinguishable from BBF RFCs that took many people years of work and that gather lots of citations.  It's also kind of funny from a "what do scientists do all day" perspective.

But, for the love of all that you hold holy, don't read the document unless you actually need to.