Wednesday, September 30, 2015

Aggregate Programming!

It's finally out: our IEEE Computer article, "Aggregate Programming for the Internet of Things" (free preprint) is up and online where all can get it for free.  Don't be fooled by the name: this isn't really just about our increasingly networked possessions ("the Internet of Things"); it's a much more general paper.  In fact, this is the first place we've really put all the pieces of our last few years' distributed systems research together into a generally accessible article that clearly introduces a better framework for building distributed systems.  Please allow me to introduce the aggregate programming stack:

Aggregate programming stack, with examples from a crowd safety application.
In computer networks, the OSI model is a "stack" of abstraction layers that separate different aspects of computer communication.  The browser you are reading this on, for example, probably obtained this page via HTTP at the Application Layer, routed to you via TCP/IP at the Transport and Network Layers, respectively, with the last link sent to you over something like 802.11 or Ethernet handling the Data Link and Physical layers.

Our aggregate programming model takes a similarly layered approach to the problems of designing networked systems, breaking these often extremely complex problems into five layers.  From bottom to top, these are:

  1. Device: this is the collection of actual electronic devices that comprise the system, with their various built-in sensors, actuators, ability to communicate with one another, etc.
  2. Field Calculus: this layer abstracts the devices into a simple (but universal) virtual model that can freely be mapped between an "aggregate" perspective in which the whole system acts like a single unified device, and a "local" perspective of individual device interactions that implement this model.
  3. Resilient Coordination: this layer consists of "building block" algorithms (implemented in field calculus) that provide guarantees that systems will be safe, resilient, and adaptive in various ways.
  4. Developer APIs: useful patterns and combinations of building blocks are then named and collected into application programming interfaces (APIs) that are easier to think about and program with.
  5. Application: finally, distributed systems can be constructed much more easily, using the APIs just as one would any other single-machine library.
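To give a flavor of what the Field Calculus and Resilient Coordination layers look like in practice, here is a minimal Python sketch (my own illustration for this post, not the Protelis implementation from the paper; all names are invented) of the classic "gradient" building block: every device repeatedly estimates its distance to a designated source by taking the minimum over its neighbors' estimates plus the link distance.

```python
# Illustrative sketch only: a "gradient" building block simulated in Python.
# Each device runs the same local rule over repeated rounds, yet the
# aggregate result is a distance field across the whole network.

INF = float("inf")

def gradient(neighbors, distances, sources, rounds=100):
    """neighbors[i] = list of neighbor ids; distances[(i, j)] = link length."""
    est = {i: (0.0 if i in sources else INF) for i in neighbors}
    for _ in range(rounds):  # repeated local rounds; estimates spread outward
        new = {}
        for i in neighbors:
            if i in sources:
                new[i] = 0.0
            else:
                # take the best neighbor estimate plus the cost of that link
                new[i] = min((est[j] + distances[(i, j)] for j in neighbors[i]),
                             default=INF)
        est = new
    return est

# Four devices in a line: 0 -- 1 -- 2 -- 3, unit-length links, source at 0.
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
dist = {(i, j): 1.0 for i in nbrs for j in nbrs[i]}
print(gradient(nbrs, dist, sources={0}))  # {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}
```

The resilience property comes for free from the repeated local rounds: if a device or link disappears, simply continuing to run the same rule lets the estimates heal around the failure, which is the kind of guarantee the Resilient Coordination layer is about.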
Constructing this stack factors the problems of distributed systems development into separable components, each much simpler than trying to tackle the whole complicated mess at once.  If you want to build applications, you need only learn the Developer API layer and work with that, just as web programmers learn HTML and JavaScript.  If you want to work on resilient algorithms, on the other hand, you get involved with the plumbing at the Resilient Coordination layer, and if you want to use the stack on a new device, you implement the interface that the Field Calculus layer requires of a Device.

I'm very proud of this work, and I think it has the potential to really change the way that people deal with complex computer networks.  For the programmers amongst you, dear readers, I suggest you check out both this paper and our (still somewhat rough) implementation of field calculus in Protelis.

Saturday, September 19, 2015

A Golden Boston Sunset

Dear readers,

Some days, it's just good to be alive, and a thing comes out of nowhere unexpectedly to remind you of that fact.  Today, as my flight was gliding down into its final descent into Boston, the air was almost perfectly clear and the sun was just at that magic point in its arc where everything begins to turn golden and shadows stretch out just enough to give the third dimension of everything an extra bit of special emphasis.  As I gloried in the texture of the light, my camera came out and I snapped away---not blocking myself from an enjoyment of this sight, but finding that the aim to capture gave me extra focus and appreciation for the details.

My dear readers, I wish to share this joy with you, in the form of a few of the best moments of imagery I captured.  May this lighten your day as it has lightened mine.





Monday, September 14, 2015

A Tale of two CRISPRs

Last year, I had my first "glamour journal" publication, as second author of a Nature Methods paper on a new family of CRISPR-based synthetic regulatory devices.  Actually, I had two "glamour" publications---the other was a Nature Biotech paper on the SBOL language for communicating biological designs with 32 authors, the biggest collaborative publication I've been involved in to date.  That's a tale for another post, however---this one's all about CRISPR, CRISPR, CRISPR.

For those who haven't encountered the wonderful hype-storm around CRISPR, the acronym expands to the highly non-mellifluous "clustered regularly interspaced short palindromic repeats," which tells you virtually nothing about why it's cool.  The reason it's cool is that the system this awkward acronym refers to includes a protein ("Cas9") that docks with fairly arbitrary "guide RNA" fragments and then goes and acts on DNA that matches those sequences.

Core CRISPR mechanism: Cas9 protein binds to gRNA, which targets the protein to a matching DNA sequence

Protein design is really hard, but DNA and RNA design has become reasonably straightforward, so CRISPR is an awesome mechanism: it lets us target a (fairly) predictable protein effect to pretty much any piece of DNA that we want.  People have used it for editing DNA, which has previously been done with lots of other mechanisms, but gets much easier with CRISPR (hence the recent controversies you may have seen in the news around human genetic engineering---the changes we can do aren't any different, they're just a lot cheaper, which is a meaningful difference of a different sort).
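To make "target a protein effect to pretty much any piece of DNA" concrete, here's a toy Python sketch of guide targeting as string matching.  This is my own oversimplified illustration, not anything from the paper: real guide design also searches the reverse strand, tolerates some mismatches, and worries hard about off-target sites.  The one real detail I've kept is that the common Cas9 (from S. pyogenes) requires an "NGG" PAM sequence immediately after the matched site.

```python
# Toy illustration of CRISPR targeting as string matching: find positions in
# a DNA string where a guide sequence matches and is followed by an "NGG"
# PAM. Purely a sketch of the idea, not a real guide-design tool.

def find_target_sites(dna, guide):
    sites = []
    # leave room for the guide-length match plus a 3-letter PAM
    for i in range(len(dna) - len(guide) - 2):
        protospacer = dna[i:i + len(guide)]
        pam = dna[i + len(guide):i + len(guide) + 3]
        if protospacer == guide and pam[1:] == "GG":  # PAM = NGG (N = any base)
            sites.append(i)
    return sites

dna = "TTACGTACGTACGTACGTACGTAGGCCAA"
guide = "ACGTACGTACGTACGTACGT"
print(find_target_sites(dna, guide))  # [2]: one match, with "AGG" PAM after it
```

The point of the sketch is the asymmetry in the paragraph above: designing the protein is hard, but designing this 20-letter string is trivial, and that's what makes the mechanism so retargetable.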

Our paper last year showed for the first time how to use the CRISPR mechanisms to make potentially large numbers of strong biological logic gates.  This is important because one of the big things that's been holding synthetic biology back is the difficulty in building reliable computation and control systems inside of cells.  We've known for a long time that biological computing is possible, but there's only been a handful of decent computational devices, and no good ways of making more.  Now, within the last few years, several different families have emerged, including TALE proteins, homolog mining, invertases, and now, with our paper, CRISPR repressors.  Our CRISPR repressors are nice because they can potentially easily generate thousands of high-performance devices and implement all sorts of complex computations, something that nobody currently has a clear approach for with any of the other families.

Diagram of one of our CRISPR repressors: a modified Cas9 protein (blue box) acts as "power supply" for an inverter logic gate implemented by having the gRNA (orange box) regulate a synthetic promoter (blue arrow). The important things to know are 1) the orange box and blue arrow are easy to design and we can make lots of them that don't interfere with one another, and 2) the blue box can potentially power lots of these gates at the same time.
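For the modeling-inclined, here's a sketch of how such a repressor behaves as a logic gate, using a textbook Hill-repression curve.  Caveat loudly: the parameters below are illustrative round numbers of my own choosing, not the measured values from our paper.  One guide repressing a promoter makes an inverter (NOT gate); two guides repressing the same promoter approximate a NOR gate, and NOR is universal for Boolean logic.

```python
# Hedged sketch: a textbook Hill-repression model of a CRISPRi inverter.
# Parameters (k, n, max_out) are illustrative, not measured values.

def inverter(grna, k=0.5, n=2.0, max_out=1.0):
    """Output promoter activity falls as gRNA concentration rises."""
    return max_out / (1.0 + (grna / k) ** n)

def nor_gate(grna_a, grna_b, **kw):
    """Two guides repressing the same promoter approximate a NOR gate."""
    return inverter(grna_a + grna_b, **kw)

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, round(nor_gate(a, b), 3))
# output is high (1.0) only when both inputs are low -- the NOR truth table
```

The "power supply" framing from the figure caption is why this composes: because one modified Cas9 can serve many orthogonal guide/promoter pairs, gates like this can, in principle, be wired into larger circuits without designing a new protein each time.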

So I was (and still am) very excited about this publication for two reasons: first because I think it's a big step forward scientifically, and second because it's in a big-name venue that lots of people are likely to pay attention to and where it's more likely to have a big impact on scientific practice.

Just last week, I had my second paper in Nature Methods, led by my same awesome collaborator, Samira Kiani, and following up on the same subject: this time, our paper shows how to use CRISPR devices to both compute and edit genes in the same circuit.  My reaction to this publication, however, has been much more mixed.  Don't get me wrong: I'm really happy to be published in a high-ranked journal again, and I really enjoy working with Samira (soon to upgrade from Dr. Kiani to Professor Kiani!), whom I find an insightful and diligent collaborator and whose skills I think complement my own quite nicely.  Maybe it's just that I can't be so deliriously excited about getting published in a journal a second time?  I'm also not as excited about these results: it's a nice twist on previous results and a useful new capability, but in return we lose some of the device efficacy.  Overall, though, I just don't feel like this paper is a game-changer in the way that our first paper might prove to be.

Still, it matters, and it's a step forward for all of us.  Soon, we will meet, celebrate this success with a toast and a fine dinner, and plan our next venture toward transformation of the world and toward posterity.

Sunday, September 06, 2015

Perhaps my least interesting publication ever

Just recently, I was listed as first author (out of five) on what is perhaps the least interesting scientific publication in my history as a researcher.  This includes even semi-embarrassing old rants from when I was a young and arrogant graduate student---those at least give some sort of plausibly interesting perspective on what I was thinking about at the time.  Not to say this document isn't important: I think it was definitely worth the time and effort, and is useful.  That doesn't necessarily mean anybody will derive any particular joy or pleasure from encountering it.

So, what is this deadly dull publication that I've for some strange reason decided to advertise so loudly on the Internet?  Its formal name is: BBF RFC 107: Copyright and Licensing of BBF RFCs.  This takes a little bit of explanation, so bear with me and please try not to fall asleep too quickly: one of the ways that people in the synthetic biology community share their work is by posting open "Request For Comment" documents (RFCs)---essentially draft standards, following the main model used for developing the Internet.  These are cataloged by the BioBricks Foundation, hence BBF RFCs.  The first of these, BBF RFC 0 (yes, there were computer scientists involved, and we like to count starting with zero), sets out the process for how to submit a new RFC.  A few months ago, I noticed that the original handling of copyright had gotten out of date with respect to some current practices in accessing scientific documents online and current preferences for open standards development.  I raised these issues with the BBF RFC maintainers, and we figured out a legal "patch" for BBF RFC 0.  The end result of all this is a 1.5-page document that makes two small changes in how new BBF RFC documents are handled:

  • The document is actually marked with a modern open copyright license, and
  • The authors share copyright with the BioBricks Foundation, rather than transferring it.

Now, unfortunately, the parts of BBF RFC 0 that we didn't replace weren't followed correctly in setting forth this RFC, which has caused some trouble with another BBF RFC that I'm involved in, but that story's even less interesting, and I'm sure it will all get sorted out eventually.

In case your eyes have well and truly glazed over, let me sum that all up more simply: I noticed a little thing about copyrighting certain scientific documents that needed tweaking.  By a quirk of process, doing so had the side effect of creating an archival scientific publication.

So, was it worth it?  Absolutely: it didn't take much time, and copyright is one of those things that it's often worth paying close attention to, because if you screw it up as a community, you can accidentally wind up poisoning all sorts of things down the line, if nasty people decide to try to take advantage of loopholes or cautious organizations get blocked from doing things by technicalities.  I'm just quite amused that this ends up in my list of publications as outwardly indistinguishable from BBF RFCs that took many people years of work and that gather lots of citations.  It's also kind of funny from a "what do scientists do all day" perspective.

But, for the love of all that you hold holy, don't read the document unless you actually need to.