[BBF Standards] Amorphous languages for bio fabs

Ralph Santos rasantos at lbl.gov
Mon Feb 25 17:57:33 EST 2008


Hi,

Allow me to thank you as well for the articles you cited in your 
original message.  They explore some really fascinating ideas.

While it seems that the level at which the "Proto" language is operating 
is far above the Biobrick level, I think it does speak to the utility of 
thinking about defining a coherent system abstraction for Biobricks.  
While it is the case that the "operating range" in section 3 of your 
2006 paper is far removed from the behavior of individual biobricks, 
it is clear that the semantics of "Proto" depend upon its operating 
range to support specific behaviors (which Proto composes and modulates 
into higher order operations) and abstracts others away.  Upon further 
reflection, it seems that a number of open questions regarding data 
exchange standards might be clarified if we were to have a system 
abstraction that can be agreed upon as the basis for biobrick data 
formats, etc.

That said, it is obvious we can't approach the Biobrick system model the 
same way one approached defining the operating range for "Proto".  While 
Proto defines its operating medium as an abstract homogeneous cloud of 
tiny computers, Biobricks must define their behaviors upon the domain of 
biochemistry and molecular biology.  Also, while a full programming 
language is often used to describe the complete behavior of an abstract 
machine, the job of a Biobrick standard is to describe the composition 
and behavior of individual components or assemblies, so certain things 
one does in a programming language (like defining internal state and 
establishing start and termination conditions for a program run) aren't 
applicable.  My guess is that a Biobrick system abstraction would end 
up feeling a bit like a hardware description language such as Verilog or 
VHDL in comparison to Proto.

Of course I should make it clear that in citing Verilog or VHDL I'm not 
implying that we should copy their formats, since what I'm interested in 
is a system abstraction, where one is more looking at semantics than 
syntax.  In particular, what I mean by a system abstraction is a 
simplification of molecular biology which can help shape the basic 
datatypes and syntax of data formats and serialized objects to be used 
in a biobrick data exchange standard.

I don't know enough to define all the details of such a system 
abstraction, but a starting point seems to require some characteristics 
or subproblems (subcomponents?) including the following:

* Agreeing upon a vocabulary of chemical species comprising the 
inputs/outputs of Biobricks - so a Biobrick description can state the 
chemical signals it uses to interact with the world

* A simplification of cellular anatomy - so a Biobrick description can 
describe in simple, clearly defined terms both where the brick itself 
fits into a cellular system and where its inputs and outputs are 
found to interact.

Indeed, many of the terms we talk about assume these as context, though 
as far as I know we've never directly and explicitly discussed the 
structure of this vocabulary.

To explain the two parts more specifically, the chemical species 
vocabulary might include any widely recognized vocabulary of compounds, 
say KEGG LIGAND.

The cellular anatomy vocabulary is harder to explain.  I see the job of 
this vocabulary as capturing and describing many of the relationships 
that are currently captured as biobrick type or function information.  In 
particular, it is intended to be a simplified vocabulary of all the 
relevant points in a cell or construct where biobricks or their 
inputs/outputs interact.

My current image of this is an elaboration of the Central 
Dogma which includes a simplified vocabulary to describe both gene 
structure and regulation plus extra terms to describe domains where 
Biobrick-generated proteins are expected to interact.

One has to be able to say a biobrick exists as DNA, RNA or a protein, or 
some organic compound.  Plus, given the biobrick's place in the scheme 
of things, one must be able to express how inputs and outputs relate to 
the biobrick.

So, in the case of a promoter one must be able to refer not only to the 
compound that triggers transcription but also be able to refer to 
"whatever gene is downstream of myself".
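
As a rough sketch of what such a relational reference could look like 
in code (Python here; the names Part and downstream_gene, and the part 
names, are made up for illustration and are not from any existing 
Biobrick format), one could treat a construct as an ordered list of 
parts and resolve "whatever gene is downstream of myself" by position:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Part:
    name: str  # hypothetical part name, not a real registry ID
    role: str  # e.g. "promoter", "rbs", "gene", "terminator"


def downstream_gene(construct: List[Part], promoter: Part) -> Optional[Part]:
    """Return the first gene after `promoter`, or None if a terminator comes first."""
    start = construct.index(promoter) + 1
    for part in construct[start:]:
        if part.role == "gene":
            return part
        if part.role == "terminator":
            return None
    return None


p_x = Part("pX", "promoter")
construct = [
    p_x,
    Part("rbs1", "rbs"),
    Part("reporter", "gene"),
    Part("term1", "terminator"),
]

target = downstream_gene(construct, p_x)
print(target.name)  # reporter
```

The point is only that the promoter's description never names a 
specific gene; the reference is resolved from the part's position in 
whatever construct it ends up in.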

The beginnings of such a vocabulary might include:

DNA construct terms:
* Generic terms for operon components and relations between them 
(promoter, repressor, exon, terminator, plus relational terms to describe 
objects up/downstream of the aforementioned components)
* Generic terms for plasmid features (cloning sites, antibiotic 
resistance cassettes, etc.)

Cellular domains:
* One should be able to cite that interactions occur at the level of 
DNA, RNA, protein
* One should be able to say that things happen in intra-/extra-cellular 
regions
* One should be able to say whether a signal is expected to come from 
within the same cell or from another cell
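
To make the "cellular domains" list a bit more concrete, here is a 
minimal sketch of how those three distinctions might be written down as 
closed vocabularies.  This is Python, and every name in it (Level, 
Region, Origin, Interaction) is an assumption of mine for illustration, 
not a proposed standard:

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    DNA = "DNA"
    RNA = "RNA"
    PROTEIN = "protein"


class Region(Enum):
    INTRACELLULAR = "intracellular"
    EXTRACELLULAR = "extracellular"


class Origin(Enum):
    SAME_CELL = "same cell"
    OTHER_CELL = "other cell"


@dataclass(frozen=True)
class Interaction:
    species: str  # placeholder name; could be an ID from a compound vocabulary
    level: Level
    region: Region
    origin: Origin

    def describe(self) -> str:
        return (f"{self.species}: acts at the {self.level.value} level, "
                f"{self.region.value}, signal from {self.origin.value}")


signal = Interaction("compound X", Level.DNA, Region.INTRACELLULAR, Origin.OTHER_CELL)
print(signal.describe())
# compound X: acts at the DNA level, intracellular, signal from other cell
```

Using closed enumerations rather than free text is the design choice 
being illustrated: a description either uses an agreed term or fails 
to parse.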

There are several ways one could imagine this concept being used.  
Perhaps it can be used to ascribe an ontology to inform or describe what 
is meant by the various biobrick types.  Perhaps it can be used to 
develop a query language to create a machine parseable way to say--for 
example--"I want something that will repress a downstream gene in 
response to compound X", or "I want two parts, one that suppresses 
transcription in response to an extracellular signal and another part 
that promotes transcription using the same signal as the first part".
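
A toy version of those two queries might look like the following.  It 
is only a sketch in Python under my own assumptions: the record fields, 
the catalog entries, and the find helper are all hypothetical, and a 
real query language would need far richer semantics:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PartDescription:
    name: str           # hypothetical names, not real registry entries
    effect: str         # "represses" or "promotes"
    target: str         # relational target, e.g. "downstream gene"
    signal: str         # species that triggers the effect
    signal_region: str  # "intracellular" or "extracellular"


CATALOG = [
    PartDescription("partA", "represses", "downstream gene", "compound X", "extracellular"),
    PartDescription("partB", "promotes", "downstream gene", "compound X", "extracellular"),
    PartDescription("partC", "represses", "downstream gene", "compound Y", "intracellular"),
]


def find(catalog: List[PartDescription], **criteria: str) -> List[PartDescription]:
    """Return the parts whose fields equal every given criterion."""
    return [p for p in catalog
            if all(getattr(p, field) == value for field, value in criteria.items())]


# "something that will repress a downstream gene in response to compound X"
repressors = find(CATALOG, effect="represses", target="downstream gene",
                  signal="compound X")

# "two parts, one that suppresses and one that promotes transcription,
#  both responding to the same extracellular signal"
pairs = [(r, p)
         for r in find(CATALOG, effect="represses", signal_region="extracellular")
         for p in find(CATALOG, effect="promotes", signal_region="extracellular")
         if r.signal == p.signal]

print([p.name for p in repressors])          # ['partA']
print([(r.name, p.name) for r, p in pairs])  # [('partA', 'partB')]
```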

Actually, there's a lot more work to be done to flesh it out into 
something useable by the Biobrick standards group.  But seeing how the 
Proto language builds on the problem did suggest an interesting angle 
from which to consider Biobrick description.

Sorry, I sort of rambled and sashayed into a completely different 
subject.  In any case it did lend food for thought.  Thanks again for 
the article.

---ralf

Jake Beal wrote:
>> Very interesting indeed. Thanks for the links. But I am somewhat 
>> confused about the proposed compiler tool chain. I'll give a more 
>> thorough read over the PDF soon. I gave this general topic some more 
>> thought this morning, and was unfortunately offline, but now that I am 
>> back on I have decided to throw up my notes on a wiki, but since they 
>> are so very informal I have placed them over at biohack instead of OWW:
>> http://biohack.sourceforge.net/wiki/index.php/Biobricks#2008-02-20
>>     
>
> The key idea is the separation of the problem into loosely coupled
> layers.  The amorphous medium abstraction and Proto language bridge
> the global-to-local gap, making it easier to design distributed
> algorithms for a spatially extended system like an embryo.
>
> In the end, though, the language does not "solve" the problems of
> robust distributed assembly.  Instead, the global-to-local bridge
> sweeps aside much of the routine work of distributed programming, and
> imposes a discipline that prevents accidental mixing of levels.  In
> our experience, this makes it easier for the programmer to grapple
> with robust assembly problems directly, and to trust that the
> solutions they find are likely to generalize well and compose nicely
> with one another.
>
> In this framework, creating an appropriate toolkit for robust
> distributed assembly is an exercise in library building, rather than
> language design.  But, of course, the difference between those two is
> just a matter of perspective, especially in the eyes of a LISP hacker.
> :-)
>
>
> The real problem that we face in moving to the higher level components
> you describe, like "OrganBricks," is not how to build them, but how to
> describe what is the thing that we actually want.  Take, for example,
> the "make skull" command from the example in your Amorphous Compilation
> section.  What does it mean to have made a good skull?  
>
> All animals use approximately the same skull-building program, and it
> is clearly not made by a CAD-style blueprint, for it changes too
> easily to accommodate other changes in the structure of the organism.
> Unless we have a standard for comparing skull-building programs and
> knowing which ones are better, we cannot debug our ideas about
> skull construction.  The questions that we need to consider include:
>
> * How is the shape of the skull determined?
> * How should the location of holes in the skull be determined?
> * How should thickness relate to head size?
> * What sort of flaws are OK?  What sort of flaws are bad?
> * How should the skull change as the organism grows?
> * How should the skull respond to flaws in constructing nearby parts?
>
> The answers to these questions will tell us much of what we need to
> know about how to write a spatial program that builds a skull.  The
> advantage of a language like Proto is that we can expect a high
> percentage of code to be derived directly from the answers to these
> questions.
>
> Thanks,
> -Jake
>