[BBF Standards] Amorphous languages for bio fabs
Ralph Santos
rasantos at lbl.gov
Mon Feb 25 17:57:33 EST 2008
Hi,
Allow me to thank you as well for the articles you cited in your
original message. They explore some really fascinating ideas.
While it seems that the level at which the "Proto" language is operating
is far above the Biobrick level, I think it does speak to the utility of
thinking about defining a coherent system abstraction for Biobricks.
While it is the case that the "operating range" in section 3 of your
2006 paper is far removed from the behavior of individual biobricks,
it is clear that the semantics of "Proto" depend upon its operating
range to support specific behaviors (which Proto composes and modulates
into higher order operations) and abstracts others away. Upon further
reflection, it seems that a number of open questions regarding data
exchange standards might be clarified if we were to have a system
abstraction that can be agreed upon as the basis for biobrick data
formats, etc.
That said, it is obvious we can't approach the Biobrick system model the
same way one approached defining the operating range for "Proto". While
Proto defines its operating medium as an abstract homogeneous cloud of
tiny computers, Biobricks must define their behaviors upon the domain of
biochemistry and molecular biology. Also, while a full programming
language is often used to describe the complete behavior of an abstract
machine, the job of a Biobrick standard is to describe the composition
and behavior of individual components or assemblies, so certain things
one does in a programming language (like defining internal state and
establishing start and termination conditions for a program run) aren't
applicable. My guess is that a Biobrick system abstraction would end
up feeling a bit like a hardware description language like Verilog or
VHDL in comparison to Proto.
Of course I should make it clear that in citing Verilog or VHDL I'm not
implying that we should copy their formats, since what I'm interested in
is a system abstraction, where one is more looking at semantics than
syntax. In particular, what I mean by a system abstraction is a
simplification of molecular biology which can help shape the basic
datatypes and syntax of data formats and serialized objects to be used
in a biobrick data exchange standard.
I don't know enough to define all the details of such a system
abstraction, but a starting point seems to require some characteristics
or subproblems (subcomponents?) including the following:
* Agreeing upon a vocabulary of chemical species comprising the
inputs/outputs of Biobricks - so a Biobrick description can state the
chemical signals it uses to interact with the world
* A simplification of cellular anatomy - so a Biobrick description can
describe in simple, clearly defined terms both where the brick itself
fits into a cellular system and where its inputs and outputs are
found to interact.
Indeed, many of the terms we talk about assume these as context, though
as far as I know we've never directly and explicitly discussed the
structure of this vocabulary.
To explain the two parts more specifically, the chemical species
vocabulary might include any widely recognized vocabulary of compounds,
say KEGG LIGAND.
The cellular anatomy vocabulary is harder to explain. I see the job of
this vocabulary as capturing and describing many of the relationships that
are currently captured as biobrick type or function information. In
particular, it is intended to be a simplified vocabulary of all the
relevant points in a cell or construct where biobricks or their
inputs/outputs interact.
My current image of this is an elaboration of the Central
Dogma which includes a simplified vocabulary to describe both gene
structure and regulation plus extra terms to describe domains where
Biobrick-generated proteins are expected to interact.
One has to be able to say a biobrick exists as DNA, RNA or a protein, or
some organic compound. Plus, given the biobrick's place in the scheme
of things, one must be able to express how inputs and outputs relate to
the biobrick.
So, in the case of a promoter one must be able to refer not only to the
compound that triggers transcription but also be able to refer to
"whatever gene is downstream of myself".
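As a rough illustration only, the "whatever gene is downstream of myself"
relation could be sketched as a lookup over an ordered construct. The
names below (Part, downstream_of, and the part labels) are hypothetical
and not taken from any existing Biobrick format or registry.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Part:
    name: str  # hypothetical label, not a real registry identifier
    kind: str  # e.g. "promoter", "gene", "terminator"


def downstream_of(construct: List[Part], part: Part, kind: str) -> Optional[Part]:
    """Return the first part of the given kind located after `part`, or None."""
    start = construct.index(part) + 1
    for candidate in construct[start:]:
        if candidate.kind == kind:
            return candidate
    return None


# A toy construct, listed in upstream-to-downstream order.
promoter = Part("pExample", "promoter")
construct = [
    promoter,
    Part("geneExample", "gene"),
    Part("termExample", "terminator"),
]

# The promoter refers to its target relationally, not by name.
target = downstream_of(construct, promoter, "gene")
```

The point of the sketch is that a promoter's description never has to
name a specific gene; the relation is resolved against whatever
construct the part is placed in.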
The beginnings of such a vocabulary might include:
DNA construct terms:
* Generic terms for operon components and relations between them
(promoter, repressor, exon, terminator, plus relational terms to describe
objects up/downstream of the aforementioned components)
* Generic terms for plasmid features (cloning sites, antibiotic
resistance cassettes, etc.)
Cellular domains:
* One should be able to cite that interactions occur at the level of
DNA, RNA, protein
* One should be able to say that things happen in intra-/extra-cellular
regions
* One should be able to say whether a signal is expected to come from
within the same cell or from another cell
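One possible way to picture the vocabulary above is as a handful of
enumerated types plus a record for a brick description. This is only a
sketch under my own assumptions: every class and field name here is
hypothetical, and "compound-X" is a placeholder rather than a real
identifier from KEGG LIGAND or any other database.

```python
from dataclasses import dataclass
from enum import Enum


class MolecularLevel(Enum):
    DNA = "DNA"
    RNA = "RNA"
    PROTEIN = "protein"
    COMPOUND = "organic compound"


class Region(Enum):
    INTRACELLULAR = "intracellular"
    EXTRACELLULAR = "extracellular"


class SignalOrigin(Enum):
    SAME_CELL = "same cell"
    OTHER_CELL = "another cell"


@dataclass(frozen=True)
class Signal:
    species: str           # placeholder name for a chemical species
    level: MolecularLevel  # the level at which the interaction occurs
    region: Region         # where the signal is found
    origin: SignalOrigin   # same cell or another cell


@dataclass(frozen=True)
class BrickDescription:
    name: str                  # hypothetical label
    exists_as: MolecularLevel  # what the brick itself is
    inputs: tuple = ()         # Signal objects the brick responds to
    outputs: tuple = ()        # Signal objects the brick produces


inducer = Signal(
    "compound-X",
    MolecularLevel.COMPOUND,
    Region.EXTRACELLULAR,
    SignalOrigin.OTHER_CELL,
)
sensor = BrickDescription("exampleSensor", MolecularLevel.DNA, inputs=(inducer,))
```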
There are several ways one could imagine this concept being used.
Perhaps it can be used to ascribe an ontology to inform or describe what
is meant by the various biobrick types. Perhaps it can be used to
develop a query language to create a machine parseable way to say--for
example--"I want something that will repress a downstream gene in
response to compound X", or "I want two parts, one that suppresses
transcription in response to an extracellular signal and another part
which promotes transcription using the same signal as the first part".
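To make the query idea a little more concrete, here is a minimal
sketch of the two example requests as filters over a toy catalog. The
record fields, the catalog contents, and the function names are all
made up for illustration; a real query language would need the agreed
vocabulary first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartRecord:
    name: str           # hypothetical label
    effect: str         # "repress" or "promote" (transcription downstream)
    signal: str         # placeholder name of the triggering compound
    signal_region: str  # "intracellular" or "extracellular"


CATALOG = [
    PartRecord("partA", "repress", "compound-X", "extracellular"),
    PartRecord("partB", "promote", "compound-X", "extracellular"),
    PartRecord("partC", "repress", "compound-Y", "intracellular"),
]


def find_parts(catalog, effect=None, signal=None, signal_region=None):
    """Return the records that match every criterion that is not None."""
    wanted = {"effect": effect, "signal": signal, "signal_region": signal_region}
    return [
        rec
        for rec in catalog
        if all(v is None or getattr(rec, k) == v for k, v in wanted.items())
    ]


def find_opposed_pairs(catalog):
    """Return (repressor, promoter) pairs sharing one extracellular signal."""
    repressors = find_parts(catalog, effect="repress", signal_region="extracellular")
    return [
        (r, p)
        for r in repressors
        for p in find_parts(
            catalog, effect="promote", signal=r.signal, signal_region="extracellular"
        )
    ]


# "I want something that will repress a downstream gene in response to X"
repress_x = find_parts(CATALOG, effect="repress", signal="compound-X")

# "I want two parts ... using the same signal"
pairs = find_opposed_pairs(CATALOG)
```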
Actually, there's a lot more work to be done to flesh it out into
something useable by the Biobrick standards group. But seeing how the
Proto language builds on the problem did suggest an interesting angle
from which to consider Biobrick description.
Sorry, I sort of rambled and sashayed into a completely different
subject. In any case it did lend food for thought. Thanks again for
the article.
---ralf
Jake Beal wrote:
>> Very interesting indeed. Thanks for the links. But I am somewhat
>> confused about the proposed compiler tool chain. I'll give a more
>> thorough read over the PDF soon. I gave this general topic some more
>> thought this morning, and was unfortunately offline, but now that I am
>> back on I have decided to throw up my notes on a wiki, but since they
>> are so very informal I have placed them over at biohack instead of OWW:
>> http://biohack.sourceforge.net/wiki/index.php/Biobricks#2008-02-20
>>
>
> The key idea is the separation of the problem into loosely coupled
> layers. The amorphous medium abstraction and Proto language bridge
> the global-to-local gap, making it easier to design distributed
> algorithms for a spatially extended system like an embryo.
>
> In the end, though, the language does not "solve" the problems of
> robust distributed assembly. Instead, the global-to-local bridge
> sweeps aside much of the routine work of distributed programming, and
> imposes a discipline that prevents accidental mixing of levels. In
> our experience, this makes it easier for the programmer to grapple
> with robust assembly problems directly, and to trust that the
> solutions they find are likely to generalize well and compose nicely
> with one another.
>
> In this framework, creating an appropriate toolkit for robust
> distributed assembly is an exercise in library building, rather than
> language design. But, of course, the difference between those two is
> just a matter of perspective, especially in the eyes of a LISP hacker.
> :-)
>
>
> The real problem that we face in moving to the higher level components
> you describe, like "OrganBricks," is not how to build them, but how to
> describe what is the thing that we actually want. Take, for example,
> the "make skull" command from the example in your Amorphous Compilation
> section. What does it mean to have made a good skull?
>
> All animals use approximately the same skull-building program, and it
> is clearly not made by a CAD-style blueprint, for it changes too
> easily to accommodate other changes in the structure of the organism.
> Unless we have a standard for comparing skull-building programs and
> knowing which ones are better, we cannot debug our ideas about
> skull construction. The questions that we need to consider include:
>
> * How is the shape of the skull determined?
> * How should the location of holes in the skull be determined?
> * How should thickness relate to head size?
> * What sort of flaws are OK? What sort of flaws are bad?
> * How should the skull change as the organism grows?
> * How should the skull respond to flaws in constructing nearby parts?
>
> The answers to these questions will tell us much of what we need to
> know about how to write a spatial program that builds a skull. The
> advantage of a language like Proto is that we can expect a high
> percentage of code to be derived directly from the answers to these
> questions.
>
> Thanks,
> -Jake
>
> _______________________________________________
> Standards mailing list
> Standards at biobricks.org
> http://biobricks.org/mailman/listinfo/standards_biobricks.org
>