[BBF Standards] data exchange issue 1: Abstraction

Ralph Santos rasantos at lbl.gov
Tue Feb 26 11:54:48 EST 2008


Ah, perfect opportunity! (insert grin and sound of rubbing hands 
together here)

A quick text search yields BBa_T4001, BBa_I10099 and BBa_J56012, among 
others.

One thing that's not clear to me: I don't see anything that says any of
these parts includes the full "FimS" invertible sequence, and I can't
tell at the moment whether BBa_J56012 is FimS itself or some piece or
modification of it.

However, BBa_J56012 is sufficient to illustrate the problem.  Here are
some questions:

* Let's say we perform an inversion on BBa_J56012.  Is it still 
BBa_J56012?  Is it still even a biobrick?
* Let's say someone wishes to use BBa_J56012, but wants to use it in its
inverted form.  Do they still refer to it as BBa_J56012?
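
To make the two questions concrete, here is a minimal Python sketch of
the "every sequence gets its own accession ID, one of them is primary"
scheme from the earlier message quoted below, applied to an inversion.
Everything in it is an assumption made for illustration: the names
(Part, invert_segment, add_state), the SEQ-0001 style IDs, and the toy
sequence, which is NOT the real sequence of BBa_J56012.

```python
from dataclasses import dataclass, field

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def invert_segment(seq: str, start: int, end: int) -> str:
    """Return seq with seq[start:end] flipped in place (reverse complement)."""
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


@dataclass
class Part:
    """Hypothetical part record: one part ID, several sequence IDs."""
    part_id: str
    primary_id: str                                 # sequence returned by default
    sequences: dict = field(default_factory=dict)   # accession ID -> sequence

    def sequence(self) -> str:
        """Standard query: pretend there is only one sequence."""
        return self.sequences[self.primary_id]

    def all_sequences(self) -> dict:
        """Specialized call: expose every sequence state of this part."""
        return dict(self.sequences)

    def add_state(self, seq_id: str, seq: str) -> None:
        """Register another sequence state under its own accession ID."""
        self.sequences[seq_id] = seq


# Toy sequence, made up for this example; the invertible element is the
# middle seven bases (positions 4 to 10).
toy = "AAAC" + "GATTACA" + "GTTT"
part = Part("BBa_J56012", "SEQ-0001", {"SEQ-0001": toy})

flipped = invert_segment(toy, 4, 11)
part.add_state("SEQ-0002", flipped)

print(part.sequence())               # the primary (uninverted) sequence
print(sorted(part.all_sequences()))  # both accession IDs
```

Under these assumptions the inverted form is a different string with
its own sequence ID, yet it stays attached to the same part ID.  Whether
that is the right answer to "is it still BBa_J56012?" is exactly the
open question; the sketch only shows that the data model can stay
neutral about it.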

---ralf


Drew Endy wrote:
> A BioBrick part is any basic biological function that can be encoded  
> as DNA.
>
> More specifically, I believe that there are already parts in the  
> Registry encoding components from the Fim system, as well as sites  
> recognized and acted on by invertases.
>
> I'll try to dig up the part numbers once I get back from teaching class.
>
> Expect more emails from me today about Saturday's workshop too!
>
>
>
> On Feb 26, 2008, at 10:59 AM, Deepak Chandran wrote:
>
>   
>> Concerning systems such as Fim, I don't think that they are individual
>> parts. Rather, they are a composition of parts (i.e. "devices"). Hence,
>> the whole Fim system should not be stored as a single part. Other
>> systems, such as the Sin operon, also have multiple coding regions with
>> promoters in between them -- I don't think that they should be
>> considered "parts".
>>
>> Ralph Santos wrote:
>>     
>>> Sorry, I should have started by answering your
>>> direct question rather than trying to rethink everything.
>>> Regardless of what you think of alternative strategies, the current  
>>> representation should be explored fully, and pretty much any data  
>>> exchange system regarding biobricks has to deal with sequences at  
>>> some point.
>>>
>>> Considering the unresolved questions:
>>>
>>>       
>>>> From the way I read your points, it seems that whether points (A)  
>>>> and (B) are related depends upon how one defines the relationship  
>>>> between a biobrick and its sequence.
>>>>         
>>> With part (A), it occurs to me that there are several ways to deal
>>> with the unresolved question.  One can directly attack it and  
>>> resolve it here and now, or one can find a way to allow the process  
>>> to move forward without resolving it.
>>>
>>> As you rightly point out, defining a biobrick in terms of its  
>>> sequence raises a bunch of issues, but is it absolutely necessary  
>>> to define a biobrick in terms of its sequence?  If we define the  
>>> standard to demand this assertion, then the answer is yes, but one  
>>> can define the data exchange process so that one can dance around  
>>> the issue.
>>>
>>> One simple way to dance around the issue is to stipulate that  
>>> biobrick data exchange systems must assign every sequence its own  
>>> accession ID so that it can be identified independently from the  
>>> biobrick with which it's associated.  For the sake of simplicity,  
>>> we can bury this ID so that you don't have to deal with it if you  
>>> don't want to, while still providing some means to get these  
>>> sequence IDs if you want them.
>>>
>>> In this scheme, the rules for managing biobrick sequences would be  
>>> governed by the principle that you should be able to pretend  
>>> there's only one sequence associated with a biobrick.  So no matter  
>>> how many sequences are associated with a biobrick, only one will be  
>>> regarded as primary, to be returned with standard queries.  Digging  
>>> into the other sequences should require some sort of specialized  
>>> call specifically defined to retrieve these other sequences.  Of  
>>> course, deleting a biobrick should imply that all of the sequences  
>>> associated with it are deleted.
>>>
>>> I guess this solution most closely resembles what you refer to as a  
>>> 'biobrick implementation' layer underneath.  I like this way of  
>>> handling it because it allows folks who wish to assume a simple one- 
>>> to-one relationship between bricks and sequences to think that way,  
>>> but it allows the distinction to be monitored and dealt with behind  
>>> the scenes.
>>>
>>> Having said all that, there is a situation worth thinking about,  
>>> because it favors the family concept and it forces one to think  
>>> about what the sequence means.  There are systems like the "fim"  
>>> system in E. coli:
>>>
>>>  http://jb.asm.org/cgi/content/full/182/10/2953
>>>
>>> Basically it activates transcription for one of two genes depending  
>>> upon its orientation.  If one thinks of a sequence as defining a  
>>> molecule, something like the fim system causes some headaches.  Two  
>>> people could put in what amounts to the same molecule in two  
>>> different orientations and the system would not be able to  
>>> recognize it as such.  Of course if one imagines biobricks being  
>>> made incorporating multiple fim elements one can see the number of  
>>> possible sequences for the same brick growing combinatorially.  It  
>>> may be rare enough that the issue may be safely ignored (I don't  
>>> know myself).  One may wish to use the brick family idea to try and  
>>> capture all the possible sequence states or configurations.
>>>
>>> If in part (B) by "format" you're referring to the actual file  
>>> format (i.e. FASTA vs. GENBANK vs. EMBL etc.) I'd prefer to see us  
>>> deal with that as a matter of presentation to the user, and not  
>>> consider sequence formatting as part of the definition of a  
>>> sequence (or biobrick for that matter).  As far as tools go, it
>>> would simplify things if we focused on tools doing most of their  
>>> processing in a single canonical format, and stipulate that there  
>>> will be conversion tools to preprocess inputs into that canonical  
>>> format and postprocess them however the user wishes to see them.   
>>> That would also fit well with the IETF's principle of being liberal  
>>> with what one accepts for input and conservative with what one
>>> emits for output.
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Raik Gruenberg <raik.gruenberg at crg.es>
>>> Date: Monday, February 25, 2008 3:18 pm
>>> Subject: [BBF Standards] data exchange issue 1: Abstraction
>>> To: standards at biobricks.org
>>>
>>>
>>>       
>>>> Hi all,
>>>>
>>>> your 1st of March meeting is approaching and I would like to focus the
>>>> discussion back to more mundane data exchange problems -- let's try to
>>>> have some draft ready which is covering:
>>>>
>>>> (1) Aim / Application scenarios for this standard
>>>> (2) What is a Biobrick?
>>>> (3) Datamodel 1 -- minimal Biobrick information
>>>>
>>>> This should be doable in the remaining time (just think: "yes we
>>>> can!" :-)
>>>>
>>>> Let's start with (2). The current state of debate is as always here:
>>>> http://openwetware.org/wiki/The_BioBricks_Foundation:Standards/Technical/Exchange#What_is_a_Biobrick.3F
>>>>
>>>> We have two (related) unresolved questions in this section:
>>>>
>>>> (A) If a Biobrick is defined by its unique sequence, already a minor
>>>> variation (e.g. silent mutation) of this sequence creates an entirely
>>>> new biobrick. We may want to create the concept of Biobrick-families
>>>> on top of the Biobrick or one could introduce a 'biobrick
>>>> implementation' layer underneath.
>>>>
>>>> (B) Even worse, what happens if I put the exactly same sequence into a
>>>> different format? Is that a new Biobrick? -- Only then can we include
>>>> a "Format" record in the minimal data model. For the experimentalist
>>>> this information is essential.
>>>>
>>>> Moreover, I think we need a second definition of "Device" (built from
>>>> biobricks). Some of the discussions we had here may overlook that a
>>>> single Biobrick will often *not* encapsulate a self-standing function
>>>> but will often depend on additional biobricks that are not fused to it
>>>> on the DNA level and may even be distributed over different cells
>>>> (e.g. the quorum sensing device). I think we should treat this with a
>>>> separate "Device" concept. Some Biobricks may turn out to be
>>>> self-sufficient devices, most others not. Some functional
>>>> characterizations (Pops-in/out) only make sense for full devices, but
>>>> not for every single Biobrick.
>>>>
>>>> Comments?
>>>> Greetings,
>>>> Raik
>>>>
>>>> -- 
>>>> ________________________________
>>>>
>>>> Dr. Raik Gruenberg
>>>> http://www.raiks.de/contact.html
>>>> ________________________________
>>>>
>>>> _______________________________________________
>>>> Standards mailing list
>>>> Standards at biobricks.org
>>>> http://biobricks.org/mailman/listinfo/standards_biobricks.org
>>>>
>>>>
>>>>         



