ACEDB User Group Newsletter - April 2004

If you want to have this newsletter mailed to you or you want to make comments/suggestions about the format/content then send an email to acedb@sanger.ac.uk.


This month sees the latest AceDB builds now available on the Mac, a change to AceDB version number format, improvements to the Tag Chooser, improvements to DNA exporting and improvements/bug fixes for the AceC interface.


New Features

Latest AceDB builds on Mac OS X

(This work is courtesy of Rob Clack rnc@sanger.ac.uk)

After a great deal of effort fiddling with compiler options, the OS X environment and Fink, the latest AceDB builds are available on the Mac. Go to the downloads page to get hold of the new binaries.

AceDB Version Numbering

Conventionally, AceDB versions have been given as:

        <version>_<release><update>    e.g.  "4_9t"

We have finally run out of update letters for 4_9 so we have switched to the now recognised convention of:

        <version>.<release>.<update>   e.g.  "4.9.27"

You should take note of this if you have code that in any way parses the AceDB version number and uses it.
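
As an illustration of what such parsing code might need to handle, here is a minimal Python sketch that accepts both formats. The function name parse_acedb_version is hypothetical, and the mapping of old-style update letters to numbers (a=1 ... z=26) is an assumption made only so that the two styles can be compared; it is not taken from the AceDB sources.

```python
import re

# Old style: <version>_<release><update letter>, e.g. "4_9t"
OLD_STYLE = re.compile(r"(\d+)_(\d+)([a-z]?)")
# New style: <version>.<release>.<update>, e.g. "4.9.27"
NEW_STYLE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_acedb_version(text):
    """Return (version, release, update) as a tuple of integers.

    ASSUMPTION: an old-style update letter is mapped a=1 ... z=26 and a
    missing letter to 0, purely so old and new styles sort together.
    """
    text = text.strip()
    match = NEW_STYLE.fullmatch(text)
    if match:
        return tuple(int(group) for group in match.groups())
    match = OLD_STYLE.fullmatch(text)
    if match:
        letter = match.group(3)
        update = ord(letter) - ord("a") + 1 if letter else 0
        return int(match.group(1)), int(match.group(2)), update
    raise ValueError("unrecognised AceDB version string: %r" % text)


if __name__ == "__main__":
    for example in ("4_9t", "4.9.27"):
        print(example, "->", parse_acedb_version(example))
```

Returning a tuple of integers means versions can be compared directly, which avoids string comparisons that would rank "4.9.9" above "4.9.27".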

Improvement in Tag Chooser

(This article is courtesy of Jean Thierry-Mieg mieg@ncbi.nlm.nih.gov)

The tag chooser is used in two situations: when you update an object or when you construct a Table-Maker query.

In both cases, a window pops up with light blue boxes showing the complete acedb model for the relevant class. All you have to do is select the relevant box and double click it with the left mouse button. In object update mode, this adds the tag to your object; in Table-Maker, it selects the tag for display in the column you are constructing.

Unfortunately, it is sometimes frustratingly hard to locate the relevant tag, because some of the acedb models are very large even though your database may well use only a small number of their tags.

However, a few months ago, I introduced a new toggle in the Tree display, available from the right mouse button, which shows, for each tag present in your object, how many times it is present in the complete database. This is particularly useful when you display the Model, since it then shows how many times each tag is used in the database, and hence how many tags are never used!

Well, this month, we decided to use the Tag-Count facility to improve the interface of the tag chooser. From now on, when you update an object (CTRL-U) or when you choose a tag in the Table-Maker graphic interface, the tag chooser will skip the tags which are never used in your database. Moreover, in Table-Maker, when you are searching Right-of a previous column (Button From/Right of), the tag chooser will now only propose the relevant section of the schema, i.e. the branch right of the tag selected in the parent column.

In both cases, you may still access all possible tags by using the new button 'Show all tags/Limit to known tags' provided in the tag chooser window. This is handy when you wish to add a new tag for the first time in object-update mode, or when you construct a Table definition for future use, once the data are actually present.

The same facility could be added in query by example or in query builder if there were some demand.

Please send feedback if you like or dislike this new option to acedb@sanger.ac.uk.

Exporting a stretch of DNA

(This article is courtesy of Jean Thierry-Mieg mieg@ncbi.nlm.nih.gov)

Let us recall that, given an active set of objects having DNA, either directly or indirectly, the command

  acedb> dna

dumps the corresponding DNA in fasta format. For example, if you have a mixed keyset containing mRNA sequences, genes, cosmids, chromosomes assembled from BACs and so on, the above command dumps all the corresponding sequences into a single fasta file. This command may be used, for instance, to reconstruct the sequence of a complete chromosome which has been entered in acedb as just a scaffold of cosmids, provided the sequence of most of the cosmids is known in the database. Unsequenced cosmids will then be represented as --- . If there are overlapping cosmids and they disagree in some places, the program will report an error. But using the option

  acedb> dna -mismatch

will dump the whole chromosome with a clear indication of the mismatches (e.g. T OR C will be represented by the corresponding letter Y, A OR T by W, following the standard sixteen letter code).
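
To make the ambiguity letters concrete, the following Python sketch maps a set of disagreeing bases to its standard IUPAC nucleotide code and merges two already-aligned reads column by column. It only illustrates the letter code; the function names are hypothetical and this is not how acedb itself computes or formats the -mismatch output.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases
# each letter stands for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def ambiguity_code(bases):
    """Return the IUPAC letter covering every base in `bases`."""
    return IUPAC[frozenset(bases.upper())]


def merge_aligned(seq_a, seq_b):
    """Merge two aligned, equal-length sequences column by column.

    ASSUMPTION: the inputs are already aligned and contain only a, c, g, t;
    real overlaps between cosmids would need coordinates and gap handling.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return "".join(ambiguity_code(a + b) for a, b in zip(seq_a, seq_b))


if __name__ == "__main__":
    print(ambiguity_code("TC"))           # T or C
    print(ambiguity_code("AT"))           # A or T
    print(merge_aligned("acgt", "atgt"))  # one disagreeing column
```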

Three other options let you restrict the export to specific parts of the transcript(s). You can choose to export the spliced transcript (this is the default option) using the command

  acedb> dna -spliced

the premessenger (unspliced transcribed region) using

  acedb> dna -unspliced

or the protein coding part of the transcript using

  acedb> dna -cds

Finally, the option

  acedb> dna -f filename

redirects the output to the named file.

--------

We have encountered the need to export only selected fragments of sequences, for example the last 100 base pairs of a collection of transcripts, say to analyse polyadenylation signals. To support this, we added to the dna command a new pair of parameters, -x1 and -x2, that specify the beginning and end of the fragments to export.

  acedb> dna -x1 u1 -x2 u2

The values u1 and u2 are each either a base number, which is automatically limited to lie within the DNA fragment (from 1 to the DNA length), or the word 'begin' or the word 'end'. If you specify either -x1 or -x2, you must specify both. The strand of the exported fragment is determined by the values of u1 and u2: if u1 < u2, the exported sequence is direct; if u1 > u2, the sequence is reverse complemented before the dump.

Examples:

  acedb> find sequence dna1
  // 1 Active Objects
  acedb> dna  // the complete sequence in fasta format
  >dna1
  tcagtcagtcagtcagtcag  
  acedb> dna -x1 end -x2 begin // same, but reverse complemented
  >dna1
  ctgactgactgactgactga  
  acedb> dna -x1 12 -x2 end  // the end of the sequence
  >dna1
  gtcagtcag  
  acedb> dna -x1 5 -x2 1 // the first 5 bp, reverse complemented
  >dna1
  actga 
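
The coordinate rules above can be sketched in a few lines of Python. This is only a model of the documented behaviour, not the acedb implementation: the function names are hypothetical, and the handling of u1 equal to u2 (treated here as a direct, single-base export) is an assumption, since the text above does not cover that case.

```python
# Translation table for complementing lower- and upper-case bases.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def resolve(u, length):
    """Turn 'begin', 'end' or a base number into a 1-based coordinate,
    limited to the range 1..length."""
    if u == "begin":
        return 1
    if u == "end":
        return length
    return max(1, min(int(u), length))


def dna_zone(seq, u1, u2):
    """Return the fragment from u1 to u2 (1-based, inclusive).

    If u1 > u2 the fragment is reverse complemented.
    ASSUMPTION: u1 == u2 yields the single base on the direct strand.
    """
    x1, x2 = resolve(u1, len(seq)), resolve(u2, len(seq))
    if x1 <= x2:
        return seq[x1 - 1:x2]
    return seq[x2 - 1:x1].translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    dna1 = "tcagtcagtcagtcagtcag"
    print(dna_zone(dna1, "end", "begin"))  # ctgactgactgactgactga
    print(dna_zone(dna1, 12, "end"))       # gtcagtcag
    print(dna_zone(dna1, 5, 1))            # actga
```

The three calls reproduce the fragments shown in the examples above for the same 20 bp sequence.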

This option is also available from the AceC interface, using the new command:

ac_zone_dna (sequence, u1, u2, memory_handle)


Bugs Fixed

AceC enhancement

AceC, the C language programmer's interface to acedb, has been upgraded. There are minor modifications to a few function calls, and a few bugs have been found and fixed.

For details, please read wac/ac.h and wac/README, or contact Jean Thierry-Mieg mieg@ncbi.nlm.nih.gov.


Developers Corner

The CVS directory for AceDB proposals

Just a brief reminder that if you have written up a proposal for AceDB then a good place to put it is in:

    winfo/Proposals/

Try to put the file into one of the existing subdirectories, e.g.

    winfo/Proposals/session/rotating_logs

A good tool for parsing changes in the CVS repository

In constructing this newsletter I scan all the changes made in the last month. Recently I have been using a free tool found on the web called "cvs2cl", a Perl script which turns the output from CVS into something more readable. You can see how I make use of this tool by looking at the script I run it from:

    winfo/Newsletters/cvschanges


April monthly build now available.

You can pick up the monthly builds from:

Sanger users:

    ~acedb/RELEASE.DEVELOPMENT

External users:

    http://www.acedb.org/Software/Downloads/monthly.shtml


Ed Griffiths <edgrif@sanger.ac.uk>
Last modified: Thu Jul 1 14:56:49 BST 2004