[Genome] Obtaining EST transcription orientation info
Robert Kuhn
kuhn at soe.ucsc.edu
Tue Dec 16 11:36:13 PST 2003
Maxim,
The sequence has a score of zero because the intron that becomes
obvious when the EST is aligned to the genome does not meet the
canonical criteria of NNNNexonNNNNgtnnnnintronnnnnnnnagNNNNexon,
where the two nucleotides on either end of the intron give an
unambiguous gt-----------ag. The algorithm that assigns direction
to an intron relies on finding the gt------ag in one direction
more often than in the other. That is what I meant by,
"The magnitude of the number gives the number of introns in that
direction minus the number of introns that calculated out in the
opposite direction."
Looked at another way, viewing the EST sequence itself:
gt/ag introns minus ct/ac introns = intronOrientation.
If intronOrientation is positive, the intron gets chevrons pointing in the same
direction as the EST, as described in my earlier message.
For ESTs that span only one intron (or none), the score will be zero if
the intron does not meet either of these criteria (in your case, the
sequence of the only intron is gg/ag).
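
As a rough illustration of the counting rule above, here is a minimal
Python sketch. It is based only on the description in this message and
is not the browser's actual code; the function name and the input format
(one pair of intron-end dinucleotides per intron, read on the EST's own
strand) are assumptions made for the example.

```python
def intron_orientation(intron_ends):
    """Return (# gt/ag introns) minus (# ct/ac introns).

    intron_ends: list of (first_two, last_two) dinucleotide pairs, one per
    intron, read in the direction of the EST sequence (hypothetical format).
    """
    score = 0
    for first_two, last_two in intron_ends:
        ends = (first_two.lower(), last_two.lower())
        if ends == ("gt", "ag"):
            score += 1      # canonical intron, same direction as the EST
        elif ends == ("ct", "ac"):
            score -= 1      # canonical intron seen as its reverse-complement
        # any other pair (for example gg/ag) counts toward neither direction
    return score
```

With a single gg/ag intron, as in AA679564, neither branch fires and the
score stays at zero.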
If intronOrientation is zero, the orientation is decided by the "direction"
field of the mrna table (which also includes ESTs). If direction = 3,
then the chevrons point opposite the sequence. Because your sequence
aligns to the + strand and has direction = 3 in the mrna table, the
chevrons point to the left. You should be suspicious of any assignment
based on this last criterion, as the 3' and 5' designations on ESTs
(particularly the 3') are sometimes reported as the reverse-complement.
For example, if an EST is reported as from the 3' end and yet submitted
as the reverse-complement (to coincide with the direction of the proposed
mRNA), the chevrons would point in the wrong direction.
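
Putting the rules from this thread together, the decision could be
sketched in Python roughly as follows. This is an illustration only,
not the browser's code. The function name and arguments are hypothetical,
and the behaviour when intronOrientation is zero and direction is not 3
(chevrons follow the alignment strand) is an assumption, since only the
direction = 3 case is described here.

```python
def chevron_strand(align_strand, intron_orientation, direction):
    """Return '+' or '-' for the genomic direction the chevrons point.

    align_strand:       '+' or '-', the strand the EST aligns to
    intron_orientation: value from estOrientInfo.intronOrientation
    direction:          value of the mrna table's direction field
    """
    if align_strand not in ("+", "-"):
        raise ValueError("align_strand must be '+' or '-'")
    opposite = {"+": "-", "-": "+"}

    if intron_orientation > 0:
        return align_strand             # introns agree with the EST
    if intron_orientation < 0:
        return opposite[align_strand]   # introns run against the EST
    if direction == 3:
        return opposite[align_strand]   # fall back on the 3' annotation
    return align_strand                 # assumption: otherwise keep strand


# direction is ignored when intron_orientation is nonzero; 0 is a placeholder
print(chevron_strand("+", -1, 0))  # BQ185398
print(chevron_strand("-", 1, 0))   # BQ189070
print(chevron_strand("+", 0, 3))   # AA679564
```

All three calls give '-', matching the leftward chevrons described for
these ESTs. The last case rests on the direction field alone, so it
deserves the same suspicion as any 3'/5' annotation.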
--b0b kuhn
> > > From maxim.shklar at weizmann.ac.il Tue Dec 16 02:38:41 2003
> > > To: <genome at soe.ucsc.edu>, "Robert Kuhn" <kuhn at cse.ucsc.edu>
> > > Cc: "Tsippi" <tsippi at dandesign.co.il>,
> > > "Orit Shmueli" <orit.shmueli at weizmann.ac.il>,
> > > "liora strichman-almashanu" <lioraa at wicc.weizmann.ac.il>,
> > > <marilyn.safran at weizmann.ac.il>
> > > Subject: Re: [Genome] Obtaining EST transcription orientation info
> > >
> > > Dear Robert,
> > >
> > > Thank you very much for your help. All the information you've given me so
> > > far has helped clarify a lot of subtle details.
> > >
> > > I have found an example - EST AA679564 (about which I have asked in the
> > > past) that aligns in the + direction to the genome, but is drawn with the
> > > chevrons pointing towards the negative direction (pointing left). I was
> > > hoping to confirm that its entry in the intronOrientation field of the
> > > estOrientInfo would have a '-' sign, but it is 0 instead, which means it
> > > should have been pointing right (if I understand correctly), or that there
> > > is additional orientation information stored elsewhere:
> > >
> > > #bin chrom chromStart chromEnd name intronOrientation sizePolyA revSizePolyA signalPos revSignalPos
> > > 1137 chr13 72459490 72460419 AA679564 0 0 0 0 0
> > >
> > > My question is - how do you decide the orientation of an EST? Is this
> > > information stored only in 'estOrientInfo', and if not, what else do I need
> > > to download?
> > >
> > > Thank you once again,
> > >
> > > Maxim Shklar
> > > GeneCards team
> > > Department of Molecular Genetics, Levine 109
> > > Weizmann Institute of Science
> > > maxim.shklar at weizmann.ac.il
> > > Tel: 08-9344406 053-382014 , Fax: 08-9344113
> > > ----- Original Message -----
> > > From: "Robert Kuhn" <kuhn at cse.ucsc.edu>
> > > To: <genome at soe.ucsc.edu>; <maxim.shklar at weizmann.ac.il>
> > > Sent: Tuesday, December 16, 2003 1:35 AM
> > > Subject: Re: [Genome] Obtaining EST transcription orientation info
> > >
> > >
> > > >
> > > > Maxim,
> > > >
> > > > The sign of the number in the intronOrientation field of the estOrientInfo
> > > > table gives the orientation =relative to the EST=. The magnitude
> > > > of the number gives the number of introns in that direction minus the
> > > > number of introns that calculated out in the opposite direction.
> > > >
> > > > To get the whole story, you need to know to which strand the EST aligns.
> > > > An example is the first two ESTs in the table for chr22. They both
> > > > appear in the same browser window with chevrons in the negative
> > > > direction, yet they have opposite signs in the intronOrientation field.
> > > > They appear in hg16 (July 2003 freeze) at:
> > > >
> > > > chr22:25,164,748-25,178,438
> > > >
> > > > BQ185398 aligns on the + strand. It has a value of -1 in the
> > > > orientation table. So it is oriented on the - strand.
> > > >
> > > > BQ189070 aligns on the - strand. It has a value of +1 in the
> > > > orientation table. So it is oriented on the - strand.
> > > >
> > > >
> > > > --b0b kuhn
> > > >
> > > >
> > > > > From genome-bounces at soe.ucsc.edu Mon Dec 15 08:49:40 2003
> > > > > To: <genome at soe.ucsc.edu>
> > > > > Subject: [Genome] Obtaining EST transcription orientation info
> > > > >
> > > > > Hi,
> > > > >
> > > > > I wanted to know how you generated the EST transcription orientation
> > > > > (not the alignment direction) data, and how I could download this
> > > > > information for all ESTs.
> > > > >
> > > > > Is this the information contained in the 'estOrientInfo' table, and if
> > > > > so, how do I interpret the numbers that appear in the intronOrientation
> > > > > field?
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Maxim Shklar
> > > > > GeneCards team
> > > > > Department of Molecular Genetics, Levine 109
> > > > > Weizmann Institute of Science
> > > > > maxim.shklar at weizmann.ac.il
> > > > > Tel: 08-9344406 053-382014 , Fax: 08-9344113
> > > > > _______________________________________________
> > > > > Genome maillist - Genome at soe.ucsc.edu
> > > > > http://www.soe.ucsc.edu/mailman/listinfo/genome
> > > > >
> > >
> >