[Genome] position discrepancy between ucsc and ncbi

Donna Karolchik donnak at soe.ucsc.edu
Tue Sep 13 08:28:50 PDT 2005


Hi Yu,

In placing RefSeq alignments on the browser, we use several filtering
criteria. These are outlined on the RefSeq Genes track description page:

"RefSeq mRNAs were aligned against the human genome using blat; those with
an alignment of less than 15% were discarded. When a single mRNA aligned in
multiple places, the alignment having the highest base identity was
identified. Only alignments having a base identity level within 0.1% of the
best and at least 96% base identity with the genomic sequence were kept."
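
As a rough illustration of the filtering described in that paragraph, here is
a minimal Python sketch. It is not the actual browser pipeline code. The
Alignment record, the filter_alignments function and its parameter names are
all hypothetical, and it assumes that "an alignment of less than 15%" refers
to the percentage of the mRNA covered by the alignment and that "within 0.1%
of the best" means 0.1 percentage points of base identity.

```python
from collections import defaultdict
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one blat alignment of an mRNA."""
    mrna: str        # RefSeq accession
    chrom: str       # chromosome the alignment lands on
    coverage: float  # percent of the mRNA aligned (assumed meaning of "15%")
    identity: float  # percent base identity with the genomic sequence


def filter_alignments(alignments, min_coverage=15.0, min_identity=96.0,
                      near_best=0.1):
    """Keep, per mRNA, the alignments that pass the three stated criteria."""
    # Step 1: discard alignments below the coverage threshold.
    by_mrna = defaultdict(list)
    for aln in alignments:
        if aln.coverage >= min_coverage:
            by_mrna[aln.mrna].append(aln)

    kept = []
    for alns in by_mrna.values():
        # Step 2: find the highest base identity for this mRNA.
        best = max(a.identity for a in alns)
        # Step 3: keep alignments near the best that also reach 96% identity.
        kept.extend(
            a for a in alns
            if a.identity >= min_identity and best - a.identity <= near_best
        )
    return kept
```

Note that more than one alignment per mRNA can survive this filter, which is
why a single RefSeq can appear at several positions in the browser.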

In many cases, RefSeq mRNAs will align in multiple places while still
meeting the above criteria. In these cases, we make no judgments about which
position is correct, but rather show all the multiple alignments in the
browser. It is then up to the user to make a judgment about which location
looks correct, using other evidence provided by the Genome Browser and
outside sources. If NCBI shows the known mapping location of the
gene as chr2, that certainly provides supporting evidence that the chr2
position may be the correct one. You may also want to open up the Known
Genes track, which incorporates protein data from UniProt in addition to
RefSeq mRNA data.

In the case of multiple alignments, there is a possibility that you are 
viewing a recent gene duplication event, but there is also a possibility 
that you are viewing a sequencing artifact (in the human genome, this is 
generally more likely if one of the duplicates aligns to one of the 
chrN_random chromosomes).
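
If you are checking many genes, a small script can separate the placements
on assembled chromosomes from those on chrN_random sequences. This is only a
sketch: the function names are hypothetical, and it assumes that the random
sequences can be recognized by a "_random" suffix in the chromosome name.

```python
def is_random_chrom(chrom: str) -> bool:
    """True for chrN_random-style names (assumed naming convention)."""
    return chrom.endswith("_random")


def split_placements(chroms):
    """Split chromosome names into (assembled, random) lists, keeping order."""
    assembled = [c for c in chroms if not is_random_chrom(c)]
    random_only = [c for c in chroms if is_random_chrom(c)]
    return assembled, random_only
```

A copy that appears only in the second list deserves a closer look before it
is treated as a real duplication.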

See our FAQ for some more suggestions on how to evaluate duplicate gene 
copies on different chromosomes: 
http://genome.ucsc.edu/FAQ/FAQtracks#tracks9.

-Donna
-----------------------------------
Donna Karolchik
UCSC Genome Bioinformatics Group
http://genome.ucsc.edu

----- Original Message ----- 
From: "Yu Huang" <yuhuang at usc.edu>
To: <genome at soe.ucsc.edu>
Sent: Sunday, September 11, 2005 2:39 PM
Subject: Re: [Genome] position discrepancy between ucsc and ncbi


> Yu Huang wrote:
>
>>Hi,
>>
>>I found for a nucleotide sequence in NCBI, its corresponding NCBI Gene
>>Id(locuslink) usually shows one position on chromosome. But ucsc shows >=2
>>positions. Like NM_198943, NCBI tells me it's on chromosome 2. UCSC
>>tells me it's on chromosome 1,2 and 15.  After i checked the upstream
>>sequence or 5UTR, they show almost same.
>>
>>
> One more question, if one gene is found to appear in multiple places in
> the whole genome. Are their upstream sequences (including UTR) almost
> same? In the examples i checked, the UTR is same, upstream shows a
> little difference.
>
>
> Thanks,
> Yu
> _______________________________________________
> Genome maillist  -  Genome at soe.ucsc.edu
> http://www.soe.ucsc.edu/mailman/listinfo/genome
> 


