NCBI and computational biology

Posted: Mon Jan 06, 2014 1:50 pm
by annettechu
How can you tell if you have found the complete protein or mRNA sequence for a gene on the NCBI website? I have been searching for a while now, and each hit has a sequence of a different length. I don't know which one to use.

Re: NCBI and computational biology

Posted: Wed Jan 08, 2014 5:37 pm
by deleted-183262
NCBI can be confusing when you are trying to find sequences. A protein or mRNA is often listed with several different lengths because there are often several forms of it (isoforms or transcript variants, typically produced by alternative mRNA splicing). In many cases, though, one form of the protein is the most common in the cell.

To find the "standard" protein sequence, I prefer to use UniProt (uniprot.org) instead of NCBI. If you search for your protein of interest and select the appropriate organism, there is a "Sequences" section about halfway down the entry page. It may list multiple isoforms of your protein, but the first isoform listed is typically the "standard" sequence. Finding the corresponding mRNA sequence is trickier, but NCBI mRNA records usually list the translation of each mRNA's coding region. If that translation matches the protein sequence you found at UniProt exactly, over its full length, then that mRNA is probably the right one.
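If you have several candidate mRNAs to check, the comparison can be scripted. Below is a minimal Python sketch, not a definitive tool. It assumes you have already pasted the mRNA and protein sequences in as plain strings, it uses the standard genetic code, and it simply takes the longest ATG-to-stop open reading frame on the given strand as the translation. The function names and the short sequences are made up for illustration. Real NCBI records give the exact coding-region coordinates, which are more reliable than guessing the longest ORF.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def longest_orf_translation(mrna):
    """Translate the longest complete ORF (ATG ... stop) on the given strand.

    Hypothetical helper: assumes the standard genetic code and ignores
    the reverse strand. Returns "" if no complete ORF is found.
    """
    # Remove whitespace, uppercase, and treat RNA (U) as DNA (T).
    seq = "".join(mrna.split()).upper().replace("U", "T")
    best = ""
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        protein = []
        for i in range(start, len(seq) - 2, 3):
            aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X = unrecognized codon
            if aa == "*":
                break
            protein.append(aa)
        else:
            continue  # ran off the end without a stop codon: incomplete ORF
        if len(protein) > len(best):
            best = "".join(protein)
    return best


def matches_reference(mrna, reference_protein):
    """True if the mRNA's longest ORF translates exactly to the reference."""
    reference = "".join(reference_protein.split()).upper()
    return longest_orf_translation(mrna) == reference


# Made-up toy sequences, only to show the call pattern.
toy_mrna = "GGCATGGCTTGGAAATAAGC"
toy_protein = "MAWK"
print(longest_orf_translation(toy_mrna))
print(matches_reference(toy_mrna, toy_protein))
```

An exact string match is deliberately strict: an mRNA for a different isoform will usually differ in length or in a stretch of residues, so it will fail the comparison.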

Good Luck!

Caleb