BLAST into the Past to Identify T. Rex's Closest Living Relative
|Areas of Science||Genetics & Genomics|
|Time Required||Average (6-10 days)|
|Material Availability||Readily available|
|Cost||Low ($20 - $50)|
Abstract
Believe it or not, scientists were recently able to recover tissue from a 68-million-year-old Tyrannosaurus rex fossil! Not only were they able to purify non-mineralized tissue, but they also succeeded in obtaining partial sequence information for protein molecules in the T. rex tissue. In this genomics science fair project, you will use the T. rex protein sequence to search sequence databases for its closest living relatives.
The objective of this genomics science fair project is to determine the closest living relative to the mighty Tyrannosaurus rex, using simple bioinformatics tools.
David B. Whyte, PhD, Science Buddies
Last edit date: 2020-06-23
Have you ever noticed that birds have scales on their feet? The reason they have scales is that, technically speaking, they are reptiles, and reptiles have scales. What about the feathers? Feathers are produced by tissues similar to those that produce scales. Also, birds lay eggs like other reptiles. Not only are birds considered reptiles, but scientists now generally agree that birds are, in fact, dinosaurs. Specifically, birds are members of the clade Maniraptora (a clade is a group of animals related by descent from a common ancestor). Maniraptorans all have shared skeletal features, including bone structures in the wrist and forelimb that were first used for grasping, but that were modified into wings during the evolution of birds.
The Maniraptora is a group of theropod dinosaurs. The major maniraptoran groups include:
- Aves: The birds, living dinosaurs.
- Dromaeosaurs: The "raptors," including Velociraptor, made famous in the movie Jurassic Park.
- Troodontids: Non-avian dinosaurs thought by some to be particularly intelligent.
- Therizinosaurs: Plant-eating theropods.
- Oviraptors: The fossil record contains evidence that these dinosaurs were devoted parents.
It is important to note that birds are not descended from Velociraptor or any of the other maniraptorans. Instead, all of these groups are derived from a common ancestor. Birds split from the other members of the group about 150 million years ago, in the Jurassic period. The non-avian dinosaurs became extinct over 65 million years ago, but the birds have flourished.
The evidence that birds are dinosaurs is based on detailed studies of fossils, as well as the biology of modern birds. Recently, a new avenue of analysis became available with the extraction of tissue from dinosaur bones. Dr. John Asara and other scientists published an account of their analysis of collagen proteins purified from bones of a Tyrannosaurus rex (T. rex) in the journal Science, which you can find in the Bibliography, below. They were able to obtain partial sequence information from the T. rex collagen proteins. Although the protein sequence they obtained is not complete (see the Experimental Procedure, below, for the actual sequence), it has enough information to allow searching of sequence databases.
Figure 1. T. rex head reconstruction at the Oxford University Museum of Natural History. (Wikipedia, 2006.)
BLAST is a program used to search databases of sequence information. For this science fair project, you will search SwissProt, a database of protein sequences. Each record has the protein sequence, as well as the authors who submitted the sequence, the article associated with the sequence, and other information.
In the Experimental Procedure, you will use BLAST to search the SwissProt protein database for sequences related to the T. rex sequence. If two organisms are descended from a recent common ancestor, their protein sequences will be similar. For example, the collagen genes in two species that split 1 million years ago will have fewer differences than those in two species that split 10 million years ago. This is because DNA accumulates mutations over time. If the rate at which mutations accumulate is constant, the number of mutations is proportional to the time since the species split. In other words, you can use protein or DNA sequence comparisons to establish how animals are related to each other.
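The constant-rate idea can be sketched in a few lines of Python. This is only an illustration, not part of the project procedure: the function names, the rate value, and the second (variant) peptide are made up for the example, and the model ignores complications such as repeated changes at the same position.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def expected_differences(pairwise_rate: float, million_years: float) -> float:
    """Expected differences between two species under a constant rate.

    pairwise_rate is a hypothetical number of differences that build up
    between the two species per million years since they split.
    """
    return pairwise_rate * million_years


# First seven residues of the T. rex fragment and an invented variant.
print(count_differences("GATGAPG", "GATGSPG"))

# With a constant (made-up) rate, ten times the elapsed time
# gives ten times the expected number of differences.
print(expected_differences(0.5, 1.0))
print(expected_differences(0.5, 10.0))
```

Running the calculation in reverse (dividing an observed number of differences by an assumed rate) gives a rough estimate of how long ago two species split.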
Using BLAST and publicly available databases, you can perform your own genomics science fair project, studying the evolutionary relationships of various animals. Now that the database contains sequence information for T. rex, you have the tools needed to investigate which of the organisms represented in the SwissProt database is most closely related to this extinct dinosaur.
Terms and Concepts
- Collagen proteins
- Fasta format
- Phylogenetic tree
- What does the acronym BLAST stand for?
- Based on your research, draw a family tree that includes birds, dinosaurs, reptiles, and mammals.
- What dinosaurs have been found to have feathers?
These websites offer more information about dinosaurs, specifically those discussed in this science fair project:
Asara, J.M., et al. (2008, April 25). Molecular Phylogenetics of Mastodon and Tyrannosaurus rex. Science, Vol. 320., No. 5875, p. 499. Retrieved August 25, 2008, from the National Center for Biotechnology Information website: http://www.ncbi.nlm.nih.gov/pubmed/18436782
- PDF copies of the article and the supplementary data are available here: Genom_p018_Tyrannosaurus_Genomics_Article.pdf and Genom_p018_Tyrannosaurus_Genomics_Supplementary_Data.pdf.
- Vergano, D. (2007, April 12). Yesterday's T. Rex is today's chicken. USA Today. Retrieved August 25, 2008.
- DinoBuzz, University of California Museum of Paleontology. (n.d.). Are birds really dinosaurs?. Retrieved August 25, 2008.
- DinoBuzz, University of California Museum of Paleontology. (n.d.). Maniraptora. Retrieved August 25, 2008.
These websites are useful resources for understanding how DNA can be used to build evolutionary trees and the bioinformatics tools used to do this:
Materials and Equipment
- Computer with access to the Internet
- Lab notebook
The procedure for this genomics science fair project has two sections: 1) Use BLAST to search SwissProt (an online database of protein sequences) for the best match to Tyrannosaurus rex sequence data (a query sequence is used to search the database), and 2) Build a tree graphically showing the relationship of T. rex to its living relatives.
Use BLAST to Search SwissProt
The partial sequence for the Tyrannosaurus rex collagen protein is pasted below. It is from the Science article by Asara, listed in the Bibliography at the end of the Background section. Regions where the protein sequence is not known have a hyphen (-) to represent a gap of indeterminate length. Each capital letter represents an amino acid in the protein sequence. Note that most of the protein was not successfully sequenced, but considering that the tissue was 68 million years old, it is remarkable that any sequence was obtained. The protein sequence is in FASTA format, which means that the sequence is preceded by a header line that starts with a ">" and ends with a line break. The FASTA format is the standard formatting used by bioinformatics software.
>Tyrannosaurus rex, collagen type I, alpha 1
-GATGAPGIAGAPGFPGARGAPGPQGPSGAPGPK-GVQGPPGPQGPR-GSAGPPGATGFPGAAGR-GVVGLPGQR-GLPGESGAVGPAGPPGSR-
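To see how software reads FASTA text, here is a minimal Python sketch that separates the header line from the sequence and then splits the sequence at the hyphens into the individual sequenced fragments. The function name `parse_fasta` and the variable names are chosen for this illustration; real bioinformatics libraries offer more complete parsers.

```python
TREX_FASTA = """>Tyrannosaurus rex, collagen type I, alpha 1
-GATGAPGIAGAPGFPGARGAPGPQGPSGAPGPK-GVQGPPGPQGPR-GSAGPPGATGFPGAAGR-GVVGLPGQR-GLPGESGAVGPAGPPGSR-
"""


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Return (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


header, sequence = parse_fasta(TREX_FASTA)[0]

# Hyphens mark gaps of unknown length, so split there to get the
# stretches of protein that were actually sequenced.
fragments = [piece for piece in sequence.split("-") if piece]

print(header)
for piece in fragments:
    print(len(piece), piece)
```

Printing the fragment lengths makes it easy to see how little of the full collagen protein was recovered.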
- Copy the sequence of the T. rex collagen protein above, including the header line (>Tyrannosaurus rex) and all of the hyphens.
- Open a BLAST page at the National Center for Biotechnology Information (NCBI).
- Go to the NCBI main page: https://www.ncbi.nlm.nih.gov/
- Click on BLAST to go to the BLAST page.
- Under "Basic BLAST," click on "protein BLAST."
- Fill out the protein BLAST query form, which should look similar to Figure 2, below, when you are done.
- Paste the T. rex sequence into the "Enter Query Sequence" box.
- For job title, use "Tyrannosaurus rex, collagen type 1, alpha... ."
- If you kept the header line (>Tyrannosaurus rex) at the top of the sequence, it will be added here automatically.
- Under "Database," choose the SwissProt protein database.
- Leave the box for "Organism" empty.
- If you would like to compare the genomes of animals other than the default ones, you could add them here by entering their names.
- In the box for "Entrez Query," type "COL1A1 [GENE] AND 1400:1500 [SLEN]".
- COL1A1 [GENE]: This limits the search to collagen type 1, alpha 1, which is the collagen type for the T. rex sequence. Otherwise, the BLAST result lists hits with other collagen types.
- 1400:1500 [SLEN]: This limits the BLAST output to sequences between 1400 and 1500 amino acids. Otherwise, BLAST finds small partial sequences.
- Under "Algorithm," select blastp (protein-protein BLAST).
- Click on the BLAST button on the bottom of the page to begin the search.
- If you would like to explore BLAST options, click on "Algorithm Parameters."
Screenshot of the search page on the ncbi.nlm.nih.gov website. At the top of the BLAST query search page there is a text box where users can fill in search terms. Other options are available under the search box that allow for different databases to be searched and to limit searches through keywords or IDs.
Figure 2. Protein BLAST (blastp) query input page. Your query page should look similar to this one after you have filled it in by following step 4, above.
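The same form fields can also be written down as a set of request parameters. The Python sketch below only builds and prints such a request; it does not contact NCBI. The parameter names (CMD, PROGRAM, DATABASE, QUERY, ENTREZ_QUERY), the database name "swissprot", and the URL are assumptions based on NCBI's BLAST URL interface and should be checked against the current NCBI documentation before use. The helper name `build_blast_request` is made up for this example.

```python
from urllib.parse import parse_qs, urlencode

# Assumed address of the NCBI BLAST URL interface (verify before use).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_request(fasta_query: str, gene: str = "COL1A1",
                        min_len: int = 1400, max_len: int = 1500):
    """Build (params, encoded_body) mirroring the web form settings."""
    # Same text that the procedure types into the "Entrez Query" box.
    entrez_query = f"{gene} [GENE] AND {min_len}:{max_len} [SLEN]"
    params = {
        "CMD": "Put",             # submit a new search
        "PROGRAM": "blastp",      # protein-protein BLAST
        "DATABASE": "swissprot",  # SwissProt protein database
        "QUERY": fasta_query,     # header line plus sequence
        "ENTREZ_QUERY": entrez_query,
    }
    return params, urlencode(params)


query = (">Tyrannosaurus rex, collagen type I, alpha 1\n"
         "-GATGAPGIAGAPGFPGARGAPGPQGPSGAPGPK-GVQGPPGPQGPR-"
         "GSAGPPGATGFPGAAGR-GVVGLPGQR-GLPGESGAVGPAGPPGSR-")
params, body = build_blast_request(query)

print(BLAST_URL)
print(params["ENTREZ_QUERY"])
# Decoding the body again confirms nothing was lost in the encoding.
print(parse_qs(body)["QUERY"][0] == query)
```

Writing the settings out this way is a handy record for your lab notebook, because it captures exactly which database and filters produced your results.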
- Be patient. The search can take a few minutes, and the BLAST results page will appear when it is complete. See Figure 3, below, for a snapshot of the top of the output page.
- Scroll down to see the sequences from other genomes that produce significant alignments with the T. rex query sequence. What organism is most closely related to T. rex, based on similarity of collagen genes?
- Note: As more sequences are added to the database, the list of hits will increase.
Screenshot of the results page in the BLAST tool on the ncbi.nlm.nih.gov website shows a visual summary of protein sequences that match a search term. Results are listed at the bottom of the page and provide additional information such as the percentage match a result has to the specific query string that was searched for.
Figure 3. Top of the BLAST output page. Scroll down to see the sequences from other genomes that produce significant alignments (in the "Descriptions" section). Note the "Score" columns. Genes with the highest scores are most closely related to the T. rex query sequence. The E value is the number of hits with a score at least this high that you would expect to find by chance in a database of this size. The smaller the E value, the less likely it is that the match arose by chance. Below the list of hits are the alignments of the T. rex amino acid sequence with the sequences in the database. Note the "Identities" value, which is the percent of amino acids that are the same in the query and the database sequence. "Positives" measures the percent of amino acids that remain the same or that were changed into similar amino acids.
- Make a data table based on the BLAST output. List the organism's scientific name, common name, the score, and the E value.
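To get a feel for two of the numbers in the BLAST output, the Python sketch below computes a percent identity for a pair of aligned sequences and a simplified E value from a bit score. The E value formula used here, E = m × n × 2^(-S), with m the query length, n the database length, and S the bit score, is the textbook simplification; BLAST itself applies additional length corrections, so the numbers will not match the website exactly. The function names, the variant peptide, and the example lengths are made up for illustration.

```python
def percent_identity(query: str, subject: str) -> float:
    """Percent of aligned positions with the same amino acid."""
    if not query or len(query) != len(subject):
        raise ValueError("sequences must be aligned, equal-length, non-empty")
    matches = sum(1 for q, s in zip(query, subject) if q == s)
    return 100.0 * matches / len(query)


def expect_value(bit_score: float, query_length: int,
                 database_length: int) -> float:
    """Simplified E value: chance hits expected at this bit score."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Start of the T. rex fragment versus an invented one-letter variant.
print(round(percent_identity("GATGAPG", "GATGSPG"), 1))

# A higher bit score gives a smaller E value for the same search size.
for score in (10, 20, 40):
    print(score, expect_value(score, query_length=89,
                              database_length=1_000_000))
```

Notice that the E value depends on the size of the database as well as the score, which is one reason hit lists change as more sequences are added.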
Make a Phylogenetic Tree
In this section, you will use the BLAST output to make a tree that graphically depicts the degree of similarity of the proteins. There are more sophisticated ways to generate a phylogenetic tree, which you can explore in the variations in the Make It Your Own section.
- Click on "Distance tree of results" to generate a tree of the BLAST results.
- This will create a tree in "rectangular" format.
- Select "Fast minimum evolution" for tree method.
- Leave "Max sequence difference" at the default value of 0.85.
- Click on "Grishin General (protein)" for the "Distance" parameter.
- Click on the "Force" tab to create a certain type of tree.
- For "Sequence Label," select "Taxonomic Name."
- Add the common name for each species to the tree.
- This might be easier if you redraw the tree by hand or use a computer graphics program.
- Add BLAST data, such as the % identity, to the tree.
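If you are curious how a tree can be built from pairwise distances, the Python sketch below uses UPGMA, a simple clustering method that repeatedly joins the two closest groups. Note that this is not the "Fast minimum evolution" method selected on the BLAST page; it is a simpler stand-in chosen for illustration. The labels A, B, and C and their distances are placeholders, not real BLAST results.

```python
def upgma(names, distances):
    """Cluster labels by average distance and return a Newick-style string.

    distances maps frozenset({label_1, label_2}) to a distance.
    """
    sizes = {name: 1 for name in names}  # cluster label -> member count
    d = dict(distances)
    while len(sizes) > 1:
        # Find the closest pair of current clusters.
        a, b = min(
            ((x, y) for x in sizes for y in sizes if x < y),
            key=lambda pair: d[frozenset(pair)],
        )
        merged = f"({a},{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        # Distance from the new cluster to each remaining cluster is the
        # size-weighted average of the distances from its two parts.
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


# Placeholder distances, for example 1 - (percent identity / 100).
example = {
    frozenset(("A", "B")): 0.1,
    frozenset(("A", "C")): 0.4,
    frozenset(("B", "C")): 0.5,
}
print(upgma(["A", "B", "C"], example))
```

The nested parentheses in the output show which groups were joined first, which is the same information a drawn tree conveys with its branches.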
Build a phylogenetic tree based on the sequences used in the original paper by Asara, et al.
- First, download the sequences for the collagen genes that were used in the Asara paper, from http://www.sciencebuddies.org/science-fair-projects/project_ideas/Genom_p018_Tyrannosaurus_Genomics_Supplementary_Data.pdf.
- Open the ClustalW2 tool at the EBI Bioinformatics site (also referenced in the Bibliography): https://www.ebi.ac.uk/Tools/msa/clustalw2/.
- Paste the sequences into the entry box and click "Run."
- After the alignment is complete, click on "Jalview" for a set of tools for making various trees.
- Add pictures of the animals to the appropriate branches of the phylogenetic tree.