Summary

Areas of Science
Difficulty
 
Time Required
Average (6-10 days)
Prerequisites
None
Material Availability
Readily available
Cost
Low ($20 - $50)
Safety
No issues
Credits

David B. Whyte, PhD, Science Buddies
Edited: Svenja Lohner, PhD, Science Buddies

Abstract

Believe it or not, scientists were recently able to recover tissue from a 68-million-year-old Tyrannosaurus rex fossil! Not only were they able to purify non-mineralized tissue, but they also succeeded in obtaining partial sequence information for protein molecules in the T. rex tissue. In this genomics science fair project, you will use the T. rex's protein sequence to search sequence databases for its closest living relatives.

Objective

The objective of this genomics science fair project is to determine the closest living relative to the mighty Tyrannosaurus rex, using simple bioinformatics tools.

Introduction

Have you ever noticed that birds have scales on their feet? The reason they have scales is that, technically speaking, they are reptiles, and reptiles have scales. What about the feathers? Feathers are produced by tissues similar to those that produce scales. Also, birds lay eggs like other reptiles. Not only are birds considered reptiles, but scientists now generally agree that birds are, in fact, dinosaurs. Specifically, birds are members of the clade Maniraptora (a clade is a group of animals related by descent from a common ancestor). Maniraptorans all have shared skeletal features, including bone structures in the wrist and forelimb that were first used for grasping, but that were modified into wings during the evolution of birds.

Maniraptora is a group of theropod dinosaurs. The major maniraptoran groups include:

It is important to note that birds are not descended from velociraptors or any of the other maniraptorans. They are all derived from a common ancestor. Birds split from the other members of the group about 150 million years ago, in the Jurassic period. The non-avian dinosaurs became extinct over 65 million years ago, but the birds have flourished.

The evidence that birds are dinosaurs is based on detailed studies of fossils, as well as the biology of modern birds. Recently, a new avenue of analysis became available with the extraction of tissue from dinosaur bones. Dr. John Asara and other scientists published an account of their analysis of collagen proteins purified from bones of a Tyrannosaurus rex (T. rex) in the journal Science, which you can find in the Bibliography section below. They were able to obtain partial sequence information from the T. rex collagen proteins. Although the protein sequence they obtained is not complete (see the Procedure section for the actual sequence), it has enough information to allow searching of sequence databases.

Figure 1. T. rex head reconstruction at the Oxford University Museum of Natural History. (Wikipedia, 2006.)

BLAST is a program used to search databases of sequence information. For this science fair project, you will search SwissProt, a database of protein sequences. Each record has the protein sequence, as well as the authors who submitted the sequence, the article associated with the sequence, and other information.

In the Procedure, you will use BLAST to search the SwissProt protein database for sequences related to the T. rex sequence. If two organisms are descended from a recent common ancestor, their protein sequences will be similar. For example, the collagen genes of two species that split 1 million years ago will have fewer differences than those of two species that split 10 million years ago. This is because DNA accumulates mutations over time. If the rate at which mutations accumulate is constant, the number of mutations is proportional to the time since the species split. The mutations that accumulate over time are useful in phylogenetic analysis. You might ask, what is phylogenetic analysis and how are mutations used? Phylogenetic analysis simply means the study of evolutionary relationships among organisms. Mutations play a role in phylogenetic analysis because they are ideal "tags" for lines of descent. The presence of a particular mutation within a group of animals is evidence of a common ancestor. Most mutations are not passed down to subsequent generations, but some do become common within the population. Based on the differences in DNA or protein sequences, one can create a phylogenetic tree, which is a tree-like diagram showing the evolutionary relationships among different biological species based upon similarities and differences in their genetic characteristics. In other words, you can use protein or DNA sequence comparisons to establish how animals are related to each other. You can watch the two videos below to learn more about phylogenetics and reading phylogenetic trees.

Phylogenetics and Reading Phylogenetic Trees
Phylogenetic analysis of pathogens (lecture - part 1)
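
The idea that differences pile up in proportion to time can be tried out in a few lines of Python. This is only a rough sketch: the three short sequences and the rate `HYPOTHETICAL_RATE` are made-up placeholders, not real collagen data or a measured mutation rate, and the calculation ignores complications such as repeated changes at the same position.

```python
def count_differences(seq_a, seq_b):
    """Count positions where two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def estimated_split_time(differences, length, rate_per_site_per_myr):
    """Estimate the time since two lineages split, assuming a constant rate.

    Both lineages change independently after the split, so the observed
    difference per site is divided by twice the rate.
    """
    per_site = differences / length
    return per_site / (2 * rate_per_site_per_myr)


# Made-up aligned fragments (placeholders, not real data).
reference = "GATGAPGIAGAPGFPGARGA"
close_relative = "GATGAPGIAGAPGFPGPRGA"
distant_relative = "GPTGAPGISGAPGFPGPRGA"

# Made-up rate: changes per site per million years (an assumption).
HYPOTHETICAL_RATE = 0.01

d_close = count_differences(reference, close_relative)
d_distant = count_differences(reference, distant_relative)
t_close = estimated_split_time(d_close, len(reference), HYPOTHETICAL_RATE)
t_distant = estimated_split_time(d_distant, len(reference), HYPOTHETICAL_RATE)

print("differences:", d_close, d_distant)
print("estimated split times (million years):", t_close, t_distant)
```

With a constant rate, the sequence with three times as many differences gets an estimated split time three times as long, which is the proportionality described above.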

Using BLAST and publicly available databases, you can perform your own genomics science fair project, studying the evolutionary relationships of various animals. Now that the database contains sequence information for T. rex, you have the tools needed to investigate which of the organisms represented in the SwissProt database is most closely related to this extinct dinosaur.

Terms and Concepts

Questions

Bibliography

These websites offer more information about dinosaurs, specifically those discussed in this science fair project:

These websites are useful resources for understanding how DNA can be used to build evolutionary trees and the bioinformatics tools used to do this:

Materials and Equipment

Experimental Procedure

The procedure for this genomics science fair project has two sections: 1) Use BLAST to search SwissProt (an online database of protein sequences) for the best match to Tyrannosaurus rex sequence data (a query sequence is used to search the database), and 2) Build a tree graphically showing the relationship of T. rex to its living relatives.

Before you start with this project, it might be helpful to familiarize yourself with the bioinformatic tools that you are going to use. You can watch the video below to learn more about the BLAST tool on the NCBI website.

How to Use BLAST for Finding and Aligning DNA or Protein Sequences

Use BLAST to Search SwissProt

  1. The partial sequence for the Tyrannosaurus rex collagen protein is pasted below. It is from the Science article by Asara, listed in the Bibliography at the end of the Background section. Regions where the protein sequence is not known have a hyphen (-) to represent a gap of indeterminate length. The capital letters each represent an amino acid in the protein sequence. Note that most of the protein was not successfully sequenced, but considering that the tissue was 68 million years old, it is remarkable that any sequence was obtained. The protein sequence is in FASTA format, which means that the sequence is preceded by a header line that starts with a ">" and ends with a line break. The FASTA format is the standard formatting used by bioinformatics software.
    >Tyrannosaurus rex, collagen type I, alpha 1 
    -GATGAPGIAGAPGFPGARGAPGPQGPSGAPGPK-GVQGPPGPQGPR-GSAGPPGATGFPGAAGR-GVVGLPGQR-GLPGESGAVGPAGPPGSR-
    
  2. Copy the sequence of the T. rex collagen protein above, including the header line (>Tyrannosaurus rex) and all of the hyphens.
  3. Open a BLAST page at the National Center for Biotechnology Information (NCBI).
    1. Go to the NCBI main page.
    2. Click on the BLAST link in the "Popular Resources" list on the right to get to the BLAST page.
    3. There are several versions of BLAST. Since you want to use a protein sequence to search a protein database, click on "Protein BLAST" under the "Web BLAST" heading.
  4. Fill out the protein BLAST query form, which should look similar to Figure 2, below, when you are done.
    1. Paste the T. rex sequence into the "Enter Query Sequence" box.
    2. For job title, use "Tyrannosaurus rex, collagen type 1, alpha...".
      1. If you kept the header line (>Tyrannosaurus rex) at the top of the sequence, it will be added here automatically.
    3. Under "Database," choose the "UniProtKB/Swiss-Prot(swissprot)" protein database.
    4. Leave the box for "Organism" empty.
      1. If you would later like to restrict the search to particular animals, you can enter their names here.
    5. Under "Algorithm," select blastp (protein-protein BLAST).
    6. Next to the BLAST button, check the box "Show results in a new window."
    7. Click on "Algorithm Parameters" underneath the BLAST button. In the "General Parameters" section, select 10 for the Max target sequences. This will limit your search to the closest 10 protein sequences and simplify your phylogenetic tree in the following section.
      1. You can expand your search to 50 or more target sequences later and also explore other BLAST options in the "Algorithm Parameters" section.
    8. Then click on BLAST to start the search.

    Figure 2. Protein BLAST (blastp) query input page. Your query page should look similar to this one after you have filled it in by following step 4, above.
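
If you are curious how software reads the FASTA record you pasted into the query box, here is a minimal sketch in Python. The helper name `parse_fasta` is an illustration written for this example, not part of BLAST or any NCBI tool. It separates the header from the sequence and then splits the sequence at the hyphens to list the fragments that were actually sequenced.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


query = """>Tyrannosaurus rex, collagen type I, alpha 1
-GATGAPGIAGAPGFPGARGAPGPQGPSGAPGPK-GVQGPPGPQGPR-GSAGPPGATGFPGAAGR-GVVGLPGQR-GLPGESGAVGPAGPPGSR-
"""

header, sequence = parse_fasta(query)[0]

# Hyphens mark gaps of unknown length, so splitting on them gives the
# separately sequenced fragments.
fragments = [fragment for fragment in sequence.split("-") if fragment]

print(header)
for fragment in fragments:
    print(len(fragment), fragment)
```

The fragment lengths make it easy to see how little of the full collagen protein was recovered.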

  5. Be patient. It will take a few minutes for the BLAST results to appear. See Figure 3, below, for a snapshot of what the results page should look like.
  6. On the top left of the BLAST results page you will find the summary section (blue in Figure 3), which provides information on different aspects of your search. On the top right there is a box that allows you to filter your results based on certain criteria (red in Figure 3). Below the top section, the BLAST results are shown (yellow in Figure 3). There are four different tabs called "Description," "Graphic Summary," "Alignments," and "Taxonomy." Each tab presents the search results in a different way.
    1. The "Description" tab contains a summary table of hits found by BLAST and is the default tab shown.
    2. The "Graphic Summary" tab shows a color key of the alignments. The color key shows the degree of similarity for the sequences.
    3. The "Alignments" tab contains the detailed pairwise alignments between query and database sequences.
    4. The "Taxonomy" tab provides details of the taxonomic distribution of matches BLAST found.
  7. Review each of the four tabs. Based on the information provided, can you tell what living organism is most closely related to T. rex, based on similarity of collagen genes?
    1. Scroll down the result list in the "Description" tab. Note the "Max Score" column. Proteins with the highest scores are most similar to the T. rex query sequence. The E value is the number of matches with a score at least this high that you would expect to find by chance alone in a database of this size. The smaller the E value, the more certain it is that the sequences are related. You will also find the scientific names of each species that matched the T. rex collagen sequence in the "Scientific Name" column.
    2. In the "Alignments" tab you will find the alignments of the T. rex amino acid sequence with the sequences in the database. Note the "Identities" value, which is the percent of amino acids that are the same in the query and the database sequence. "Positives" measures the percent of amino acids that remain the same or that were changed into similar amino acids. If the % identity between two species is 97%, then these two species differ by 3% in the protein sequence. Remember, the larger the % difference, the more distant they are in the family tree.
  8. Make a data table based on the BLAST output. List the organism's scientific name, common name, the score, % identity, and the E value. View each of the four result tabs to find the information you need.
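
One way to keep the data table is as a small CSV file that any spreadsheet program can open. The sketch below, in Python, uses the columns listed in the last step plus a percent-difference column (100 minus the % identity). The two rows are placeholders with made-up names and values, not real BLAST output; replace them with your own hits.

```python
import csv
import io

COLUMNS = [
    "scientific_name",
    "common_name",
    "max_score",
    "percent_identity",
    "percent_difference",
    "e_value",
]

# Placeholder rows with made-up values; replace them with your own BLAST hits.
rows = [
    {"scientific_name": "Species alpha", "common_name": "example animal A",
     "max_score": 100.0, "percent_identity": 90.0, "e_value": 1e-20},
    {"scientific_name": "Species beta", "common_name": "example animal B",
     "max_score": 120.0, "percent_identity": 95.0, "e_value": 1e-25},
]

# 97% identity means the sequences differ by 3%, and so on.
for row in rows:
    row["percent_difference"] = 100 - row["percent_identity"]

# Highest score first, so the closest match heads the table.
ranked = sorted(rows, key=lambda row: row["max_score"], reverse=True)

# Written to memory here; use open("blast_hits.csv", "w", newline="")
# instead of io.StringIO() to save a real file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(ranked)
table_text = buffer.getvalue()

print(table_text)
```

Sorting by score is a design choice for readability; you could sort by % identity or E value instead and compare whether the ranking changes.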

Figure 3. Snapshot of the BLAST output page.

Make a Phylogenetic Tree

In this section, you will use the BLAST output to make a tree that graphically depicts the degree of similarity of the proteins. There are more sophisticated ways to generate a phylogenetic tree, which you can explore in the variations in the Make It Your Own section.

  1. To generate a tree of the BLAST results, click on "Distance tree of results" next to "Other reports" above the results table in the summary section of your BLAST report. This tree includes all of the hits from the BLAST search.
    1. The tree will be displayed in "rectangular" format.
    2. Your query sequence will be highlighted in yellow.
  2. Keep all the default settings for the tree parameters but change the "Sequence Label" to "Taxonomic Name" in the drop-down list.
  3. Add the common name for each species to the tree.
    1. This might be easier if you redraw the tree by hand or use a computer graphics program.
  4. Add BLAST data, such as the % identity, to the tree.
  5. What does the phylogenetic tree tell you about the closest living relatives of T. rex? If you need help reading the phylogenetic tree, you can view the Phylogenetics and Reading Phylogenetic Trees or the Phylogenetic analysis of pathogens video.
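
To get a feel for how a tree can be built from pairwise differences, here is a minimal sketch of UPGMA (average-linkage clustering) in Python. It is a simple textbook method chosen for illustration and is not necessarily the method the BLAST tree view uses. The labels and distances are made-up placeholders standing in for % differences from your own data table.

```python
from itertools import combinations


def upgma(distances):
    """Build a tree by UPGMA (average-linkage clustering).

    distances maps a pair (a, b) of leaf names to their distance.
    Returns the tree as a Newick-style string without branch lengths.
    """
    sizes = {}  # current clusters: Newick text -> number of leaves inside
    dist = {}   # distance between any two current clusters
    for (a, b), value in distances.items():
        sizes[a] = sizes[b] = 1
        dist[frozenset((a, b))] = value

    while len(sizes) > 1:
        # Merge the two clusters that are currently closest.
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: dist[frozenset(pair)])
        merged = f"({a},{b})"
        merged_size = sizes[a] + sizes[b]
        for other in sizes:
            if other in (a, b):
                continue
            # Size-weighted average keeps every leaf equally weighted.
            dist[frozenset((merged, other))] = (
                sizes[a] * dist[frozenset((a, other))]
                + sizes[b] * dist[frozenset((b, other))]
            ) / merged_size
        del sizes[a], sizes[b]
        sizes[merged] = merged_size

    return next(iter(sizes)) + ";"


# Made-up % differences between a query and three placeholder species.
toy_distances = {
    ("query", "A"): 5, ("query", "B"): 10, ("query", "C"): 30,
    ("A", "B"): 9, ("A", "C"): 29, ("B", "C"): 31,
}

tree = upgma(toy_distances)
print(tree)
```

In the printed text, names that share the innermost parentheses were joined first, so they are the closest pair in the toy data.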

Variations

  • Build a phylogenetic tree based on the sequences used in the original paper by Asara, et al.
    1. First, download the sequences for the collagen genes that were used in the Asara paper, Molecular Phylogenetics of Mastodon and Tyrannosaurus rex.
    2. Open the ClustalW2 tool at the EBI Bioinformatics site (also referenced in the Bibliography).
    3. Paste the sequences into the entry box and click "Run."
    4. After the alignment is complete, click on "Jalview" for a set of tools for making various trees.
  • Add pictures of the animals to the appropriate branches of the phylogenetic tree.

Careers

If you like this project, you might enjoy exploring these related careers:

Career Profile
Many aspects of people's daily lives can be summarized using data, from what is the most popular new video game to where people like to go for a summer vacation. Data scientists (sometimes called data analysts) are experts at organizing and analyzing large sets of data (often called "big data"). By doing this, data scientists make conclusions that help other people or companies. For example, data scientists could help a video game company make a more profitable video game based on players'… Read more
Career Profile
The human body can be viewed as a machine made up of complex processes. Scientists are working on figuring out how these processes work and on sequencing and correlating the sections of the genome that correspond to the individual processes. (The genome is an organism's complete set of genetic material.) In the course of doing so, they generate large amounts of data. So large, in fact, that to make sense of it, the data must be organized into databases and labeled. This is where bioinformatics… Read more
Career Profile
Growing, aging, digesting—all of these are examples of chemical processes performed by living organisms. Biochemists study how these types of chemical actions happen in cells and tissues, and monitor what effects new substances, like food additives and medicines, have on living organisms. Read more

Cite This Page

General citation information is provided here. Be sure to check the formatting, including capitalization, for the method you are using and update your citation, as needed.

MLA Style

Science Buddies Staff. "BLAST into the Past to Identify T. Rex's Closest Living Relative." Science Buddies, 9 Oct. 2021, https://www.sciencebuddies.org/science-fair-projects/project-ideas/Genom_p018/genetics-genomics/t-rex-closest-living-relative. Accessed 26 Jan. 2022.

APA Style

Science Buddies Staff. (2021, October 9). BLAST into the Past to Identify T. Rex's Closest Living Relative. Retrieved from https://www.sciencebuddies.org/science-fair-projects/project-ideas/Genom_p018/genetics-genomics/t-rex-closest-living-relative


Last edit date: 2021-10-09