Abstract
Have you ever tried to pack a suitcase? If so, you know that no matter how hard you try, there is a limit to the amount you can cram in, which means if you have more stuff, you need a bigger suitcase! Do you think the same principle applies to DNA in a cell? Does an animal with a bigger genome need a larger cell nucleus to store its DNA? Try this science project and find out!
Summary
Sandra Slutz, PhD, Science Buddies
- Microsoft© is a registered trademark of Microsoft Corporation.
- Excel© is a registered trademark of Microsoft Corporation.

Objective
In this science project you will determine whether there is a correlation between an animal's genome size and cell nucleus size.
Introduction
Every living organism, whether it's a rose, a bacterium, a fly, or an elephant, has a set of blueprints telling its cells how to form the organism, how to make all the proteins it needs, and how to function. These blueprints are stored as deoxyribonucleic acid (DNA) in the organism's cells. There are four different DNA units, or nucleotide bases: adenine (A), thymine (T), guanine (G), and cytosine (C). A nearly infinite number of combinations can be made from repeating these four bases in sequences of different lengths, leading to very precise instructions on how to build organisms as different as a human and a jellyfish.
It takes a lot of nucleotide bases to set out all the instructions an organism needs. The total set of instructions for a human is over 3 billion nucleotides and would stretch almost 6 feet in length if you lined up all the bases! Yet all this DNA has to fit into a cell, because almost every single cell in an organism contains the same set of DNA. To make it all fit, the DNA is tightly wound into structures called chromosomes and packaged into a membrane-bound compartment in the cell, called the nucleus. Many organisms are diploid, meaning they have two copies of every chromosome: one from their mother and one from their father. A single copy of all the chromosomes is referred to as the genome. For example, humans have 23 pairs of chromosomes (for a total of 46 individual chromosomes) and the human genome consists of a single copy of each of those 23 chromosomes.
Genome size is measured in terms of weight. The more DNA there is, the greater the weight, and the larger the genome. Genome size is often reported in a tiny unit of measurement called the picogram (pg); one picogram is a trillionth of a gram. Picogram measurements of genome size are also referred to as an organism's C-value, and are used to compare the size of genomes across different species.
There is a large variation in genome size among different animals. However, genome size is not correlated with the complexity of the animal or even with the number of genes it has. Much of the variation in genome size is from sequences of DNA that do not code for proteins. Scientists are still trying to figure out why genome size varies so much. To give you an idea of the range in genome size, the smallest animal genome, belonging to the plant-parasitic nematode Pratylenchus coffeae, is 0.02 pg, while the largest animal genome, from the marbled lungfish Protopterus aethiopicus, is just over 132 pg. That's more than a 6,000-fold difference in DNA content! That means the lungfish has to fit a lot more DNA into its nuclei than the nematode does. Does this lead to a change in the size of the nucleus? Is there a correlation between genome size and nucleus size?
In this science project you'll use two databases to determine the answer. You'll record both the C-value, from the Animal Genome Size Database, and the nucleus size, from the Cell Size Database, of different amphibians, birds, and fish, then graph the data to see if there is a relationship between genome size and nucleus size. You won't be able to include mammals in this comparison because the Cell Size Database uses erythrocyte (red blood cell) measurements, and mammals are an exception because their erythrocytes don't have nuclei. But the erythrocytes of amphibians, birds, and fish do, so let's get graphing!
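The unit conversion and the fold difference quoted above can be checked with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and the lungfish value is rounded down to 132 pg, so the result is a lower bound on the true ratio.

```python
# One picogram is a trillionth of a gram.
PICOGRAMS_PER_GRAM = 1e12


def picograms_to_grams(picograms):
    """Convert a C-value in picograms (pg) to grams."""
    return picograms / PICOGRAMS_PER_GRAM


def fold_difference(larger, smaller):
    """Return how many times bigger `larger` is than `smaller`."""
    return larger / smaller


# C-values quoted in the text (the lungfish is "just over" 132 pg).
nematode_pg = 0.02
lungfish_pg = 132.0

print("Nematode genome in grams:", picograms_to_grams(nematode_pg))
print("Lungfish genome in grams:", picograms_to_grams(lungfish_pg))
print("Fold difference:", round(fold_difference(lungfish_pg, nematode_pg)))
```

The ratio comes out at roughly 6,600, which agrees with the "more than 6,000-fold" figure.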

Terms and Concepts
- Deoxyribonucleic acid (DNA)
- Nucleotide base (also called nucleotide)
- Adenine (A)
- Thymine (T)
- Guanine (G)
- Cytosine (C)
- Chromosome
- Nucleus
- Diploid
- Genome
- Picogram (pg)
- C-value
- Correlation
- Erythrocyte
Questions
- What is a genome?
- How is genome size measured?
- Where is DNA stored in animal cells?
- How do you determine if two variables are correlated?
Bibliography
You will use these two databases to gather the data for this science project.
- Gregory, T.R. (2003, October 16). Cell Size Database. Retrieved May 10, 2008.
- Gregory, T.R. (2008). Animal Genome Size Database. Retrieved May 10, 2008.
- GlaxoSmithKline. (2006). Kids Genetics. Retrieved May 10, 2008.
- National Human Genome Research Institute. (2007, December 27). A Brief Guide to Genomics. Retrieved May 10, 2008.
- National Center for Biotechnology Information. (2004, March 31). What Is a Genome? Retrieved May 10, 2008.
These websites offer help with creating graphs and more information about correlation.
- National Center for Education Statistics. (n.d.). Create a Graph. Retrieved May 10, 2008.
- Olsen, A. (2006, April 20). Which Team Batting Statistic Predicts Run Production Best? Retrieved June 12, 2008.
- Hunt, N., Tyrrell, S., and Nicholson, J. (2002, March 25). DISCUSS: Regression and Correlation. Retrieved May 10, 2008.
Materials and Equipment
- Computer with Internet access
- Lab notebook
- Graph paper
- Spreadsheet or statistics program, like Microsoft© Excel© (optional)
Experimental Procedure
- To start this science project, gather information about the nucleus size of a variety of animals using the Cell Size Database. You will be using the nucleus area (NA) for the nucleus size data. These are tiny areas, measured in square micrometers (μm²); one square micrometer is a millionth of a millionth of a square meter. The areas are calculated from cell measurements made using microscopes.
- Start by opening the "Amphibians" data set. Copy the category ("Amphibian"), scientific name, and nucleus area information for a minimum of 20 different amphibians into a data table in your lab notebook. You will be able to record the common name and C-value in the next step. Your data table should look similar to the one below.
- If you prefer, you can input the data directly into a spreadsheet program like Microsoft Excel and print a copy for your lab notebook when you are done.
- Important: Make sure to scroll through the full data set and choose amphibians that represent the whole range of measured nucleus areas. You can see they range from 18.10 μm² to 412.71 μm². If you choose from too small a range, you might skew your data and either create a trend where there isn't one, or fail to see a trend that would be there if you looked at the entire range.
- Once you have nucleus area data for at least 20 amphibians, find their C-values in the Animal Genome Size Database.
- From the database homepage, select "Search Data" from the options on the left-hand side of the screen.
- Type the scientific name of the first amphibian on your data table into the "Species:" search box. Leave all other fields as is. Record the animal's common name and C-value in your data table.
- Occasionally, more than one C-value will be listed. This is because different groups of scientists have reported different experimental findings in genome size for the same organism. For this science project, take the average of all the C-values reported and use that average as the animal's C-value.
- The Animal Genome Size Database also lists the common name for all the animals; you can add this to your data chart too, so you have a better idea of what these amphibians are.
- Not all animals listed in the Cell Size Database will be listed in the Animal Genome Size Database. You may not find data for all 20 amphibians on your list. Repeat steps 1 and 2 until you have both nucleus area and C-value data for a minimum of 15 amphibians. This may require a bit of patience!
Data table columns: Scientific Name | Common Name | Category (Amphibian, Bird, etc.) | Nucleus Area (NA) (μm²) | C-value (pg)
- Repeat steps 1-3 for 20 birds and 20 fish (or 15, if you cannot obtain data for 20).
- In the end, you should have a data table with at least 15 amphibians, 15 birds, and 15 fish, all with nucleus area and C-value measurements.
- If you have fewer than 45 animals total, you might have too few data points to determine if there is any correlation between nucleus size and genome size.
- Analyze the data you've gathered by creating a scatterplot. This can be done by hand on graph paper, or on the computer using either a spreadsheet program like Microsoft Excel or a website like Create a Graph. Note: In both Microsoft Excel and Create a Graph, scatterplot is a sub-option found under XY-graphs.
- For each animal, place the C-value data on the x-axis and the nucleus area on the y-axis.
- Draw a best-fit line on your graph. This is a line that best sums up your data.
- If you are drawing this line by hand, try to make it go through the middle of the cloud of data points so that about as many points fall on one side of the line as on the other. Don't worry if there are some extreme outliers that don't fit the best-fit line.
- If you are using Microsoft Excel, you can have the program add a best-fit line. Use the "Help" features in the program to find out how to do this. Note: Microsoft Excel refers to the best-fit line as a "trend line."
- Using the best-fit line, determine what your data tells you about the relationship between nucleus size and genome size. Is there any correlation? If so, is it a negative correlation (i.e., the two measurements move in opposite directions) or a positive correlation (i.e., the two measurements move in the same direction)? If there is a correlation, would you describe it as strong or weak?
- Optional: The correlation between two variables can be described by statistics. Advanced students should calculate the Pearson correlation coefficient, r, for their data. If you need help with this calculation, the Which Team Batting Statistic Predicts Run Production Best? science project has a step-by-step introduction to correlation and linear regression. Note: Excel will calculate r² and the slope of the trend line; you can back-calculate r by taking the square root of r² and giving it the same sign as the slope.
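If you would rather do the calculations from the procedure in code than in a spreadsheet, the following Python sketch shows one possible approach. It averages several reported C-values for one species, computes the Pearson correlation coefficient r, fits a least-squares best-fit line, and recovers r from r² using the sign of the slope. The function names are made up for illustration, and the numbers in the example are invented placeholders, not values from either database; replace them with the data from your own table.

```python
from math import sqrt
from statistics import mean


def average_c_value(reported_values):
    """Average all the C-values (pg) reported for a single species."""
    return mean(reported_values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares line through the points; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_from_r_squared(r_squared, slope):
    """Recover r from r squared: take the root, then copy the slope's sign."""
    r = sqrt(r_squared)
    return -r if slope < 0 else r


# Invented placeholder data, NOT real database values.
# x-axis: C-value (pg); y-axis: nucleus area (square micrometers).
c_values = [
    average_c_value([2.0, 2.4]),  # a species with two reported C-values
    5.1,
    9.8,
    20.3,
]
nucleus_areas = [21.0, 40.5, 66.2, 130.9]

r = pearson_r(c_values, nucleus_areas)
slope, intercept = best_fit_line(c_values, nucleus_areas)
print("r =", round(r, 3))
print("best-fit line: area =", round(slope, 2), "* C-value +", round(intercept, 2))
```

A value of r close to +1 or -1 suggests a strong positive or negative correlation, while a value near 0 suggests little or no linear relationship. Both lists need at least two different values each, or the divisions above will fail.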

Variations
- In the science project above you looked for a correlation between nucleus size and genome size. If you use cell area rather than nucleus area, do you get the same result? What does this tell you about the relationship between cell size and nucleus size? Can you use data from the Cell Size Database to directly test your hypothesis about the relationship between nucleus size and cell size?
- Mammal erythrocytes do not have nuclei, but is there still a relationship between the size of the erythrocytes and the genome size? Mean corpuscular volume (MCV) is a measurement of the volume of a cell. Use the MCV data for mammals, amphibians, and fish from the Cell Size Database to determine if cell size and genome size follow the same trend in mammals as they do in other vertebrates.