Areas of Science: Genetics & Genomics
Time Required: Very Short (≤ 1 day)
Prerequisites: You should have studied Life Science
Material Availability: None required
Cost: Very Low (under $20)
Safety: No issues


This is a project about the "molecular alphabet" of DNA. With just four "letters," it manages to keep track of the plan for an entire person, and keep a complete copy in nearly every cell. This project will help you start learning this new alphabet.


The goal of this project is to learn the basics about DNA sequences by examining some simple differences between groups of genes.



Author: Shelley Force Aldred, Department of Genetics, Stanford University
Sponsor: Molecular Sciences Institute (MSI), Berkeley, California
Management & Editing: Ken Hess, The Kenneth Lafferty Hess Family Charitable Foundation

Cite This Page

General citation information is provided here. Be sure to check the formatting, including capitalization, for the method you are using and update your citation, as needed.

MLA Style

Science Buddies Staff. "Learning Your A, G, C's (and T, too)." Science Buddies, 20 Nov. 2020, Accessed 27 Nov. 2021.

APA Style

Science Buddies Staff. (2020, November 20). Learning Your A, G, C's (and T, too). Retrieved from

Last edit date: 2020-11-20


DNA is double stranded (a double helix) and made up of base PAIRS. Adenine on one strand (represented with an "A") always pairs with thymine (represented with a "T") on the other strand. These are called A~T pairs regardless of which strand has the "A" and which the "T." Similarly, cytosine on one strand (represented with a "C") always pairs with guanine (represented with a "G") on the other strand, creating G~C pairs.

Scientists often represent DNA strands with a string of letters like this:


This string of letters represents only one strand, or one half of the DNA molecule. There is no need to write down the other strand because, as described above, each base has a fixed partner: a "G" in one strand means there is automatically a "C" at the same position in the other strand, a "C" implies a "G", and likewise an "A" implies a "T" and a "T" implies an "A".
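The pairing rule can be sketched in a few lines of Python. This is only an illustration and not part of the original project: the function name `complement` and the example strand are made up for demonstration. The sketch keeps the two strands aligned position by position; in practice the two strands run in opposite directions, so scientists usually write the partner strand in reverse order.

```python
# Base-pairing rules: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner base for each base of `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


# Made-up example strand (not one of the project sequences).
print(complement("GATTACA"))  # prints CTAATGT
```

Applying the function twice returns the original strand, which is why writing down one strand is enough.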

Now think of the human genome and all of the genes in it as a VERY large set of blueprints. Each blueprint is an instruction set for assembling one part or piece of a cell. Almost every cell in your body carries the same set of blueprints -- so what makes a cell in your brain different from a cell in your stomach? A neuron and a stomach lining cell are very different in their morphologies (how they look) and their functions (what job they do). The different shapes and functions are a result of the fact that those two cells use different portions of the complete blueprint set to construct themselves. The neuron uses the blueprints for parts involved in brain signaling, while the stomach cell does not. The stomach cell makes parts for secreting stomach enzymes to help in food digestion, while the brain cell does not. These kinds of blueprints (or genes) are often called "tissue specific" because they are used in some body parts and not in others. However, there are also some blueprints that are used in every cell in the body because the parts they represent are needed in every cell (like pieces used during cell division or making energy). These kinds of blueprints (or genes) are often called "housekeeping" because they represent a basic need of every cell and they "keep up" the basic functions of the cell.

How does the cell know which blueprints to use? Each gene (or blueprint) has its own control panel that acts as a group of switches affecting when (during an organism's development), where (in the body), and how much a particular blueprint is used. Scientists are still working hard to identify all of the pieces of a gene's control panel. One important part of the control panel that we know a lot about is the "promoter." The number and pattern of As, Ts, Gs and Cs in a promoter is important in determining whether the switch will act like a "housekeeping" switch or a "tissue specific" switch. As of today, scientists are just beginning to understand why this is true.

Terms and Concepts

  • DNA, gene
  • Nucleotide bases (adenine, thymine, guanine, cytosine)



Materials and Equipment

  • Computer with Internet connection
  • Lab notebook

Experimental Procedure

In this experiment, you will compare housekeeping promoters to other genes by calculating the percentage of G~C content. This will make sense in a minute!

How to calculate the G~C pair content of a DNA sequence:

  1. Count the total number of G's and C's. Try it for this sample:

    You should get a total count of 15 (9 G's and 6 C's).

  2. Now count the total number of letters (bases). You should get 44.

  3. %G~C pair content = (Count of G's and C's / Total count of all bases)*100

    So, for our sample, the %G~C pair content = (15/44)*100 or 34%
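The same calculation can be sketched in Python. This is a minimal illustration under stated assumptions, not the calculator provided with the project: the function name `gc_content` is hypothetical, and the 8-base test sequence is made up (it is not the sample sequence referred to above).

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of bases in `sequence` that are G or C."""
    # Drop spaces and line breaks that may come along when pasting a sequence.
    seq = "".join(sequence.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc_count = seq.count("G") + seq.count("C")
    return gc_count / len(seq) * 100


# Made-up sequence with 2 G's and 2 C's out of 8 bases: (4/8)*100.
print(gc_content("ATGCGCTA"))  # prints 50.0

# The arithmetic for the counts quoted above (15 G's and C's out of 44 bases).
print(round(15 / 44 * 100))  # prints 34
```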

Now, let's perform the experiment:

Step 1: Formulate your hypothesis. Which do you think might be true?

  1. Housekeeping promoters will have lower G~C content than other genes,
  2. Housekeeping promoters will have higher G~C content than other genes, or
  3. Housekeeping promoters will have similar G~C content to other genes.

Step 2: Calculate the %G~C content for each sequence listed below on this page. We have provided partial DNA sequences for three housekeeping promoters, three tissue specific promoters, and for comparison, a number of additional genes of the sort that the promoters would regulate and control.

Time saver! Do the first couple sequences "by hand," but then you can use the %G~C Content Calculator. To use the calculator:

  1. Copy and paste the DNA sequence you want to analyze into the box.
  2. Press "Calculate" to count the bases and determine the %G~C content.
  3. Record your results.
  4. Press "Clear Form" to clear all the fields, preparing the calculator for its next count.

Step 3: Record your results in a table with the %G~C content in the first column, the name of the sequence in the second column, and the type of gene in the third column. Fill out the table as you do your calculations. It should look like this:

%G~C Content | Sequence Name | Type of Gene
48% | Bone Morphogenetic Protein 5 (BMP5) | Tissue specific promoter
... | ... | ...

Step 4: Sort the rows of the table, so that the highest %G~C content is at the top, the next highest percentage appears second, the third highest percentage appears third, and so forth. When you are done, the very lowest %G~C content should be at the bottom.
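If you keep your table on a computer, the sorting step might look like the following Python sketch. Only the BMP5 row comes from the example table; the other two rows are placeholders with made-up values, included just to show the ordering.

```python
# Each row is (%G~C content, sequence name, type of gene).
# Only the BMP5 row is from the example table; the placeholder rows
# use made-up numbers and are not real results.
rows = [
    (48, "Bone Morphogenetic Protein 5 (BMP5)", "Tissue specific promoter"),
    (60, "Placeholder sequence X", "Placeholder type"),
    (35, "Placeholder sequence Y", "Placeholder type"),
]

# Sort by the first column, highest percentage first.
sorted_rows = sorted(rows, key=lambda row: row[0], reverse=True)

for percent, name, gene_type in sorted_rows:
    print(f"{percent}% | {name} | {gene_type}")
```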

If you need or want to do a graph for your science fair project (it's almost always a good idea to do so), you can also do a bar chart showing the %G~C content for each gene.

Step 5: Draw your conclusion. What can you say about housekeeping promoters? How does their %G~C content compare to tissue specific promoters? How does the %G~C content of the housekeeping promoters compare to other genes of the sort that the promoters would regulate and control (bone morphogenetic protein 7, leptin, opsin, and cystic fibrosis genes)?
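One way to compare groups is to average the %G~C content for each type of gene. The Python sketch below is an assumption about how you might do this, not a required part of the project. The helper name `average_by_type` is hypothetical, and apart from the BMP5 value from the example table, every number is a placeholder that says nothing about the real answer.

```python
from collections import defaultdict
from statistics import mean


def average_by_type(rows):
    """Map each gene type to the mean %G~C content of its rows."""
    groups = defaultdict(list)
    for percent, _name, gene_type in rows:
        groups[gene_type].append(percent)
    return {gene_type: mean(values) for gene_type, values in groups.items()}


# Placeholder data: only the BMP5 value is from the example table.
example_rows = [
    (48, "Bone Morphogenetic Protein 5 (BMP5)", "Tissue specific promoter"),
    (40, "Placeholder A", "Tissue specific promoter"),
    (50, "Placeholder B", "Placeholder group"),
    (70, "Placeholder C", "Placeholder group"),
]

for gene_type, average in average_by_type(example_rows).items():
    print(f"{gene_type}: {average:.1f}%")
```

Replace the placeholder rows with your own results before drawing any conclusion.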

DNA Sequences for Your Experiment

Here are the sequences to use for your experiment. Note that these are partial sequences for the molecules; the full sequence is generally much longer.

Housekeeping Promoters:

  1. Heat Shock Protein 90 (HSP90): When proteins get overheated, their folding and conformation are disrupted, which often affects their function. Heat shock proteins help refold these proteins into their working state.
  2. Glucose-6-phosphate Dehydrogenase (G6PD): This molecule is a member of a team that helps protect each cell from agents that damage important proteins.
  3. Beta-actin (ACTB): Actin proteins help the cell make an internal "skeleton" that maintains the cell's proper shape.

Tissue Specific Promoters:

  1. Bone Morphogenetic Protein 5 (BMP5): Bone morphogenetic proteins help induce the growth of new bone.
  2. Hemoglobin Beta (HBB): Part of hemoglobin, the iron-containing protein that carries oxygen in red blood cells.
  3. GABA Receptor A1 (GABRA1): An important receptor of chemical signals that travel only in the brain.

Bone Morphogenetic Protein 7 Genes:

Here is a partial DNA sequence from humans, pig, rabbit, and sheep for the Bone Morphogenetic Protein 7 gene (BMP7). You will notice that the sequences are not exactly the same. Bone Morphogenetic Proteins represent signals found in the body that help induce bone growth.

  1. Human BMP7
  2. Pig BMP7
  3. Rabbit BMP7
  4. Sheep BMP7

Leptin Genes:

Here is a partial DNA sequence from humans, cow, dog, and horse for Leptin (LEP), a signal found in the body that tells your brain how much fat you have stored away. Leptin may help regulate how hungry you feel. You will notice that the sequences are not exactly the same.

  1. Human Leptin
  2. Cow Leptin
  3. Dog Leptin
  4. Horse Leptin

Other genes:

  1. Here is a partial DNA sequence for the human Cystic Fibrosis gene (CFTR). In the body, this gene's product is involved in making sure mucus doesn't build up in the lungs and that the pancreas secretes the right enzymes to help you digest your food. If this gene is damaged, a patient gets Cystic Fibrosis.
  2. Here is a partial DNA sequence for human Opsin1 (OPS1MW). Opsins are involved in providing color vision in the eye. Changes in the function of an opsin protein can lead to color-blindness.



For the leptin and bone morphogenetic protein 7 genes you have data for different species. How would you describe the % G~C content for the same gene, but in different animals?
