Abstract
Finding a cure for cancer is one of scientists' greatest challenges today. But first, they have to study and understand the disease. In this science project, you will explore the software available on the Cancer Genome Anatomy Project (CGAP) website and use bioinformatics tools to identify genes whose level of expression is higher in cancer tissue.
Objective
The goal of this science project is to use web-based bioinformatics tools to identify genes that are over-expressed in pancreatic cancer tissue. You will use the online tool called SAGE Genie, available at the website for the Cancer Genome Anatomy Project (CGAP). This project will give you an idea of how bioinformatics tools can be used to explore large datasets of biological information.
Introduction
The Cancer Genome Anatomy Project (CGAP), a program of the National Cancer Institute (NCI), studies the molecular changes that occur when a normal cell is transformed into a cancer cell. Cancer is a term for diseases in which abnormal cells divide without control and can invade other tissues. Cancer cells can arise in many different parts of the body. The five deadliest types of cancer in the United States, and the number of deaths each causes per year (estimated for 2008), are listed below:
| Type of Cancer | Number of Deaths per Year in the U.S. |
| --- | --- |
| Lung | 162,000 |
| Colorectal | 50,000 |
| Breast | 40,000 |
| Pancreatic | 34,000 |
| Prostate | 30,000 |
A complete list can be found at the National Cancer Institute's website: Common Cancers.
Cancer researchers are interested in finding biological differences between cancerous tissue and normal tissue because these differences may be "Achilles' heels" that they can exploit to fight the cancer. For example, cancer cells often have patterns of gene expression that differ from their normal cell counterparts. Some genes are expressed at higher levels (over-expression), and others are expressed at lower levels, or not at all.
Genes that are over-expressed in the cancerous tissue are of particular interest because over-expression is a trait you would expect of a gene that is causing the cancer to grow. For example, if a gene codes for a protein that normally causes a cell to divide (HER2, for example), then making more of this protein may lead to uncontrolled growth. On the other hand, gene products that normally control or inhibit growth (PTEN, for example) may be lost as a normal cell becomes a cancer cell. If we compare the cell to a car, over-expression of a growth-stimulating gene is analogous to jamming the accelerator down, whereas loss of an inhibitory gene is analogous to losing the ability to apply the brakes.
Figure 1 shows a theoretical "profile" of gene expression for eight genes in normal and cancerous tissues. It is possible that the over-expressed gene, #8, may be helping the development and growth of the tumor. There are many other criteria that cancer researchers look at to determine the precise role of a gene in cancer, but gene expression is a key factor, and it is one that you can explore without getting your lab coat on!
Figure 1. Theoretical gene expression in normal and cancerous tissues. Comparing the "profile" of normal versus cancerous tissues highlights genes like #8 that deserve further study because of their suspiciously high expression level.
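The profile comparison described for Figure 1 can be sketched in a few lines of Python. The expression values below are made up for illustration (they are not read from the figure or from CGAP), and the three-fold cutoff is an arbitrary assumption, not a threshold used by cancer researchers or by the CGAP tools.

```python
# Hypothetical expression levels (arbitrary units) for eight genes.
# These numbers are invented for illustration only.
normal = {1: 5, 2: 3, 3: 8, 4: 2, 5: 6, 6: 4, 7: 7, 8: 3}
cancer = {1: 5, 2: 4, 3: 7, 4: 2, 5: 6, 6: 1, 7: 7, 8: 30}


def over_expressed(normal_levels, cancer_levels, min_fold=3.0):
    """Return genes whose cancer/normal ratio is at least min_fold.

    min_fold is an assumed cutoff chosen for this example.
    """
    return [
        gene
        for gene in normal_levels
        if cancer_levels[gene] / normal_levels[gene] >= min_fold
    ]


print(over_expressed(normal, cancer))  # prints [8]
```

With these made-up numbers, only gene #8 stands out, which mirrors the kind of "suspicious" gene the figure is meant to highlight.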
In this genomics science fair project, you will use software tools to identify molecular-level differences between normal human tissue and cancerous tissue. Specifically, you will identify genes that are over-expressed in cancerous tissue derived from the pancreas. The pancreas is a gland, about 6 inches long, that lies behind the stomach. Its two main functions are 1) to produce juices that help digest food, and 2) to produce hormones, such as insulin and glucagon, which help control blood-sugar levels. The digestive juices are produced by exocrine pancreas cells and the hormones are produced by endocrine pancreas cells. About 95 percent of pancreatic cancers begin in exocrine cells.
The method used by CGAP to study gene expression is called SAGE, which stands for Serial Analysis of Gene Expression. This method allows researchers to compare the expression of thousands of genes in different tissue types. If you are interested in finding out more about this technology, the CGAP website has great information.
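To get a feel for what SAGE measures, here is a minimal sketch of the counting idea: each transcript is represented by a short sequence "tag," and the number of times a tag appears is used as a measure of how strongly its gene is expressed. The transcript sequences are invented, and the anchor site (`CATG`) and 10-base tag length are assumptions chosen for this example; the actual laboratory protocol has many more steps.

```python
from collections import Counter

ANCHOR = "CATG"   # assumed anchoring site for this sketch
TAG_LENGTH = 10   # assumed tag length in bases


def extract_tag(transcript):
    """Return the tag that follows the last ANCHOR site, or None if there is none."""
    position = transcript.rfind(ANCHOR)
    if position == -1:
        return None
    start = position + len(ANCHOR)
    tag = transcript[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


# Invented transcript sequences; two of them carry the same tag.
transcripts = [
    "TTTCATGGGCTTAACGTAAAA",
    "GGGCATGGGCTTAACGTAAAA",
    "TTTCATGTTGACCTGAAAAAA",
]

tag_counts = Counter()
for sequence in transcripts:
    tag = extract_tag(sequence)
    if tag is not None:
        tag_counts[tag] += 1

for tag, count in tag_counts.items():
    print(tag, count)
```

A tag that shows up many times suggests a highly expressed gene, and a tag that never shows up suggests the gene is silent in that tissue.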
An excellent way to explore gene expression is to use the bioinformatics tools on the CGAP website, which allow you to perform "virtual experiments" on real gene expression data sets. The goal of this project is to become familiar with a set of tools used to explore the cancer genome. Once you have followed the steps outlined below, you can use these tools to independently identify genes that are over-expressed in cancer.
Terms, Concepts and Questions to Start Background Research
Before you perform the bioinformatics search using the tools at the CGAP website, make sure you have a good understanding of these terms and concepts:
Bibliography
Materials and Equipment
Experimental Procedure
So what exactly did you just set up? The program is now set to compare gene expression in normal pancreatic tissue (Pool A) to gene expression in cancerous pancreatic tissue (Pool B). You will be looking for genes that are over-expressed in cancerous tissue, since they might be involved in advancing the cancer disease process. You want to compare normal and cancerous tissue of the same type (in this case, the pancreas) so that you do not pick up genes whose expression differs simply because the tissues come from different organs.
(Under-expressed genes can also provide very important clues about what is happening in cancer cells, but for the purposes of this project, you'll focus on over-expression.)
In the next step, the software will identify genes that are differentially expressed in these two pools.
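As a rough illustration of what comparing two pools involves, the sketch below adds up tag counts from several libraries into one pool and then rescales the counts so that pools of different sizes can be compared fairly. The tag names, the counts, and the choice to scale to 200,000 total tags are all assumptions made for this example; this is not a description of how the CGAP software works internally.

```python
from collections import Counter

SCALE = 200_000  # assumed common total used for rescaling in this sketch


def pool_libraries(libraries):
    """Add up the tag counts from several libraries into one pool."""
    pooled = Counter()
    for library in libraries:
        pooled.update(library)
    return pooled


def normalize(pooled):
    """Rescale counts so that the pool's total equals SCALE."""
    total = sum(pooled.values())
    return {tag: count * SCALE / total for tag, count in pooled.items()}


# Two hypothetical cancer libraries with invented tag counts.
cancer_library_1 = {"TAG_A": 30, "TAG_B": 70}
cancer_library_2 = {"TAG_A": 50, "TAG_B": 50}

pool_b = pool_libraries([cancer_library_1, cancer_library_2])
print(normalize(pool_b))
```

Rescaling matters because a tag seen 100 times in a small library represents a larger share of expression than the same count in a library ten times bigger.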
Take a look at the table of genes that you have generated.
Here is one entry, for the CEL gene, where B is not greater than A:
Table 1. CGAP data for the CEL gene. (CGAP website, n.d.)
CEL is expressed in the normal tissue library, but not in either of the cancer tissue libraries. The tag for this gene is found 399 times in the normal tissue library, and not at all in the cancer Pool B. This suggests that the gene is shut off when the tissue becomes cancerous.
But we want genes that are over-expressed in the cancer Pool B. Here is an example of a gene with high cancer-associated gene expression:
Table 2. CGAP data for the CLASP2 gene. (CGAP website, n.d.)
The CLASP2 gene is found in both cancer libraries, but not in the normal library. Its tag was found 3619 times in the pooled cancer libraries (Pool B). This looks like a gene that is strongly over-expressed in pancreatic cancer.
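The two examples above can be compared with a small calculation. The sketch below uses the tag counts quoted in the text (399 and 0 for CEL, 0 and 3619 for CLASP2) and computes a log2 ratio of Pool B to Pool A. Adding a pseudocount of 1 to avoid dividing by zero is an assumption of this sketch, and the raw counts are used without adjusting for pool size, so treat the result as a rough indicator of direction only.

```python
import math

# Tag counts quoted in the text: (normal Pool A, cancer Pool B).
counts = {"CEL": (399, 0), "CLASP2": (0, 3619)}


def log2_ratio(pool_a, pool_b, pseudocount=1.0):
    """Log2 of (B + pseudocount) / (A + pseudocount); the pseudocount is an assumption."""
    return math.log2((pool_b + pseudocount) / (pool_a + pseudocount))


def direction(pool_a, pool_b):
    """Describe which pool shows more tags for a gene."""
    ratio = log2_ratio(pool_a, pool_b)
    if ratio > 0:
        return "higher in cancer"
    if ratio < 0:
        return "higher in normal"
    return "no difference"


for gene, (pool_a, pool_b) in counts.items():
    print(gene, round(log2_ratio(pool_a, pool_b), 2), direction(pool_a, pool_b))
```

A positive ratio means more tags in the cancer pool and a negative ratio means more tags in the normal pool, so CLASP2 comes out positive and CEL comes out negative.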
Figure 2. CLASP2 SAGE data. (CGAP website, n.d.)
Figure 3. CLASP2 is over-expressed in pancreatic cancer. Blue indicates average expression level, red indicates high expression level. (CGAP website, n.d.)
This link brings up the Gene Info page. This page has a wealth of information about the CLASP2 gene. Imagine how long it would take you to find this information by yourself! There has been tremendous progress in the last few years annotating all of the human genes and proteins.
This final page gives a visual look at the expression level of CLASP2 in normal and cancerous tissues. As you scroll down, note that CLASP2 is over-expressed in pancreatic cancerous tissue, compared to normal pancreatic tissue.
Figure 4. Graphical image of CLASP2 gene expression. Note that CLASP2 expression is very high in pancreatic cancerous tissue, but not in normal pancreatic tissue. (CGAP website, n.d.)
Variations
Credits
David Whyte, PhD, Science Buddies
Last edit date: 2008-04-30 22:00:00
If you like this project, you might enjoy exploring careers in Genetics & Genomics.
Genetic Counselor: Many decisions regarding a person's health depend on knowing the patient's genetic risk of having a disease. Genetic counselors help assess those risks, explain them to patients, and counsel individuals and families about their options.
Database Administrator: Databases are collections of similar records, like the products a company sells, information on all people with a driver's license for a state, or the medical records in a hospital. Database administrators have the important job of figuring out how to organize, access, store, search, cross-reference, and protect all those records. Their services are needed by law enforcement, government agencies, and every type of business imaginable. Management of large databases is also critical for scientific research, including understanding and developing cures for diseases.