Re: BLASTing Flu Viruses

Posted: Sat Oct 03, 2009 2:27 pm
by swimmy

[Attachment: a .docx file that is no longer available.]

this is how my graph looks as of now...
I don't think it is the correct type of graph, though...

Re: BLASTing Flu Viruses

Posted: Sat Oct 03, 2009 2:31 pm
by swimmy
and my whole science fair writeup (excluding graph and title):
What is influenza?
Influenza, or the flu for short, is a contagious respiratory infection caused by a tiny virus that spreads through an infected person’s body. The flu causes fever, headache, sore throat, body aches, sneezing, coughing, sensitivity to light, and fatigue. The infection attacks the nasal passages and lungs. Flu season runs from late fall to early spring; the term “flu season” refers to the period of time in which flu outbreaks are common. Contrary to popular belief, the flu is not just a worse version of the common cold. The common cold doesn’t usually lead to the complications associated with the flu, such as pneumonia. The two illnesses are caused by unrelated viruses. The flu virus looks like a ball with two types of spikes coming out of it.

What are some of the characteristics of different types of flu?
There are three main types of the flu: types A, B, and C. Type A is the most serious because it can infect both humans and animals, and it is usually the type that causes global pandemics. Type B is also a major concern, as it mostly affects children and causes them to be absent from school. It has also become resistant to major antiviral medications in Asia. However, Type B flu is found only in humans. Type C is less severe than Types A and B. It can cause outbreaks, though they are less serious than those caused by the other types. Type C is the most like the common cold, and protection against it is not included in the annual vaccine because it is so mild compared with the other two.

How are strains of the flu named?
First, strains of the flu are sorted by type: A, B, or C. Next, the name gives the place where the strain was first isolated. Then comes a number identifying the isolate from that year and location. After that, the name gives the year the strain was isolated. Finally, the name lists the specific surface proteins (hemagglutinin and neuraminidase) in parentheses. For example, A/California/07/2009 (H1N1) is a type A virus first isolated in California, isolate number 07, from 2009, with H1 and N1 surface proteins. There are 15 hemagglutinin proteins, but only 3 occur in human influenza. In human influenza, there are 3 neuraminidase proteins out of the 9 total. All hemagglutinin and neuraminidase proteins are associated with avian flu.
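
As a side illustration of the naming scheme (not part of the original writeup), here is a minimal Python sketch that splits a strain name into its parts. The names `FluStrain` and `parse_strain` are made up for this example, and the pattern is an assumption that only covers simple names like the ones used in this thread; names with an extra host field (for example, animal strains) would be rejected.

```python
import re
from typing import NamedTuple, Optional


class FluStrain(NamedTuple):
    virus_type: str         # A, B, or C
    location: str           # where the strain was first isolated
    isolate_number: str     # number identifying the isolate
    year: str               # year of isolation (2 or 4 digits, as written)
    subtype: Optional[str]  # e.g. "H1N1"; None when the name omits it


# Hypothetical pattern for names such as "A/California/07/2009 (H1N1)".
_STRAIN_PATTERN = re.compile(
    r"^(?P<type>[ABC])/(?P<location>[^/]+)/(?P<number>\d+)/(?P<year>\d{4}|\d{2})"
    r"(?:\s*\((?P<subtype>H\d+N\d+)\))?$"
)


def parse_strain(name: str) -> FluStrain:
    """Split a strain name into type, location, isolate number, year, subtype."""
    match = _STRAIN_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"Unrecognized strain name: {name!r}")
    return FluStrain(
        match["type"],
        match["location"],
        match["number"],
        match["year"],
        match["subtype"],
    )


print(parse_strain("A/California/07/2009 (H1N1)"))
print(parse_strain("A/New Caledonia/20/99"))
```

The year is kept as text because some names in this thread use two digits (99) and others use four (2006).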


Why is the swine flu such a big deal?
Flu viruses are coated with proteins that attach to receptor molecules on the outer surfaces of our cells. Once these molecules bind, they help the virus force our cells to make viral proteins and genes. In the flu, the proteins that do this are called hemagglutinin (HA) and neuraminidase (NA). They are abbreviated H and N, followed by a number; the numbers distinguish versions that trigger different antibody responses. For this reason, they’re also called antigens. Antigens help scientists figure out how flu strains will affect humans or animals.
Some flu strains can infect two different animal hosts, one of which is closer to humans. This in-between host has receptors for flu strains from both birds and humans. For the swine flu, the host closer to humans was swine, and the other host was avian. Pig cells are lined with receptors that can let in both human and avian strains of flu. If, by chance, a pig were infected with bird and human flu at the same time, and both viruses were present in one of its cells, the viruses could easily mix genes inside the cell to create new flu strains able to infect humans and birds. Once the avian flu passed to swine, it could easily be transmitted to humans and reach pandemic levels.

What is a mutation?
A mutation is a change in a gene’s normal sequence. Mutations are caused by environmental factors or by mistakes made while the gene is being copied. Mutations in surface protein genes allow viruses to remain infectious year after year.

How are vaccines made?
The process by which vaccines are made is long and requires an enormous number of fertilized chicken eggs. (This process has gone through only a few changes since its development in the 1940s, although health officials complain about how time-consuming it is.) Currently, it takes 6 months and more than 100 million hen eggs to create seasonal flu vaccines. On top of this, viral surveillance teams need time to figure out which viruses the vaccine should protect against.

First, viral samples are collected from people who are sick with the flu in various parts of the world. The World Health Organization (WHO) decides which viruses will go in the vaccine. While this is happening, vaccine manufacturers buy millions of eggs for the influenza vaccine. The Food and Drug Administration (FDA) prepares the collected viral samples for use in vaccines. The viral preparation goes into the fluid around the chick embryo eleven days after the egg has been fertilized. The embryos are infected and then incubated for a week. After that, the viral material is deactivated and purified.

What is the objective of this experiment?
The objective of this experiment is to see whether flu viruses from different seasons mutate in the same regions of a protein. If a single protein is mutating between strains, then that protein may be making the virus more contagious and therefore possibly more resistant to vaccines. If a set of proteins is mutating simultaneously between strains, then the information gathered will not be as clear.

1) Go to the Flu Activity & Surveillance page at The U.S. Centers for Disease Control and Prevention (CDC) website: www.cdc.gov/flu/weekly/fluactivity.htm.

2) Click on "Go" next to the year 2008-2009 “Current Weekly Influenza Report”.

3) Read the information on the page. In particular, find the paragraph with the heading "Antigenic Characterization." It will tell you the names of the virus strains prevalent that year.

4) You can obtain the sequences for these strains from the NCBI GenBank website: www.ncbi.nlm.nih.gov/Genbank/.

5) Select "Protein" with the “Search” drop-down menu. Type the name of the flu strain (use A/California/07/2009 (H1N1)) in the search box.

6) Full-length HA is 566 amino acids. In order to retrieve full-length entries, add this text to the search box: AND 566[SLEN]. Click "Go."

7) Click on the active link for the HA protein page.

8) Copy the accession number for the full-length HA protein. Click on the "BLAST Sequence" link (column on right side of page).

9) The BLAST page will pop up. The database should be set automatically to "non-redundant protein" and the algorithm should be "blastp." Click the BLAST button.

10) The BLAST report contains alignments of your search sequence against sequences in the database. The regions of the protein that are most likely to change show up as blank spaces.

11) Repeat Steps 1-10 with past flu seasons (2001-2008) and compare. For each season, use the strain that the greatest number of characterized viruses were related to. It is found in the section called “Antigenic Characterization.”
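
Since Step 11 repeats the same search for several strains, the search-box text from Steps 5 and 6 can be generated instead of typed each time. This is only a small sketch: `build_protein_query` is a hypothetical helper name, and it does nothing more than join the strain name with the `AND 566[SLEN]` length filter described above. It does not contact GenBank.

```python
def build_protein_query(strain: str, length: int = 566) -> str:
    """Return the search-box text: strain name plus a sequence-length filter."""
    return f"{strain} AND {length}[SLEN]"


# The strain named in Step 5; add other seasons' strains to this list.
strains = ["A/California/07/2009 (H1N1)"]

for strain in strains:
    print(build_protein_query(strain))
# prints: A/California/07/2009 (H1N1) AND 566[SLEN]
```

The printed line can be pasted into the search box as-is.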

-computer with Internet access

My hypothesis is that flu viruses from different seasons do not mutate in the same regions of protein because they are from different strains.

BLAST (Basic Local Alignment Search Tool) is the online tool used in this experiment. It can be found at: www.ncbi.nlm.nih.gov/BLAST. BLAST is a set of programs used to find alignments between a nucleotide or protein sequence and the nucleotide or protein sequences within a database. I used protein BLAST (blastp) because I had amino acid sequences. There is no sequence in the query box because I used accession numbers. Accession numbers are identifiers given to a sequence so that the publishers can track the different versions of it. After you click BLAST, you get a list of alignments. The crosses (plus signs) and spaces in the middle line of each alignment mark changes. The crosses are not counted as mutations because these changes are conservative, meaning one amino acid was swapped for another with similar properties, so the protein is likely to keep working the same way.
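
Counting the crosses and spaces by eye gets tedious, so here is a minimal Python sketch of how one block of an alignment could be tallied. It assumes the three lines of the block (query, middle line, subject) have already been trimmed to the aligned letters only, with the labels and position numbers removed. The function name `summarize_alignment` and the short sequences are made up for illustration; they are not real flu data.

```python
def summarize_alignment(query: str, midline: str, subject: str) -> dict:
    """Tally one alignment block using the middle line's symbols.

    letter -> identical residue, '+' -> conservative change,
    ' '    -> non-conservative change, '-' in either sequence -> gap.
    """
    if not (len(query) == len(midline) == len(subject)):
        raise ValueError("All three lines must have the same length")

    identities = conservative = mismatches = gaps = 0
    for q, m, s in zip(query, midline, subject):
        if q == "-" or s == "-":
            gaps += 1
        elif m == "+":
            conservative += 1
        elif m == " ":
            mismatches += 1
        else:
            identities += 1

    return {
        "identities": identities,
        "conservative": conservative,
        "mismatches": mismatches,
        "gaps": gaps,
        # The "Positives" figure counts identical plus conservative positions.
        "positives": identities + conservative,
    }


# Toy example (invented sequences, 9 aligned positions).
result = summarize_alignment(
    query="MKAILVV-T",
    midline="MKA L+V T",
    subject="MKAKLLVST",
)
print(result)
```

In this toy block, the spaces that are not gaps are the positions that would be counted as mutations in the writeup above.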

From my results, my hypothesis was incorrect. There were two years where the protein had mutated in the same region. However, this was because the same strain of influenza was prevalent in both years. My explanation of when the mutated regions of the protein would be the same was correct, though.

Please answer the questions from October 2 to current.

Re: BLASTing Flu Viruses

Posted: Sat Oct 03, 2009 4:02 pm
by swimmy
Also, would it be more controlled if I did just H1 subtype flu?
because that's what I did...

Re: BLASTing Flu Viruses

Posted: Mon Oct 12, 2009 3:11 pm
by MichaelD
swimmy wrote:Also, would it be more controlled if I did just H1 subtype flu?
because that's what I did...
Hi Swimmy, sorry for the late reply. I think it would be easier to present the data if you only concentrated on one subtype of the flu, so I think you are OK there. As for the graph, I think you are on the right track, and it may be worth considering multiple graphs, e.g. one for each protein you are looking at.

Mike

Re: BLASTing Flu Viruses

Posted: Sun Dec 27, 2009 9:01 pm
by swimmy
In the BLAST results page, there are three lines: Subject, one that doesn't say anything, and Query. What does each of these lines mean?
Also, how come A/Solomon Islands/3/2006 doesn't have a 566-letter sequence?

Score = 957 bits (2475), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 451/566 (79%), Positives = 498/566 (87%), Gaps = 1/566 (0%)

What does each of these mean?

Re: BLASTing Flu Viruses

Posted: Mon Dec 28, 2009 12:07 am
by swimmy
On the CDC page, it says that viruses "are tested". Does this mean the number of flu cases or just viruses obtained?
I want to find how many people got infected with the flu.

Re: BLASTing Flu Viruses

Posted: Mon Dec 28, 2009 12:30 am
by swimmy
How's this for my county science fair?

Question: Does sequence alignment really show how effective a flu vaccine is?

Hypothesis: My hypothesis is that sequence alignment really shows how effective a flu vaccine is.

Methods/Procedures:
1) Go to the Flu Activity & Surveillance page at The U.S. Centers for Disease Control and Prevention (CDC) website: http://www.cdc.gov/flu/weekly/fluactivity.htm.

2) Click on "Go" next to the year 2009-2010 “Current Weekly Influenza Report”.

3) Read the information on the page. Go to the section called "Antigenic Characterization". It will tell you how many cases of different flu happened that year.

4) Go to the Influenza Virus Vaccine Composition and Lot Release at the US Food and Drug Administration (FDA) website: http://www.fda.gov/BiologicsBloodVaccin ... 062928.htm

5) Click on the link for "Influenza Virus Vaccine for the 2009-2010". It will tell you what strains were used in the vaccine for that year.

6) You can obtain the sequences for these strains from the NCBI GenBank website: http://www.ncbi.nlm.nih.gov/Genbank/.

7) Select "Protein" with the “Search” drop-down menu. Type in the name of the flu strain (use A/California/07/2009) in the search box.
Full-length HA is 566 amino acids. In order to retrieve full-length entries, add this text to the search box: AND 566[SLEN]. SLEN means sequence length. Click "Go."

7) Click on the active link for the HA protein page.

8) Copy the accession number for the full-length HA protein. Click on the "BLAST Sequence" link (column on right side of page).

9) The BLAST page will pop up. The database should be set automatically to "non-redundant protein" and the algorithm should be "blastp." In the "Entrez" query, type 2009[WORD]. Click the BLAST button.

10) The BLAST report contains alignments of your search sequence against sequences in the database. Look at the scores to see what percent similarity the flu vaccine strain has to the strains that emerged that year.

11) Repeat Steps 1-10 with past flu seasons (2001-2008) and compare. Use these strains: A/Brisbane/59/2007, A/Solomon Islands/3/2006, and A/New Caledonia/20/99. There are only three others because strains that were prevalent were common during a few years.

For the procedure, I think I'm missing some steps, but I can't seem to figure out what they are.
Basically, I'm comparing flu strains from one year with the vaccine from that year, and then seeing if more people get the flu when the vaccines are not that similar.
Should I use a separate flu virus sequence, or just see how similar the vaccine is with the database for that year?

Re: BLASTing Flu Viruses

Posted: Mon Dec 28, 2009 5:29 pm
by swimmy
If I do this should I do BLAST searches for all three vaccine strains (since every year they have 3)?

Re: BLASTing Flu Viruses

Posted: Mon Dec 28, 2009 5:33 pm
by burtonboi
I am so sorry to do this. But I cannot figure out how to post a new topic in the sci fair forum. So sorry, but I really need this help.

Re: BLASTing Flu Viruses

Posted: Mon Dec 28, 2009 5:59 pm
by swimmy
burtonboi wrote:I am so sorry to do this. But I cannot figure out how to post a new topic in the sci fair forum. So sorry, but I really need this help.
Go to "active forums" (I don't know what grade/topic you are doing), and then there's a button for a new topic. Then you can post your topic.

Re: BLASTing Flu Viruses

Posted: Tue Dec 29, 2009 3:57 am
by MelissaB
Swimmy,

I think the procedure looks very good--until you get to steps 10 and 11. I assume you are getting the strains that emerged that year from step 3? Do those strains automatically pop up when you do the search in step 10? If not, you might want to add a little bit more detail here.

I don't understand why you say, "Use these strains: A/Brisbane/59/2007, A/Solomon Islands/3/2006, and A/New Caledonia/20/99". Use them as what--vaccine strains, or strains that were prevalent? I don't think they were all the most prevalent three strains from 2001-2008.

You then might want to have a step where you specifically compare sequence differences to flu prevalence at the end, after step 11.

I would stick with the 'A' strains--as I recall, the 'B' strains both don't change much from year to year and also don't cause much flu relative to the 'A' strains. But I would do both 'A's in the vaccine, though.

I have helped burtonboi elsewhere, but thank you for replying to him :).

Good luck--this is a great project!

Re: BLASTing Flu Viruses

Posted: Tue Dec 29, 2009 10:32 am
by swimmy
Sorry, step 11 should be like this:

Repeat Steps 1-10 with past flu seasons (2001-2008) and compare. Use these strains (the vaccines): A/Brisbane/59/2007, A/Solomon Islands/3/2006, and A/New Caledonia/20/99. There are only three others because some years they used the same strains for the vaccine.

Do you know what site the flu cases are on? Is it the CDC flu seasons pages?

Re: BLASTing Flu Viruses

Posted: Tue Dec 29, 2009 10:52 am
by MelissaB
It's under the 'antigenic characterization' page--that's why I thought you were going to that page in step 3.

Re: BLASTing Flu Viruses

Posted: Tue Dec 29, 2009 11:51 am
by swimmy
Is it like this?
For example, 2008-2009
CDC has antigenically characterized 2,209 seasonal human influenza viruses [1,192 influenza A (H1), 276 influenza A (H3) and 714 influenza B viruses] collected by U.S. laboratories since October 1, 2008, and 822 2009 influenza A (H1N1) viruses.

Re: BLASTing Flu Viruses

Posted: Wed Dec 30, 2009 1:42 am
by MelissaB
No, here: "Of the 407 influenza A (H1N1) viruses, 270 (66%) were characterized as antigenically similar to A/Solomon Islands/3/2006, the influenza A (H1N1) component of the 2007--08 Northern Hemisphere influenza vaccine. One hundred sixteen (29%) viruses were characterized as A/Brisbane/59/2007-like. Of the 404 influenza A (H3N2) viruses, 91 (23%) were characterized as similar to A/Wisconsin/67/2005, the influenza A (H3) component of the 2007--08 Northern Hemisphere influenza vaccine. Two hundred forty-three (60%) viruses were characterized as A/Brisbane/10/2007-like."
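
The percentages in that passage can be reproduced from the counts, which is handy if you want to do the same calculation for other seasons. This is just a sketch of the arithmetic: the helper name `strain_percentages` is made up, and the numbers are the ones quoted above.

```python
def strain_percentages(total: int, groups: dict) -> dict:
    """Return each group's share of the total as a whole-number percent."""
    return {name: round(100 * count / total) for name, count in groups.items()}


# Counts taken from the passage quoted above.
h1n1 = strain_percentages(407, {
    "A/Solomon Islands/3/2006-like": 270,
    "A/Brisbane/59/2007-like": 116,
})
h3n2 = strain_percentages(404, {
    "A/Wisconsin/67/2005-like": 91,
    "A/Brisbane/10/2007-like": 243,
})

print(h1n1)  # {'A/Solomon Islands/3/2006-like': 66, 'A/Brisbane/59/2007-like': 29}
print(h3n2)  # {'A/Wisconsin/67/2005-like': 23, 'A/Brisbane/10/2007-like': 60}
```

The share that matches the vaccine strain (66% for H1N1, 23% for H3N2 in this passage) is the number to line up against the sequence similarity from BLAST.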

Re: BLASTing Flu Viruses

Posted: Wed Dec 30, 2009 11:32 am
by swimmy
Thank you so much!

Re: BLASTing Flu Viruses

Posted: Wed Dec 30, 2009 1:31 pm
by InfluenzaScienceFair
Hello,

I have had similar problems with this project. Many of the prior posts have helped me a lot, but I still am kind of confused. My question was "Based on a certain year’s influenza outbreaks, how effective are the vaccines for the following year?" Is that reasonable?

Re: BLASTing Flu Viruses

Posted: Thu Dec 31, 2009 2:15 am
by MelissaB
Hi,

I'm not sure what you mean by, 'based on a certain year's influenza outbreaks'. What about those outbreaks will you measure? Strength of the flu season? Viral sequences? If you can explain it a bit more, I might be able to help you.

Re: BLASTing Flu Viruses

Posted: Sat Jan 02, 2010 1:12 pm
by swimmy
In BLAST, is there any way to limit the query to America?
The CDC has information about the cases in America, but the BLAST has strains from around the world...
Also, is there anywhere that says how many vaccines were administered?

Re: BLASTing Flu Viruses

Posted: Sun Jan 03, 2010 12:35 pm
by InfluenzaScienceFair
I was thinking of the viral sequences, yes. Since the vaccine for the following year is composed of the most prominent sequences from the prior year's outbreaks, I wanted to find the percentage of effectiveness in the vaccine. I don't really know if that makes sense, but it was just a thought.

Re: BLASTing Flu Viruses

Posted: Sun Jan 03, 2010 2:13 pm
by MelissaB
Swimmy,

I know of no way to limit the search to America--but the good news is that the flu is pretty pandemic, so the sequences you get are likely to have pretty much been around the world. As for the doses, I couldn't find exact doses, but I found percent coverage of different groups (those 65 and older, those 18-41 with certain risk conditions, etc.) here: http://www.cdc.gov/flu/mmwrref.htm .

ISF,

That's certainly doable, but you'll find that it varies widely from year to year: some years the last season's sequences work very well, and some (like this year!) they work very poorly. I would thus suggest having another variable, or considering a different question.

Re: BLASTing Flu Viruses

Posted: Sun Jan 03, 2010 2:40 pm
by InfluenzaScienceFair
All right, thanks! So if I do choose to do that question, how would I be able to find the percentage of effectiveness?

Re: BLASTing Flu Viruses

Posted: Fri Feb 12, 2010 10:38 am
by swimmy
I'm trying to see how similar the sequences are.
Do I use Identities, Positives or Gaps?
I found this in the BLAST tutorial:
Similarity
The extent to which nucleotide or protein sequences are related. The extent of similarity between two sequences can be based on percent sequence identity and/or conservation. In BLAST similarity refers to a positive matrix score.
I don't really understand. I think I use the positives to find the similarity.
Is this right?

Re: BLASTing Flu Viruses

Posted: Sun Feb 14, 2010 2:50 am
by MelissaB
Hi guys,

I'm so sorry that no one has answered your questions yet! ISF, I hope that you have figured out that I talked about how to measure effectiveness earlier in the thread? If not, let me know and I'll help you out with that.

Swimmy, I'm no expert in BLAST, but I just looked it over and I think you should use identities. That seems to be the number of amino acids the two sequences have in common divided by the total number of positions in the alignment. The percentage you get after is the percentage similarity.

Re: BLASTing Flu Viruses

Posted: Mon Feb 15, 2010 5:11 pm
by MichaelD
swimmy wrote:I'm trying to see how similar the sequences are.
Do I use Identities, Positives or Gaps?
I found this in the BLAST tutorial:
Similarity
The extent to which nucleotide or protein sequences are related. The extent of similarity between two sequences can be based on percent sequence identity and/or conservation. In BLAST similarity refers to a positive matrix score.
I don't really understand. I think I use the positives to find the similarity.
Is this right?
Yes, that is correct. The positives count the exact matches plus the residues that are similar in properties (acidic, basic, hydrophobic, etc.). Identities count exact matches only. Percent positives will be a good measure of similarity between two sequences.

Mike
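
To pull the Identities, Positives, and Gaps numbers out of a result line without retyping them, something like the following Python sketch could work. It uses the result line posted earlier in this thread as input. The function name `parse_blast_stats` is made up, and the pattern assumes the line is laid out exactly like that one.

```python
import re

# Matches pieces such as "Identities = 451/566".
_STAT_PATTERN = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)")


def parse_blast_stats(line: str) -> dict:
    """Return {'identities': (count, total), 'positives': ..., 'gaps': ...}."""
    return {
        name.lower(): (int(count), int(total))
        for name, count, total in _STAT_PATTERN.findall(line)
    }


line = "Identities = 451/566 (79%), Positives = 498/566 (87%), Gaps = 1/566 (0%)"
stats = parse_blast_stats(line)

for name, (count, total) in stats.items():
    # Whole-number division gives the same percentages shown in the line.
    print(f"{name}: {count}/{total} = {100 * count // total}%")

# Positions that are similar but not identical (conservative changes).
conservative = stats["positives"][0] - stats["identities"][0]
print("conservative changes:", conservative)  # 47
```

Whichever measure you pick (identities or positives), use the same one for every season so the comparison stays fair.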