BLASTing the flu virus

Posted: Fri Mar 09, 2012 7:10 am
by Powfam06
Hi,
I'm doing BLAST and have gotten my protein sequences and blasted them, but now I'm struggling to know what to do next with all of the information that comes up.
My question is whether the influenza vaccine is effective for the year it was made. I went to the CDC site, and originally I was going to find the most common virus strain for each of those years and look for it among the sequences BLAST returned as matches to the one I blasted. However, CDC said that there were 617 different influenza virus strains found in one year!! Do you have any suggestions for how I can find the similarities between the vaccine viruses I BLAST and the ones circulating in a certain year?
Thank you! Rachel

Re: BLASTing the flu virus

Posted: Sat Mar 10, 2012 3:32 pm
by deleted-71958
Hi Rachel!

I'm so happy that you're working with genetics! A large chunk of my research project is based on analyzing sequences from NCBI's database (among others) as well, so you're in luck!

I would suggest creating a phylogenetic tree with your BLAST results and your initial input sequence to locate the sequences that are most similar to yours. They should "pop out" as a distinct clade formed around your seeding sequence. Here are the steps to do so:

1. Go to NCBI's COBALT alignment tool (http://www.ncbi.nlm.nih.gov/tools/cobal ... gi?CMD=Web) and copy & paste in your BLAST sequences. If you got these 600 or so sequences from NCBI's database, they should already be in FASTA format. Just click "download" on the output page, copy them into a plain text file, and upload them to COBALT. Press "align".

2a. Click the tab that says "phylogenetic tree" and it will automatically create a tree for you. An algorithm calculates the degree of "similarity" between all of your sequences and "sorts" them into separate groups, which we call branches. Within these branches are clades, which are smaller groups of even more genetically similar sequences. The purpose is for you to find which sequences are similar to the one you initially found, right? Locating which clade your seeding sequence is in and looking at which other sequences populate that clade will tell you that. I would recommend skimming the wiki page for phylogenetic trees for a more detailed explanation of the method and of how to read the branches & clades.

2b. download this tree in "newick" format.

3. Go to iTOL (Interactive Tree of Life), and upload the tree from the newick file. This tool helps you visualize the tree in some pretty cool ways. You can even color code the clades according to sequence variability!
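
To make the "degree of similarity" idea from step 2a a bit more concrete, here is a minimal Python sketch. It is only an illustration, not necessarily what COBALT does internally: it assumes the sequences are already aligned, uses a simple "fraction of differing positions" distance, and runs on made-up toy sequences. The names `p_distance`, `rank_by_similarity`, and `toy_alignment` are hypothetical.

```python
def p_distance(a, b):
    """Fraction of aligned positions where two sequences differ.

    Positions where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    # If nothing could be compared, treat the pair as maximally distant.
    return differing / compared if compared else 1.0


def rank_by_similarity(seed_name, aligned):
    """Return (name, distance) pairs, most similar to the seed first."""
    seed = aligned[seed_name]
    others = [(name, p_distance(seed, seq))
              for name, seq in aligned.items() if name != seed_name]
    return sorted(others, key=lambda pair: pair[1])


# Made-up toy alignment for illustration only, NOT real influenza data.
toy_alignment = {
    "vaccine_strain": "MKTIIALSYI",
    "isolate_A":      "MKTIIALSYI",
    "isolate_B":      "MKTIVALSYL",
    "isolate_C":      "MNT-VAFSYL",
}

ranking = rank_by_similarity("vaccine_strain", toy_alignment)
for name, dist in ranking:
    print(f"{name}: {dist:.2f}")
```

The sequences at the top of the ranking are the ones you would expect to land in the same clade as the seeding sequence. A real tree-building method uses more careful distance models, so treat this as a rough sanity check.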

When all this is done (and it shouldn't take more than 30 minutes), you should be able to pick out your sequence homologues. If you need help analyzing the tree, or get confused about the directions, don't hesitate to post your questions or PM me! I'd love to see your tree when you're done!

Good luck!!
KT

Re: BLASTing the flu virus

Posted: Mon Mar 12, 2012 4:48 pm
by Powfam06
Hi,
OK, so we got to the site, but I don't understand the download part of it. We BLAST, then on the page with all the matches we click download and choices pop up. We've tried them, but none of them will work in COBALT. We're struggling. HELP!!! Rachel

Re: BLASTing the flu virus

Posted: Thu May 10, 2012 4:11 pm
by deleted-71958
Hi Rachel!

I'm so sorry about the delayed response: I've been out for a while due to some personal experiences.

To answer your question/provide a guide for all future SBers looking at this thread:
1. HOW TO DOWNLOAD BLAST RESULTS AS A PLAIN TEXT/FASTA FILE: After your BLAST query finishes, you should see the results page [screenshot not available]. (Here I have done a protein BLAST; this works just like a nucleotide BLAST, except that the query is an amino acid sequence instead of nucleotides and I selected "Protein BLAST".) Scroll down to the very bottom of the page, and check the box that says "Select All". Click "Get Selected Sequences", and it will take you to a new window [screenshot not available]. Go up to "Display Formats" at the top left-hand corner and select "FASTA (text)" and however many sequences you'd like to display in one window. Copy and paste the sequences into a plain text file (TextEdit in plain text mode on Mac, or Notepad on PC). Save, and you're done! Now you are free to modify the sequences however you'd like, or run them through multiple-alignment software like NCBI's COBALT or EBI's MUSCLE.
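
If you'd like to check the saved file before uploading it anywhere, a few lines of Python can read FASTA text into a dictionary. This is just a minimal sketch: `parse_fasta` is a hypothetical helper name, and the two records in `example_text` are made up for illustration.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; store the previous one first.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Made-up records for illustration only.
example_text = """>seq1 hypothetical record one
MKTIIALSYI
FCLVFA
>seq2 hypothetical record two
MKTIVALSYL
"""

records = parse_fasta(example_text)
for header, seq in records.items():
    print(header, len(seq))
```

To use it on your own file, read the file's contents into a string and pass that to `parse_fasta`. If the number of records doesn't match the number of sequences you selected, something probably went wrong in the copy and paste.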

2a. HOW TO MAKE A MULTIPLE SEQUENCE ALIGNMENT USING NCBI COBALT: If you want to go straight to a multiple sequence alignment (MSA) instead of downloading the sequences in a file (see Step 1), simply scroll down to the bottom of your results page and click "multiple alignment". This will take you straight to NCBI's COBALT (multiple sequence alignment software). [screenshot not available] If you want to go straight to a phylogenetic tree of your results, see "Step 3" (below). If you'd like to view your MSA, click "Download" at the top of the page and select the option that says "fasta plus gaps". (Depending on the alignment viewing software you use, you might need Clustal or Nexus format.) Now you should have a plain text file that looks somewhat like what you see in the attached file.
Other software options for multiple sequence alignment are MUSCLE: http://www.ebi.ac.uk/Tools/msa/muscle/
and Phylogeny.fr: http://www.phylogeny.fr/version2_cgi/si ... logeny.cgi (see "Step 3" for a guide on how to use it)

2b. HOW TO VIEW YOUR MULTIPLE SEQUENCE ALIGNMENT: I like to use Jalview as my alignment viewer because it's easy, manageable for beginning users, and pretty interactive. That means you can edit gaps in your sequences, delete sequences that seem to cause trouble in your overall alignment, truncate ends, etc., to improve phylogeny calculations. You can also output high-quality PDF images of your alignment to use on posters and whatnot! Here is the link: http://www.jalview.org/download.html You will have to download the software, but it's quick and safe, and I definitely recommend it if you are working with many sequences or will be making many alignments or editing phylogenetic trees.
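
To give a feel for what "truncating ends" means, here is a small Python sketch that drops the leading and trailing alignment columns that are mostly gaps. This is not Jalview's API (Jalview does this interactively); `trim_gappy_ends`, the 0.5 threshold, and the toy alignment are all assumptions made for illustration.

```python
def trim_gappy_ends(aligned, max_gap_fraction=0.5):
    """Drop leading and trailing alignment columns that are mostly gaps.

    `aligned` maps sequence names to gapped sequences of equal length.
    Columns in the middle of the alignment are left untouched.
    """
    if not aligned:
        return {}
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")

    def is_gappy(col):
        gaps = sum(1 for s in seqs if s[col] == "-")
        return gaps / len(seqs) > max_gap_fraction

    # Walk inward from the left end, then from the right end.
    start = 0
    while start < length and is_gappy(start):
        start += 1
    end = length
    while end > start and is_gappy(end - 1):
        end -= 1
    return {name: seq[start:end] for name, seq in aligned.items()}


# Made-up toy alignment for illustration only.
toy = {
    "seq_a": "--MKTIIALS-",
    "seq_b": "-GMKTIVALSY",
    "seq_c": "--MKT-VALS-",
}

trimmed = trim_gappy_ends(toy)
for name, seq in trimmed.items():
    print(name, seq)
```

Ragged ends usually come from sequences of different lengths rather than from real biological differences, which is why trimming them can give a cleaner tree.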

3. HOW TO CREATE A PHYLOGENETIC TREE FROM YOUR MULTIPLE SEQUENCE ALIGNMENT: If you used NCBI's COBALT software, there is actually a "Phylogenetic Tree" tab right at the top left-hand corner of your results page. That will lead you to something called a Phylo Tree, which is just software for viewing phylogenetic trees. You can, of course, download the tree (I recommend Newick format for iTOL, the Interactive Tree of Life) and open the tree with iTOL (http://itol.embl.de/), which allows you to view the tree more comprehensively and edit it to make it poster-pretty.
Another option is to use Phylogeny.fr's "One-Click" option, and it will do all the sequence aligning, curation, and tree-rendering for you. Just input your sequences into their box and click Go! This software gives you several options for viewing the tree, but I myself find them a bit plain and lacking for poster purposes. If you choose to use this software, I would still recommend downloading a Newick file of the completed tree and editing/adding colors with iTOL.
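
A Newick file is just plain text with nested parentheses, so you can also pull the clade around your seeding sequence out of it with a short script. The Python sketch below is a minimal illustration under some assumptions: it handles only simple Newick (no quoted labels or comments), it throws away branch lengths, and the tree string and leaf names are made up. `parse_newick`, `leaves`, and `smallest_clade_with` are hypothetical helper names, not part of iTOL or COBALT.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested lists of leaf names.

    Branch lengths and internal node labels are read and discarded.
    Quoted labels and comments are not supported.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def read_label():
        nonlocal pos
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        # Keep the name, drop an optional ':branch_length' suffix.
        return text[start:pos].split(":")[0].strip()

    def read_node():
        nonlocal pos
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume '('
            children = [read_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ','
                children.append(read_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ')'
            read_label()  # skip internal label and branch length
            return children
        return read_label()

    return read_node()


def leaves(node):
    """List every leaf name under a node."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node:
        out.extend(leaves(child))
    return out


def smallest_clade_with(node, seed):
    """Leaves of the smallest clade holding the seed plus at least one other leaf."""
    if isinstance(node, str):
        return None
    for child in node:
        found = smallest_clade_with(child, seed)
        if found is not None:
            return found
    names = leaves(node)
    return names if seed in names and len(names) > 1 else None


# Made-up tree for illustration only.
toy_tree = ("((vaccine_strain:0.01,isolate_A:0.02):0.05,"
            "(isolate_B:0.10,isolate_C:0.12):0.04);")

tree = parse_newick(toy_tree)
clade = smallest_clade_with(tree, "vaccine_strain")
print(clade)
```

The leaves that share the smallest clade with your seeding sequence are its closest relatives in the tree, which is the same thing you would read off by eye in iTOL.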

Well, there you go! If you'd like to take a look at a project that involves all these steps, you can visit https://sites.google.com/site/halophileproject2011/ and take a look at the page named "Microbial Rhodopsins."

Please let me know if there are any more questions, and I'll be happy to answer them!

Best,
KT

Re: BLASTing the flu virus

Posted: Thu May 10, 2012 4:25 pm
by deleted-71958
The pictures don't seem to be working (I've never posted pictures on SB before, lol). I'll work on that. In the meantime, try to follow the steps as best as you can; NCBI is very workable for beginners, and if you need help navigating you can post back or ask your science teacher!

Good luck!