BLASTing the corona virus

Posted: Sun Nov 29, 2020 1:55 pm
by deleted-955786
Hello. I am planning to use the design of the BLASTing Flu Virus project, but for the coronavirus. Since the vaccines for COVID-19 have not been confirmed yet, what do you suggest I could do with this project in relation to COVID-19?

https://www.sciencebuddies.org/science- ... eityourown

Thank you!

Re: BLASTing the corona virus

Posted: Sun Nov 29, 2020 9:20 pm
by brandimiller610
Hi SalikaSafar28,

I like your spin on the BLASTing Flu Virus project and your idea to study COVID-19 rather than specific influenza strains. However, because a lot is still unknown about the vaccines, such as the strains of the SARS-CoV-2 virus they contain, I believe it may be a little early to conduct this project using the novel coronavirus. You might consider using other coronaviruses, such as MERS-CoV (Middle East respiratory syndrome coronavirus) and SARS-CoV. I believe that studying these coronaviruses could be useful for understanding vaccine targets and therapeutics for COVID-19 and the current pandemic. Depending on how quickly the vaccine becomes available and is distributed, and on how much information about the SARS-CoV-2 strains is published, it may become feasible for you to conduct the BLASTing Flu Virus project for COVID-19.

This paper BLASTed three well-known coronaviruses, including COVID-19: https://www.nature.com/articles/s41423-020-0400-4
I like this paper because it discusses the implications for a COVID-19 vaccination and neutralization.

I hope my response has been helpful! Feel free to reach out if you have any other ideas you'd like to discuss or any questions.

--Brandi

Re: BLASTing the corona virus

Posted: Sun Dec 06, 2020 8:59 pm
by deleted-955786
Hello Brandi! Thank you so much for replying, but how do I reach out to you? Sorry, I'm new to this platform. Do I just post a reply to this post with my new ideas?

Re: BLASTing the corona virus

Posted: Mon Dec 07, 2020 7:51 pm
by brandimiller610
SalikaSafar28,

Yes, feel free to reply to this post with any questions/concerns you have about your project and we will be happy to answer them :)

--Brandi

Re: BLASTing the corona virus

Posted: Fri Jan 08, 2021 8:24 pm
by deleted-955786
https://www.sciencebuddies.org/science- ... #procedure

I tried following the procedure in step 1b, since I wanted the information that the table gives, but for the last 3 years. However, I can't follow it from 1b ii onwards. Is this part of the procedure up to date? If not, can I get help locating the information for the years 2017, 2018, and 2019?

Re: BLASTing the corona virus

Posted: Sun Jan 10, 2021 6:38 pm
by deleted-955786
https://www.sciencebuddies.org/science- ... #procedure

It's the same link, but the previous one didn't seem to be working.

Re: BLASTing the corona virus

Posted: Sun Jan 10, 2021 9:11 pm
by brandimiller610
Hi SalikaSafar28,

Thank you for your question! Looking at the procedure, it seems that the table is outdated. However, you can find flu surveillance data for more recent years (through the 2019-2020 season) on the CDC website. The link is https://www.cdc.gov/flu/weekly/fluactivitysurv.htm and you can also go directly to https://www.cdc.gov/flu/season/past-flu-seasons.htm, which includes information about previous flu seasons.

I hope this helps answer your question and guides you in the right direction! Please feel free to respond on this forum if you have any more questions! Good luck on your project!

--Brandi

Re: BLASTing the corona virus

Posted: Mon Jan 18, 2021 3:17 pm
by deleted-955786
Hi Brandi! Thanks for replying. Following the 2nd link, I clicked on the year 2019-2020. It takes me to a frequently asked questions page where I find the viral strains recommended for that year's vaccines, but not the antigenic characterization of the viral strains subtyped that season or of the ones included in that season's vaccines. Can that information be found another way?

Re: BLASTing the corona virus

Posted: Thu Jan 21, 2021 9:09 am
by deleted-955786
Just an update: the vaccinations in the years 2007-2011 piqued my interest. The vaccines for the 2007-2008 flu season did not include many of the flu viruses subtyped that season, while the vaccines for the 2010-2011 flu season included all of the subtyped viral strains. I think I'll look into whether and how that made a difference.
One question, though, about searching for the protein sequences of the surface proteins. For example, if I search for "hemagglutinin [Influenza A virus (A/California/07/2009(H1N1))]," a lot of results appear with the same entry title and the full sequence. How do you know which result to click on? The ones with the highest number of amino acids? If so, a lot of them show the same number of amino acids too, so does it matter which one of them I click on?

Thank you for your kind help!

Re: BLASTing the corona virus

Posted: Mon Jan 25, 2021 12:12 am
by brandimiller610
In regard to your first question, I see that the page states the strains recommended for the vaccine each year without providing a definite antigenic characterization of the vaccines. After browsing the CDC's site, I found FluView, which provides weekly surveillance reports for the current season and past flu seasons: https://www.cdc.gov/flu/weekly/pastreports.htm
I could not find anything on the site that provides the actual antigenic characterization of the vaccine, but these surveillance reports did include antigenic characterization of positive flu specimens. Is this the type of data you were looking for?

When searching the surface protein, are you using GenBank (Nucleotide) on the NCBI website (https://www.ncbi.nlm.nih.gov/genbank/)? When typing in the hemagglutinin protein for the 2009 strain you indicated, the first six entries (each 1701 bp in length) are the same, while the seventh entry seems to be the complete sequence (1734 bp). If you are instead using Protein on the same website, then any of the sequences including 566 amino acids should be the correct one.

I hope I answered your questions. As always, please feel free to reach out if you have any more questions!

--Brandi :)
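
A quick check of the arithmetic behind the lengths mentioned above, as a minimal sketch: a 1701 bp entry and a 566 amino acid protein agree with each other if the 1701 bp are a complete coding sequence whose last codon is a stop codon. That is an assumption here, not something the GenBank records were checked for, and the function name is hypothetical.

```python
def coding_length_to_residues(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Return how many amino acids a coding sequence of this length encodes.

    Assumes the sequence is a complete coding region read in frame,
    optionally ending with one stop codon (which encodes no amino acid).
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("a complete coding sequence should be a multiple of 3 bp")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


# 1701 bp is the entry length quoted in the post above.
print(coding_length_to_residues(1701))
```

Under that assumption, 1701 bp gives 567 codons, and dropping the stop codon leaves 566 amino acids, which is the length to look for in the Protein database.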

Re: BLASTing the corona virus

Posted: Thu Feb 25, 2021 10:42 am
by deleted-955786
Hey, thanks so much for the reply. I'm making progress on my project, but I've encountered a few issues that I couldn't troubleshoot by myself and would appreciate any help:
1. For hemagglutinin [Influenza A virus (A/Solomon Islands/3/2006(H1N1))] and hemagglutinin [Influenza B virus (B/Ohio/1/2005)], I only found partial protein sequences (a full protein sequence would be ideal). Would this significantly affect the percent identity (from BLAST), given that my goal is to average all the percent identities in a given flu season? Also, could taking an average (which includes viruses subtyped in a season that were also used in the vaccine, giving a 100% match) in any way misrepresent the trend in how well the vaccine strains matched the viruses subtyped each season?
2. I couldn't find any result for the search "neuraminidase [Influenza A virus (A/Solomon Islands/3/2006(H1N1))]" on NCBI. Do you know what's wrong?
3. On the BLAST results page, are the other numbers besides the percent identity, such as the E value, of any immediate significance that should be taken into account? [For example, here: https://blast.ncbi.nlm.nih.gov/Blast.cg ... &SUBJECTS=], the 2nd accession number being AIU46082.1.

Re: BLASTing the corona virus

Posted: Sun Feb 28, 2021 4:50 pm
by brandimiller610
Hi SalikaSafar28,

Thank you for your questions! Here is some insight I can provide:

1. It would be ideal to find the most complete sequences possible -- I think BLASTing with incomplete sequences will certainly affect the percent identity between strains during any single flu season. If I am understanding your second question under (1) correctly, I don't think averaging the match percentages for all strains for each season (including those used in the vaccines) is a misrepresentation. Although showing averages can lead to skewed data due to extremely high or low values, I believe it is a simple way to easily compare matching among different strains with the vaccines each year. Please tell me if I have understood and answered your question correctly.

2. I am not sure why there are no results for the neuraminidase on this strain in NCBI. I also tried this search, and even changed neuraminidase to "NA". I got no data there, but did find the amino acid sequence on UniProt (https://www.uniprot.org/uniprot/A7Y8A4).

3. The E-value is important for judging the quality of a match: it estimates how many hits with a score at least this good would be expected by chance in a search of that size. Lower E-values indicate more significant matches between two protein sequences (in this case, strains). Therefore, I think it should definitely be taken into account when using BLAST.

Hope these answers help clear things up!

--Brandi :)
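
To make point 3 concrete, here is a minimal sketch of the textbook relationship between a BLAST bit score and the E-value, E = m * n * 2^(-S'), where S' is the bit score, m is the query length and n is the length of the sequence or database searched. The function name and the example scores are hypothetical, and real BLAST output uses adjusted "effective" lengths, so the numbers will not match a results page exactly.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2 ** (-S').

    Simplification: uses raw lengths rather than BLAST's effective lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


m = 566  # query length in amino acids (the hemagglutinin length from this thread)
n = 566  # hypothetical subject length for a two-sequence comparison

# Hypothetical bit scores, chosen only to show the trend.
for score in (20.0, 50.0, 1000.0):
    print(f"bit score {score:>6}: E ~ {expected_hits(score, m, n):.3g}")
```

The point of the sketch is the trend: each additional bit of score halves the E-value, so near-identical full-length proteins end up with E-values that are effectively zero, while short or weak alignments do not.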

Re: BLASTing the corona virus

Posted: Sun Feb 28, 2021 10:15 pm
by MadelineB
Hello SalikaSafar28,

I have one suggestion to add to the excellent advice by the expert Brandi:

I agree with Brandi that "showing averages can lead to skewed data due to extremely high or low values." I suggest that you use the median rather than the average as "a simple way to easily compare matching among different strains with the vaccines each year." The median is less influenced by extremely high or low values.

Hope this helps with your project.

Madeline
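
A small sketch of why the median is less sensitive to extreme values than the mean, using Python's standard `statistics` module. The percent identity values are made up for illustration and are not real BLAST results.

```python
from statistics import mean, median

# Hypothetical percent identities for one flu season; the last value
# stands in for a strain that matched the vaccine poorly.
percent_identity = [98.2, 97.9, 98.5, 100.0, 72.4]

season_mean = mean(percent_identity)
season_median = median(percent_identity)

print(f"mean:   {season_mean:.1f}")
print(f"median: {season_median:.1f}")
```

With these made-up numbers, the single low value pulls the mean down to about 93.4, while the median stays at 98.2.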

Re: BLASTing the corona virus

Posted: Mon Mar 01, 2021 9:32 am
by deleted-955786
Thank you both so much! Yes, Brandi, you did understand my question correctly. I'll take your responses about mean/median into consideration while I do a bit more research on how BLAST data should be analyzed. I also appreciate you directing me to UniProt. I used the FASTA from there and ran the BLAST on NCBI.

Work in progress...

Re: BLASTing the corona virus

Posted: Tue Mar 02, 2021 10:43 am
by brandimiller610
I agree with Madeline about using the median rather than the mean for your data. Good luck with your project!

Re: BLASTing the corona virus

Posted: Wed Mar 03, 2021 3:04 pm
by MadelineB
Hello SalikaSafar28,

You might consider showing both the medians and the means so that the judges could judge whether the means were influenced by unusually high or low percentages. You might consider plotting those 2 numbers by year, and, if you are keeping strains separate, even by strain. This would let you (and the judges) see if the data for some years was skewed.

And, I also agree with Brandi that you would want to tabulate the partial sequences separately.

Good luck with your project.

Madeline
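
One way the tabulation described above could be sketched: group the BLAST results by season, keep partial sequences in their own group, and report both the mean and the median for each. The seasons are the ones mentioned earlier in this thread, but every percent identity, strain label and partial/full flag below is hypothetical, as is the `summarize` helper.

```python
from collections import defaultdict
from statistics import mean, median

# (season, strain, percent identity, is_partial) -- all values hypothetical
results = [
    ("2007-2008", "H1N1", 97.5, False),
    ("2007-2008", "H3N2", 88.0, False),
    ("2007-2008", "B", 99.1, True),
    ("2010-2011", "H1N1", 100.0, False),
    ("2010-2011", "H3N2", 100.0, False),
    ("2010-2011", "B", 99.0, False),
]


def summarize(rows):
    """Group percent identities by (season, is_partial) and summarize each group."""
    by_group = defaultdict(list)
    for season, _strain, pct, is_partial in rows:
        by_group[(season, is_partial)].append(pct)
    return {
        key: {"n": len(values), "mean": mean(values), "median": median(values)}
        for key, values in sorted(by_group.items())
    }


summary = summarize(results)
for (season, is_partial), stats in summary.items():
    label = "partial" if is_partial else "full"
    print(f"{season} {label:>7}: n={stats['n']} "
          f"mean={stats['mean']:.1f} median={stats['median']:.1f}")
```

The per-season means and medians from a table like this could then be plotted by year, and a large gap between the two numbers in any season is a hint that one unusually high or low percentage is driving the mean.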