Hello,
I am a high school student and I am interested in the above bioinformatics project for the upcoming science fair. I read through the procedure and tried it for several diseases, such as NF1, aneurysm, and MS. My issue is that I cannot find the protein domain related to my SNP. Does that mean the SNP I am looking at has no protein domain associated with it? (Please see the attachment.) How can a coding region have no domain associated with it?
Also, I am not sure what the aim of my research could be. Is it possible for me to find a link between a disease (or a predisposition to it) and a SNP just by looking for a protein domain whose functions relate to the disease symptoms? If I am reading PubMed papers about that link, hasn't it already been established? Isn't the assumption that I can find a relation a far-fetched one?
I have been looking at this topic for a few weeks now but cannot come up with a goal or hypothesis. Please help.
Thanks
The Perfect Marriage of Computer Science & Medicine
Moderators: AmyCowen, kgudger, MadelineB, Moderators
-
deleted-379366
- Posts: 3
- Joined: Mon Sep 12, 2016 2:40 pm
- Occupation: Student
- Attachments
-
[The extension docx has been deactivated and can no longer be displayed.]
-
deleted-368865
- Former Expert
- Posts: 22
- Joined: Tue Aug 02, 2016 1:10 pm
- Occupation: Doctoral Student
- Project Question: N/A
- Project Due Date: N/A
- Project Status: Not applicable
Re: The Perfect Marriage of Computer Science & Medicine
Hi TejpalKaur,
When scientists refer to protein domains, we are talking about regions within the coding sequence that have specific structures or known functions. For example, transcription factors, which are proteins that bind DNA to initiate or inhibit transcription of a gene, can have a number of different domains. The first one that comes to mind is the zinc-finger domain. This domain has a very specific structure that facilitates binding to DNA. However, the protein isn't just one giant zinc-finger domain. Such a protein typically has a number of zinc-finger domains spaced along the coding sequence. That being said, there are regions of coding sequence that aren't associated with any particular domain (but may still be important for the protein to function properly), so a SNP in a coding region can fall outside every annotated domain.
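To make that last point concrete, here is a minimal Python sketch of checking whether a SNP's amino-acid position lands inside an annotated domain. The domain names and coordinates are invented for illustration, and `find_domain` is a hypothetical helper, not part of any database's tools.

```python
# Hypothetical domain annotations: (name, start, end) in amino-acid
# positions, 1-based and inclusive. These numbers are invented.
DOMAINS = [
    ("zinc finger 1", 120, 145),
    ("zinc finger 2", 150, 175),
    ("zinc finger 3", 180, 205),
]


def find_domain(position, domains=DOMAINS):
    """Return the name of the domain containing position, or None."""
    for name, start, end in domains:
        if start <= position <= end:
            return name
    return None


print(find_domain(160))  # inside the second zinc finger
print(find_domain(60))   # coding sequence, but outside every domain
```

A result of `None` here just means the position sits in a stretch of the protein that has no annotated domain, which is a perfectly normal outcome.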
Of course, if scientists are looking at what genetic factors may cause a particular disease, they might look into SNPs. But this is labor intensive. It requires sequencing the DNA from a number of patients and looking for A) genes that are commonly affected and B) whether those genes are affected in the same way (i.e. whether the patients have SNPs in common). You can imagine that when scientists sequence a patient's DNA, they generate a ton of data to go through. This is where bioinformatics comes in. Scientists can create programs to look for SNPs compared to a consensus sequence and then compare those SNPs to each other. This is much easier than going through by eye or by hand to find these differences. Creating these programs is not always easy, though. And just because someone generates a lot of data does not mean that they have thoroughly gone through all of it, so it is possible to find an association between a particular SNP and a disease that hasn't already been identified.
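As a rough illustration of what such a program does, here is a toy Python sketch. It assumes the patient sequences are already aligned to the consensus and are the same length, and the sequences themselves are made up. Real SNP calling works from raw sequencing reads with dedicated statistical tools, so treat this only as the basic idea; `find_variants` and `shared_variants` are names invented for this example.

```python
from collections import Counter


def find_variants(consensus, sample):
    """Return {position: (consensus_base, sample_base)} for mismatches.

    Assumes both sequences are already aligned and the same length.
    Positions are 1-based.
    """
    if len(consensus) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return {
        i + 1: (c, s)
        for i, (c, s) in enumerate(zip(consensus, sample))
        if c != s
    }


def shared_variants(consensus, samples):
    """Count how many samples carry each (position, base) variant."""
    counts = Counter()
    for sample in samples:
        for pos, (_, base) in find_variants(consensus, sample).items():
            counts[(pos, base)] += 1
    return counts


# Invented toy data: one consensus and three "patients".
consensus = "ATGCCGTA"
patients = ["ATGCTGTA", "ATGCTGTA", "ATGCCGAA"]

counts = shared_variants(consensus, patients)
for (pos, base), n in sorted(counts.items()):
    ref = consensus[pos - 1]
    print(f"position {pos}: {ref}>{base} in {n} of {len(patients)} samples")
```

With this toy data, the change at position 5 shows up in two of the three patients, which is the kind of shared difference a researcher would then look at more closely.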
Perhaps your project should take the angle of "proof of principle." What I mean by this is that your project could focus on validating what you find in papers. So, if a paper claims that a particular SNP is important for a specific disease, and the authors showed this through a series of experiments (likely in mice), maybe you can try to show that humans who suffer from this disease also have the same SNP.
I'm not sure whether this will be helpful, but I hope it is.
Best,
Hannah
-
deleted-379366
- Posts: 3
- Joined: Mon Sep 12, 2016 2:40 pm
- Occupation: Student
Re: The Perfect Marriage of Computer Science & Medicine
Hi Hannah,
Thanks for your reply. I am sorry, I was busy with the school tennis team and am now getting back to the project. I looked up some more diseases and also tried to find some information on a "proof of principle" project. I came across the following article, which lists several genes identified for disease in dogs.
http://www.nature.com/articles/ncomms10460
Could you please look at the published Table 1 and suggest whether the genes listed under Lymphoma (MCC, MXD3, FGFR4) and under GC (Colitis) (CD48) would be good candidates to study in humans? If yes, how can I go about comparing the two? Do I just go to the database, check all pathogenic SNPs, and compare them between dogs and humans? Please let me know. The study does not publish the specific SNP locations. Does this idea have some merit?
On a totally different note, can I do a study of the effect of cell phone radiation on SNPs? Is this possible? I read a Science Buddies article about a student doing an ultraviolet study of SNPs, but I cannot find the article anymore. My genetics teacher mentioned in class that if you have a project, she can arrange for you to work in a lab with PCR.
Thanks
-
deleted-379366
- Posts: 3
- Joined: Mon Sep 12, 2016 2:40 pm
- Occupation: Student
Re: The Perfect Marriage of Computer Science & Medicine
Hi Hannah,
I hope you can find some time to answer my questions. As I mentioned in my last post, I am trying to compare the genes in dogs that have markers for disease (as published in the study) to those of humans. I started with CD48 for colitis. I have compared the two using http://fasta.bioch.virginia.edu/fasta_w ... rm=compare but I am not sure how to interpret the results. (I am not sure whether you should copy and paste the 'whole seq' or the 'selected region' of the FASTA from the NCBI database. I just copied the selected regions, but they did not have the same numbers for dogs and humans.) I have attached the FASTA comparison here. I also tried to compare the SNPs in both but did not find any. Could you explain? Also, will I find similar rsXXXXX numbers if there are similar SNPs in both humans and dogs in the CD48 sequence, or are there other ways to identify them?
Also, I asked the author of the study for the exact locations of the SNPs in CD48 for dogs, and she has sent them to me, but they are in canFam3 coordinates. I am not sure how I can compare those markers in the dog CD48 to the same sequence in humans.
Any help is appreciated.
- Attachments
-
- FASTA results.pdf
- (208.05 KiB) Downloaded 139 times

