Talk:Single-nucleotide polymorphism

This article is of interest to the following WikiProjects:
WikiProject Genetics (Rated C-class, High-importance)
This article is within the scope of WikiProject Genetics, a collaborative effort to improve the coverage of Genetics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as High-importance on the project's importance scale.
WikiProject Molecular and Cellular Biology (Rated C-class, Mid-importance)
This article is within the scope of the WikiProject Molecular and Cellular Biology. To participate, visit the WikiProject for more information.
This article has been rated as C-Class on the project's quality scale.
This article has been rated as Mid-importance on the project's importance scale.
WikiProject Human Genetic History (Rated C-class, Low-importance)
This article is within the scope of WikiProject Human Genetic History, a collaborative effort to improve the coverage of genetic genealogy, population genetics, and associated theory and methods articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This article has been rated as C-Class on the quality scale.
This article has been rated as Low-importance on the importance scale.

"Single nucleotide variation"

What happened to the page for this? Not all SNVs are SNPs, since polymorphism implies a greater plurality than a spontaneous mutation that may not generally occur in the population. Additionally, SNVs are the superset of SNPs that also includes single nucleotide indels. — Preceding unsigned comment added by (talk) 05:28, 19 June 2013 (UTC)

"fair use"

I'm new, and before charging into anything, I'd like to understand better what is considered "fair use" in the Wiki world. To me, this article exceeds "fair use" guidelines, as the information is straight off of this page:

The source is not directly credited, with only an oblique attribution. Is this considered legit and consistent with Wiki guidelines?

--Daffyd 10:55, 27 October 2005 (UTC)

"Fair use" usually refers to a specific legal definition within copyright law. The content in question is provided by the US Government and is not copyrighted at all, so legally there's nothing wrong with copying it verbatim. Not that it's polite. --Mike Lin 17:27, 27 October 2005 (UTC)
Thanks for the clarification, makes sense. --Daffyd 17:55, 27 October 2005 (UTC)
It is unethical to use the material without crediting the source. If I had run across this in a student paper, the student would be charged with plagiarism. What is even worse is that the definition is wrong and misleading. The author(s) clearly don't know the meaning of polymorphism, and confuse cause and effect (e.g., "...a SNP might change the nucleotide sequence...." or "SNPs are generally considered to be a form of point mutation...."). Ted 01:26, 20 January 2006 (UTC)
The only problem there is that the author(s) get(s) SNP confused with what would be true of a UEP. Nagelfar 01:16, 14 March 2007 (UTC)

In science, sources must *always* be cited. In addition, it is important not to copy things when you do not understand them fully. Generally, if you don't understand the material well enough to write your own description, you shouldn't just copy the information here. Thank you for contributing, but more care should be exercised, since some of the information was not interpreted correctly and so was misrepresented. I corrected a few of the errors, but it would take a while to fix the rest, so I will try to return when I have more time. But I wonder, how many readers came to this page and walked away with wrong information in the interim? Ed 27 Jan 06

Folks, Wikipedia is not a scientific publication. The purpose of an encyclopedia is not the claim and exposition of our original work, but merely the accurate presentation of fact. The article in question is in the public domain and, though flawed, was written for this same purpose, so its copying here, from a reasonably trustworthy source, is in fact appropriate.

It is a separate issue that the original article was itself written with somewhat sloppy language, and certainly we will improve upon it.

Finally, I'd point out that it is quite rare for encyclopedia articles (even those on scientific subjects) to cite sources beyond the level of "for further reading", so many Wikipedia articles are already quite extraordinary in this respect.

--Mike Lin 07:10, 29 January 2006 (UTC)

Plagiarism is a problem in all areas, not just scientific publications. In a stub, reproduction of material from other sources is OK if it is properly cited. Wikipedia is an academic publication in the broad sense of both 'academic' and 'publication'. Ted 13:46, 30 January 2006 (UTC)

  • Daffyd, in answer to your question, yes, this is considered "legit and consistent" because the source website is linked in the "References" section, and the use does not violate copyright law. -- Reinyday, 17:44, 30 January 2006 (UTC)

Correction: allele

Quoting from this article - "For example, two sequenced DNA fragments from different individuals, AAGCCTA to AAGCTTA, contain a difference in a single nucleotide. In this case we say that there are two alleles : C and T."

This definition of allele is incorrect. C and T comprise a nucleotide pair. DNA molecules are chains of nucleotide pairs. Many nucleotide pairs are not located within a genetic locus. Here is the definition of allele from the glossary section of the Human Genome Project Information website:

Allele: Alternative form of a genetic locus; a single allele for each locus is inherited from each parent (e.g., at a locus for eye color the allele might result in blue or brown eyes).

Lgfree 01:52, 17 June 2006 (UTC)lgfree

The rub is that SNPs are not related to loci in the classical sense. SNPs can be located within (expressed) genes (possibly giving rise to alleles) or between genes -- it makes no difference for the SNP. Or, if you want, because we are simply looking at single nucleotides, every nucleotide is a possible "marker locus." In that case, differences between alternate forms of that single nucleotide marker locus take on the four possible nucleotides. I'd hate to put something like that in the lead paragraph. Maybe it can be explained better somewhere else in the article. Ted 02:40, 17 June 2006 (UTC)

Thinking along these lines, it may be better to use the word variant, rather than allele. gringer 06:12, 9 July 2007 (UTC)
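
For readers following the AAGCCTA / AAGCTTA example quoted at the top of this section, here is a minimal Python sketch of what "a difference in a single nucleotide" means operationally. It assumes the two fragments are already aligned and of equal length; the function name `single_nucleotide_differences` is made up for this illustration and is not taken from the article or from any library.

```python
def single_nucleotide_differences(seq_a, seq_b):
    """List (position, base_a, base_b) for every mismatch between two
    aligned, equal-length sequences. Positions are 1-based."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# The two fragments quoted from the article.
diffs = single_nucleotide_differences("AAGCCTA", "AAGCTTA")
print(diffs)  # [(5, 'C', 'T')]
```

The output reports the variable site and the two observed bases without deciding whether to call them "alleles" or "variants", which is the terminology question raised above.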

Definition query

I'm not sure if this statement is correct:

Almost all common SNPs have only two alleles

My understanding is that there are only two possible options (alleles) at a single nucleotide locus. If I'm wrong, would it be useful to have an example of a SNP that has more than two alleles?

Buzwad 11:59, 22 March 2007 (UTC)

They're quite difficult to find (mostly because most high-throughput genotyping assays don't work well with non-dimorphic SNPs), but they do exist. Here's a link to one that seems to have a fairly large amount of validation: gringer 01:17, 6 July 2007 (UTC)
Except I've just noticed that the reported frequencies do not include A/T, so it may not be a "real" SNP of this nature. Regardless, it indicates that people are allowing for the possibility of more than two alleles at a single nucleotide locus. gringer 01:23, 6 July 2007 (UTC)
rs332 is a triallelic small indel, close enough to being a SNP for the NCBI (dbSNP rs332). It is also the cause of 50% of Cystic Fibrosis cases. Cariaso 16:48, 19 August 2007 (UTC)
rs3091244 is a legitimate triallelic SNP (NCBI rs3091244). Cariaso 16:09, 10 October 2007 (UTC)
This is contradictory:

This is simply the lesser of the two allele frequencies for single-nucleotide polymorphisms.

'Almost all ... have only two' vs. 'the lesser of the two alleles for single-nucleotide polymorphisms' seems to be contradictory. —Preceding unsigned comment added by (talk) 17:57, 17 February 2009 (UTC)
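
One way to see how the two statements can be reconciled is to compute allele frequencies directly. The sketch below is illustrative only: the allele lists are invented, the function names are hypothetical, and for sites with more than two alleles it assumes the convention of reporting the frequency of the second most common allele, which is one convention among several and is not something the article states.

```python
from collections import Counter


def allele_frequencies(alleles):
    """Map each observed allele to its fraction of all observed copies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele (assumed convention).

    For a biallelic site this is simply the lesser of the two frequencies.
    Returns 0.0 if only one allele is observed.
    """
    freqs = sorted(allele_frequencies(alleles).values(), reverse=True)
    return freqs[1] if len(freqs) > 1 else 0.0


# Invented data: ten sampled chromosomes at one site.
biallelic = ["C"] * 7 + ["T"] * 3
triallelic = ["C"] * 6 + ["T"] * 3 + ["A"] * 1

print(allele_frequencies(biallelic))
print(minor_allele_frequency(biallelic))
print(allele_frequencies(triallelic))
print(minor_allele_frequency(triallelic))
```

With exactly two alleles the "lesser of the two" wording is unambiguous; with three, the wording needs a stated convention, which is the gap the comment above points out.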

SNP vs Single Nucleotide Polymorphism

In most academic publications, abbreviations are acceptable in the text as long as they have been defined previously. Given the short form (SNP, pronounced "snip") is significantly quicker to say than the expanded form (Single nucleotide polymorphism), and the abbreviation has been stated in the first sentence, wouldn't it be reasonable to use SNP for subsequent uses in this article? I notice that there have been two instances in the revision history where there has been a change from the abbreviated form to the expanded form (or vice versa). I'm mostly mentioning this because has changed the text back to the expanded form without noticing the previous history comment by on 17 April.

gringer 01:56, 12 July 2007 (UTC)

I would say it is ok to use "SNP" once the abbreviation has been stated in the first sentence. — fnielsen (talk) 08:24, 4 September 2008 (UTC)

Possible vandalism?

I just reverted this because it was an anonymous IP's only, unexplained edit which changed a word to its opposite. Could someone make sure it was vandalism? (talk) 15:50, 10 January 2008 (UTC)

Thanks. The idea of a bot which randomly changes words to their antonyms will now give me nightmares. In this case I agree with the anonymous editor. This 1% definition seems to have crept in due to the HapMap project. The largest HapMap populations were 120 people, so if a variation occurred in less than 1% of a population it was unlikely to be discovered during the HapMap. There doesn't seem to be any basis for the 1% rule, except the costs and limitations of the technologies of the day. A single nucleotide change which occurs for 1 person in 1000 should still be considered a SNP.

  • "a minor allele frequency of less than or equal to 1%" would probably have been overlooked
  • "a minor allele frequency of greater than or equal to 1%" is definitely a SNP.

Cariaso (talk) 05:50, 30 January 2008 (UTC)
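
The sample-size argument above can be made concrete with a rough calculation. This is only a sketch under simplifying assumptions that are not in the comment itself: the 120 people are treated as 240 independently sampled chromosomes, every copy present in the sample is assumed to be detected, and `detection_probability` is a hypothetical helper name. Under those assumptions, the chance of seeing an allele of frequency f at least once is 1 - (1 - f)^(2n).

```python
def detection_probability(allele_frequency, n_individuals, ploidy=2):
    """Probability that at least one copy of an allele is present in a
    sample, assuming independently sampled chromosomes."""
    n_chromosomes = ploidy * n_individuals
    return 1.0 - (1.0 - allele_frequency) ** n_chromosomes


# Panel size taken from the comment above; frequencies chosen for illustration.
for f in (0.01, 0.001):
    p = detection_probability(f, n_individuals=120)
    print(f"allele frequency {f}: P(seen at least once) = {p:.2f}")
```

Under these assumptions an allele at 1% frequency would be present in the sample with probability of roughly 0.91, while one at 0.1% would be present with probability of roughly 0.21, which is consistent with the suggestion that the 1% cutoff reflects what panels of that size could reliably find.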

Missense mutation

How does this relate to the "missense mutation" article? This layman needs clarification. —Preceding unsigned comment added by (talk) 12:32, 11 August 2008 (UTC)

This concept was not explained and should be: a missense mutation is a nonsynonymous mutation. I have now made a section titled "Types of SNPs" and expanded it with an infobox. There is room for improvement. — fnielsen (talk) 08:28, 4 September 2008 (UTC)

At the risk of stating the obvious, SNPs are most often the result of a point mutation in one of an ancestor's germline cells.

The genomic distribution of SNPs is not homogenous; SNPs usually occur in non-coding regions more frequently than in coding regions or, in general, where natural selection is acting and fixating the allele of the SNP that constitutes the most favorable genetic adaptation.

There is an interesting question why, in general, SNPs resulting from both synonymous (silent) and missense mutations should "usually occur in non-coding regions more frequently than in coding regions". Is it linked to variance of point mutation rates in euchromatin and heterochromatin (nucleobase tautomerization, their spontaneous or oxidative deamination, or pyrimidine dimerization?), the varying effectiveness of DNA repair mechanisms (like base excision repair), their combination, or perhaps something else? One could guess that it is not because of natural selection or recombination, at least in the case of synonymous mutations that do not affect the rate of splicing errors. There is a possibility that a more generalized process, similar to repeat-induced point mutation (RIP), could account for the disparity between coding and non-coding regions mentioned above. (talk) 12:08, 30 December 2012 (UTC)

More Info

I see that there is nothing said about how many SNPs have currently been mapped in humans. I would like to see a short table of species and how many SNPs are currently known in each... (This would need to be date stamped though.) See: dbSNP summary
--Jahibadkaret (talk) 16:22, 3 September 2008 (UTC)

Reference for 4Qk10 SNP?

This article mentions a SNP called 4Qk10 that should be common in Ashkenazi Jews and rare in Cubans. I can't find any references to back this up - neither dbSNP nor Google can clarify this. Does anyone have a reference for this?

Stinusl (talk) 12:05, 25 November 2008 (UTC)

Agreed. That name does not follow common patterns for naming SNPs. Furthermore, while Ashkenazi populations are well researched, this is far less common for Cubans. The statement was added by an IP with no previous history of edits. I encourage the next interested party to remove the text. Cariaso (talk) 07:50, 26 November 2008 (UTC)
done. Cariaso (talk) 20:46, 28 November 2008 (UTC)


I believe in the more traditional way of using hyphens. They got lax about teaching it so long ago that even most professors no longer use it habitually, but magazines and newspapers do, so everybody still understands it, and that makes it possible to save it, and it's worth saving for reasons noted below.

In some cases it disambiguates and adherence to a single style is preferable. One may, for example, omit a question mark at the end of many questions and still expect the reader to recognize them as questions, but in some cases the question mark conveys substantial information, and adhering to a single style—that of always including it—is therefore better than adhering to a style that always excludes it.

This is about polymorphisms involving a single nucleotide, NOT about "nucleotide polymorphisms" that are single. In this case, such a disambiguation may be like adding a question mark to a sentence that people would already have recognized as a question, but adhering to the style helps maintain the habit so that it's there for cases where it helps more than that.

See also hyphen for more on this.

That is why I changed the article's title, adding the hyphen. Michael Hardy (talk) 19:38, 28 December 2008 (UTC)

a SNP / an SNP

Grammar is not my forte, but I feel this change is not correct. When I read the text out loud I hear "a snip", not "an S-N-P". Cariaso (talk) 15:53, 22 April 2009 (UTC)

  • I hear "snip" too, and corrected/changed it. — fnielsen (talk) 20:00, 22 April 2009 (UTC)

The word Polymorphism

Molecular biologists use the word "polymorphism" quite differently from the rest of biologists. By the most common biological meaning of the word, polymorphism refers to the existence of clearly differentiated phenotypic groups in the same population of a species. Molecular biologists use this word to talk about DNA sequence changes, regardless of the phenotype.
I think this can be very confusing to the general reader, so the terminological distinction should be clearly noted in the article. This also applies to all the articles with the "molecular meaning" of polymorphism. It's already done in the Polymorphism article.--Earrnz (talk) 01:48, 4 May 2009 (UTC)

"Phenotype" is variable at the molecular level, but I think molecular biologists might be misusing the phrase SNP also. A SNP is by definition a polymorphism, not a mutation. People with SNPs may not appear any different on the outside, but in reality the "phenotype" for that gene may be different in that it has an extra amino acid added to the catalytic site, etc., in such a way that its biological activity is different. We don't really have the technology yet to elucidate whether SNPs make such subtle, yet selectable, differences on allelic phenotypes -- for example, a SNP that reduced the catalytic activity of an enzyme to 90%.
As far as "polymorphisms" are concerned, I think the article needs to better define this point. The average human cell has hundreds of thousands of random single-nucleotide mutations. These should not be confused with SNPs, which are true polymorphisms. In other words, to be a SNP it has to occur in >1% of the population. Different races have variations of SNPs, but within an ethnic group, most SNPs are consistent. The implications of being polymorphisms as opposed to random mutations are very important, and I think the article needs to add the "greater than 1%" part of the definition in order to make this clear. JohnnyCalifornia

Example Section

Firstly, the Example section needs work. Secondly, what do we consider a "valid example" of a SNP? Annotated SNPs? Notable SNPs?.. Geno-Supremo (talk) 16:17, 21 May 2009 (UTC)

Grammar error in first sentence

A single-nucleotide polymorphism (SNP, pronounced snip) is a DNA sequence variation occurring when a single nucleotide — A, T, C, or G — in the genome (or other shared sequence) that differs between members of a species (or between paired chromosomes in an individual).

Should this read:

A single-nucleotide polymorphism (SNP, pronounced snip) is a DNA sequence variation occurring when a single nucleotide — A, T, C, or G — in the genome (or other shared sequence) differs between members of a species (or between paired chromosomes in an individual).


Number of SNPs in TAS2R38 (Examples section)

The reference I snarfed from the TAS2R38 article for the PTC tasting and number of SNPs only mentions 3 SNPs, not 6, but it could be that it's only talking about SNPs that affect PTC tasting, not all the SNPs in that gene. (Assuming there's a difference.) Being mostly clueless, I won't touch that bit. The Crab Who Played With The Sea (talk) 03:56, 24 March 2010 (UTC)

Dr. Steve Ligget

This unsourced statement makes the first sentence even harder to read. Proposed for rapid removal. Cariaso (talk) 00:57, 5 May 2010 (UTC)

Google suggests the name should be Stephen B. Liggett (we appear to have no article). I agree that it is just noise in the lead, and since the lead is supposed to be a summary of the article (which has no mention of the name), I support removal. An edit summary might be "unsourced, see talk". Johnuniq (talk) 01:58, 5 May 2010 (UTC)

The picture

I would consider changing the colours of the picture. I don't like the two strands having the same colour as adenine and thymine nucleotides. (talk) 10:24, 27 August 2010 (UTC)

Definition in first line of the article is too narrow

Quote: "A single-nucleotide polymorphism (SNP, pronounced snip; plural snips) is a DNA sequence variation occurring when a single nucleotide — A, T, C or G — in the genome (or other shared sequence) differs between members of a biological species or paired chromosomes in a human"

The end part of this definition is awkward: "differs between members of a biological species or paired chromosomes in a human" sounds as if SNPs are only defined as differences between paired chromosomes if they are in humans. Also, humans are a biological species, so it's an awkward distinction to have here in the definition. It's ambiguous at best...