User:PurpleAmaranth/sandbox

Comparative Genomics

Methods section

Comparative genomics uses a variety of methods. Among the simplest are basic comparisons of genome size and gene density. Others include structural and functional genome annotation, detection of environmental selection, and genetic disease mapping. These methods usually rely on computational techniques such as sequence alignment, phylogenetic reconstruction, and coalescent theory.[1]
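
As a rough illustration of the simplest comparisons mentioned above, the sketch below computes gene density as annotated genes per megabase. The `GenomeSummary` class and the two genomes are hypothetical and the numbers are made up; they only show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class GenomeSummary:
    """Hypothetical container for two basic genome statistics."""
    name: str
    size_bp: int      # total assembled genome length in base pairs
    gene_count: int   # number of annotated genes

    @property
    def gene_density_per_mb(self) -> float:
        """Annotated genes per megabase (1 Mb = 1,000,000 bp)."""
        return self.gene_count / (self.size_bp / 1_000_000)


# Made-up example genomes, not real measurements.
genomes = [
    GenomeSummary("species_A", size_bp=5_000_000, gene_count=4_500),
    GenomeSummary("species_B", size_bp=100_000_000, gene_count=20_000),
]

# List the genomes from most to least gene-dense.
for g in sorted(genomes, key=lambda g: g.gene_density_per_mb, reverse=True):
    print(f"{g.name}: {g.size_bp / 1e6:.1f} Mb, "
          f"{g.gene_density_per_mb:.1f} genes/Mb")
```

In this toy input the smaller genome is the more gene-dense one, which is the kind of contrast such comparisons are meant to expose.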

Phylogenetic tree of descendant species and reconstructed ancestors. The branch color represents breakpoint rates in RACFs (breakpoints per million years). Black branches represent nondetermined breakpoint rates. Tip colors depict assembly contiguity: black, scaffold-level genome assembly; green, chromosome-level genome assembly; yellow, chromosome-scale scaffold-level genome assembly. Numbers next to species names indicate diploid chromosome number (if known).[2]

Alignments can be used to capture information about the aligned sequences, such as ancestry, common evolutionary descent, or shared structure or function. Alignments can be made for both DNA and protein sequences. The specific order of nucleotides in a DNA molecule, or of amino acids in a protein, is called its DNA or protein sequence, respectively.[3] The alignment score of two sequences measures their degree of similarity, in contrast to a distance, which measures how dissimilar they are. For two sequences of equal length, the number of positions at which the characters differ is called the Hamming distance.[4] Alignments may be pairwise, either global or local, or may involve several sequences at once (multiple sequence alignment). One way to find a global alignment is a dynamic programming algorithm known as the Needleman-Wunsch algorithm. A modified version of this algorithm, the Smith-Waterman algorithm, finds local alignments.
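
The ideas in this paragraph can be sketched in a few lines of Python. The code below is a minimal illustration, not a production aligner: it returns scores only (no traceback), and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for simplicity. The function names `hamming_distance` and `alignment_score` are made up for this example.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def alignment_score(a: str, b: str, *, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best pairwise alignment score by dynamic programming.

    local=False: global alignment (Needleman-Wunsch recurrence).
    local=True:  local alignment (Smith-Waterman modification: cell scores
                 are floored at 0 and the best cell anywhere is returned).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: gaps at the start of either sequence are penalised.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in sequence b
            left = score[i][j - 1] + gap    # gap in sequence a
            cell = max(diag, up, left)
            if local:
                cell = max(cell, 0)         # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


print(hamming_distance("GATTACA", "GACTATA"))                 # 2
print(alignment_score("ACGT", "AGT"))                         # 2 (three matches, one gap)
print(alignment_score("TTTACGTTT", "GGACGGG", local=True))    # 3 (shared "ACG")
```

The only differences between the two modes are the initialisation of the first row and column, the floor at zero, and where the final score is read from, which is why the local variant is usually described as a modification of the global one.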

Example of a phylogenetic tree created from an alignment of 250 unique spike protein sequences from the genus Betacoronavirus.

Another computational method in comparative genomics is phylogenetic reconstruction. It describes evolutionary relationships among organisms in terms of their most recent common ancestors. These relationships are usually represented as a tree, called a phylogenetic tree. Like phylogenetic reconstruction, coalescent theory is a retrospective model. It is typically used to trace all alleles of a gene in a sample from a population back to a single ancestral copy shared by all members of the population, known as the most recent common ancestor. Analyses based on coalescent theory try to predict the amount of time between the introduction of a mutation and the appearance of a particular allele or gene distribution in a population. This period equals the time elapsed since the most recent common ancestor existed. The inheritance relationships, known as the coalescence or gene genealogy, are visualized as dendrograms, much like a phylogenetic tree.
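
To make the idea of a time back to the most recent common ancestor (MRCA) concrete, the sketch below simulates it under the standard Kingman coalescent. That model is an assumption introduced here for illustration (neutral evolution, constant population size, random mating) and is not named in the text above. Time is measured in coalescent units; under the usual diploid Wright-Fisher scaling, one unit corresponds to 2N generations. The function names are made up for this example.

```python
import random


def expected_tmrca(n: int) -> float:
    """Expected time to the MRCA of n sampled lineages, in coalescent units.

    With k lineages the waiting time to the next coalescence is exponential
    with rate k(k-1)/2, so the expectation sums to 2 * (1 - 1/n).
    """
    if n < 2:
        raise ValueError("need at least two sampled lineages")
    return 2.0 * (1.0 - 1.0 / n)


def simulate_tmrca(n: int, rng: random.Random) -> float:
    """Simulate one genealogy and return the time back to the MRCA."""
    t = 0.0
    k = n
    while k > 1:
        rate = k * (k - 1) / 2       # number of lineage pairs that can merge
        t += rng.expovariate(rate)   # waiting time until two lineages coalesce
        k -= 1                       # one fewer lineage after each merger
    return t


rng = random.Random(0)               # fixed seed so runs are repeatable
n_samples, replicates = 5, 20_000
mean_sim = sum(simulate_tmrca(n_samples, rng) for _ in range(replicates)) / replicates
print(f"simulated mean TMRCA: {mean_sim:.3f}")
print(f"theoretical mean:     {expected_tmrca(n_samples):.3f}")
```

The simulated mean should land close to the theoretical value, and the expectation never exceeds 2 units however many lineages are sampled, because most of the waiting time is spent on the final pair of lineages.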

Example of a synteny block and break. Genes located on chromosomes of two species are denoted by letters. Each gene is labeled with a number indicating the species it belongs to (species 1 or 2). Orthologous genes are connected by dashed lines, and genes without an orthologous relationship are treated as gaps in synteny programs.[5]
Solid green squares indicate mammalian chromosomes maintained as a single synteny block (either as a single chromosome or fused with another MAM), with shades of the color indicating the fraction of the chromosome affected by intra-chromosomal rearrangements (the lightest shade is most affected). Split blocks demarcate mammalian chromosomes affected by inter-chromosomal rearrangements. Upper (green) triangles show the fraction of the chromosome affected by intra-chromosomal rearrangements, and lower (red) triangles show the fraction affected by inter-chromosomal rearrangements. Syntenic relationships of each MAM to the human genome are given at the right of the diagram. MAMX appears split in goat because its X chromosome is assembled as two separate fragments. BOR, boreoeutherian ancestor chromosome; EUA, Euarchontoglires ancestor chromosome; EUC, Euarchonta ancestor chromosome; EUT, eutherian ancestor chromosome; PMT, Primatomorpha ancestor chromosome; PRT, primates (Hominidae) ancestor chromosome; THE, therian ancestor chromosome.
Image from the study "Evolution of the ancestral mammalian karyotype and syntenic regions". It visualizes the evolutionary history of reconstructed mammalian chromosomes based on the human lineage.[6]

An additional method in comparative genomics is genetic mapping. In genetic mapping, visualizing synteny is one way to see the preserved order of genes on chromosomes. It is usually applied to chromosomes of related species that descend from a common ancestor.[7] This and other comparative genomic methods are useful because they can shed light on evolutionary history. A 2022 study used comparative genomics to reconstruct 16 ancestral karyotypes along the mammalian phylogeny.[6] The computational reconstruction showed how chromosomes were rearranged during mammalian evolution. It gave insight into the conservation of select regions, which are often associated with the control of developmental processes. It also improved understanding of chromosome evolution and of genetic diseases associated with DNA rearrangements.
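
The notion of preserved gene order can be illustrated with a toy block finder. The sketch below is a simplification and not the method of any published synteny program or of the study cited above: it assumes one-to-one orthologs on a single chromosome per species, skips genes without an ortholog (treating them as gaps), and ends a block wherever collinearity breaks. The function name `synteny_blocks` and the gene names are hypothetical.

```python
def synteny_blocks(order1, order2, orthologs, min_genes=2):
    """Group orthologous gene pairs into collinear blocks.

    order1, order2: gene names in chromosomal order for species 1 and 2.
    orthologs: dict mapping a species-1 gene to its species-2 ortholog
               (assumed one-to-one for this sketch).
    Returns a list of blocks; each block is a list of (gene1, gene2) pairs.
    """
    index2 = {gene: i for i, gene in enumerate(order2)}
    # Keep only species-1 genes whose ortholog is present in species 2.
    anchors = [(g, orthologs[g]) for g in order1
               if g in orthologs and orthologs[g] in index2]
    # Rank anchored species-2 genes so that unanchored genes act as gaps
    # rather than breaking adjacency.
    ranked = sorted((g2 for _, g2 in anchors), key=index2.__getitem__)
    rank2 = {g2: r for r, g2 in enumerate(ranked)}

    blocks, current, direction = [], [], 0
    for pair in anchors:
        if current:
            step = rank2[pair[1]] - rank2[current[-1][1]]
            # Extend the block only if the next gene is adjacent in species 2
            # and runs in the same direction (+1 forward, -1 inverted).
            if step in (1, -1) and direction in (0, step):
                direction = step
                current.append(pair)
                continue
            blocks.append(current)   # synteny break: close the current block
        current, direction = [pair], 0
    if current:
        blocks.append(current)
    return [b for b in blocks if len(b) >= min_genes]


# Toy chromosomes: d1 and x2 have no ortholog; e/f are inverted in species 2.
order1 = ["a1", "b1", "c1", "d1", "e1", "f1"]
order2 = ["a2", "b2", "x2", "c2", "f2", "e2"]
orthologs = {"a1": "a2", "b1": "b2", "c1": "c2", "e1": "e2", "f1": "f2"}

blocks = synteny_blocks(order1, order2, orthologs)
for block in blocks:
    print(block)
```

On this toy input the function reports two blocks: a-b-c in the same orientation, and e-f as an inverted block, with the break between c and e marking a rearrangement.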

  1. ^ Haubold, Bernhard; Wiehe, Thomas (2004-09-01). "Comparative genomics: methods and applications". Naturwissenschaften. 91 (9): 405–421. doi:10.1007/s00114-004-0542-8. ISSN 1432-1904.
  2. ^ Damas, Joana; Corbo, Marco; Kim, Jaebum; Turner-Maier, Jason; Farré, Marta; Larkin, Denis M.; Ryder, Oliver A.; Steiner, Cynthia; Houck, Marlys L.; Hall, Shaune; Shiue, Lily; Thomas, Stephen; Swale, Thomas; Daly, Mark; Korlach, Jonas (2022-10-04). "Evolution of the ancestral mammalian karyotype and syntenic regions". Proceedings of the National Academy of Sciences. 119 (40): e2209139119. doi:10.1073/pnas.2209139119. ISSN 0027-8424. PMC 9550189. PMID 36161960.
  3. ^ Altschul, Stephen F.; Pop, Mihai (2017), Rosen, Kenneth H.; Shier, Douglas R.; Goddard, Wayne (eds.), "Sequence Alignment", Handbook of Discrete and Combinatorial Mathematics (2nd ed.), Boca Raton (FL): CRC Press/Taylor & Francis, ISBN 978-1-58488-780-5, PMID 29206392, retrieved 2022-12-18
  4. ^ Prjibelski, Andrey D.; Korobeynikov, Anton I.; Lapidus, Alla L. (2019-01-01), Ranganathan, Shoba; Gribskov, Michael; Nakai, Kenta; Schönbach, Christian (eds.), "Sequence Analysis", Encyclopedia of Bioinformatics and Computational Biology, Oxford: Academic Press, pp. 292–322, doi:10.1016/b978-0-12-809633-8.20106-4, ISBN 978-0-12-811432-2, retrieved 2022-12-18
  5. ^ Liu, Dang; Hunt, Martin; Tsai, Isheng J (2018). "Inferring synteny between genome assemblies: a systematic evaluation". BMC Bioinformatics. 19 (1): 26. doi:10.1186/s12859-018-2026-4. ISSN 1471-2105. PMC 5791376. PMID 29382321.
  6. ^ Damas, Joana; Corbo, Marco; Kim, Jaebum; Turner-Maier, Jason; Farré, Marta; Larkin, Denis M.; Ryder, Oliver A.; Steiner, Cynthia; Houck, Marlys L.; Hall, Shaune; Shiue, Lily; Thomas, Stephen; Swale, Thomas; Daly, Mark; Korlach, Jonas (2022-10-04). "Evolution of the ancestral mammalian karyotype and syntenic regions". Proceedings of the National Academy of Sciences. 119 (40): e2209139119. doi:10.1073/pnas.2209139119. ISSN 0027-8424. PMC 9550189. PMID 36161960.
  7. ^ Duran, Chris; Edwards, David; Batley, Jacqueline (2009). "Genetic maps and the use of synteny". Methods in Molecular Biology (Clifton, N.J.). 513: 41–55. doi:10.1007/978-1-59745-427-8_3. ISSN 1064-3745. PMID 19347649.