SHLD1: Difference between revisions

From Wikipedia, the free encyclopedia
Ranki071 (talk | contribs)

Revision as of 20:44, 6 May 2018

C20orf196 is a protein that in humans is encoded by the C20orf196 gene[1]. The gene encodes an mRNA that is 1,151 base pairs long and a protein that is 205 amino acids long[1]. Its aliases are RINN3 and SHLD1; its current official name is SHLD1 (shieldin complex subunit 1).

SHLD1

Identifiers
Aliases: SHLD1, chromosome 20 open reading frame 196, shieldin complex subunit 1, RINN3, C20orf196
External IDs: OMIM: 618028; MGI: 1920997; HomoloGene: 51865; GeneCards: SHLD1

Orthologs
| Species | Human | Mouse |
| --- | --- | --- |
| Entrez | | |
| Ensembl | | |
| UniProt | | |
| RefSeq (mRNA) | NM_001303477, NM_001303478, NM_001303479, NM_152504 | NM_028637, NM_001358260, NM_001358261 |
| RefSeq (protein) | NP_001290406, NP_001290407, NP_001290408, NP_689717 | NP_082913, NP_001345189, NP_001345190 |
| Location (UCSC) | Chr 20: 5.75 – 5.86 Mb | Chr 2: 132.53 – 132.59 Mb |
| PubMed search | [4] | [5] |

Function

C20orf196 is involved in the DNA repair network. Gupta et al. identified C20orf196 as part of a vertebrate-specific protein complex called shieldin[6]. Shieldin is recruited to double-strand breaks (DSBs) to promote non-homologous end joining (NHEJ)-dependent repair, immunoglobulin class-switch recombination (CSR), and fusion of unprotected telomeres[6]. Analysis indicates that SHLD1 associates with the shieldin complex sub-stoichiometrically or with weaker affinity.

Gene

Location

C20orf196 is located on the short arm of chromosome 20 at 20p12.3, from base pairs 5,750,286 to 5,864,407 on the direct strand[1]. It contains 11 exons[7].

mRNA

C20orf196 produces 9 different mRNAs, with 7 alternatively spliced variants and 2 unspliced forms[7]. There are 3 probable alternative promoters, 3 non-overlapping alternative last exons, and 2 alternative polyadenylation sites[7]. The mRNAs differ by the truncation of the 5' end, truncation of the 3' end, presence or absence of 2 cassette exons, and overlapping exons with different boundaries[7].

Expression

RNA-Seq analysis has shown ubiquitous expression of C20orf196 in 27 human tissues: adrenal, appendix, bone marrow, brain, colon, duodenum, endometrium, esophagus, fat, gall bladder, heart, kidney, liver, lung, lymph node, ovary, pancreas, placenta, prostate, salivary gland, skin, small intestine, spleen, stomach, testis, thyroid, and urinary bladder[1]. The highest C20orf196 mRNA levels were found in the lymph node, tonsil, thyroid, adrenal gland, prostate, pharynx, parathyroid, connective tissue, and bone marrow[8].

C20orf196 was found to be expressed in soft tissue/muscle tissue tumors, lymphomas, and pancreatic tumors[9]. C20orf196 representation was biased toward the fetal developmental stage[9]. EBI expression data showed high expression of C20orf196 in the diencephalon and cerebral cortex of the developing brain[9].

Promoter

The promoter region spans bases 5,749,286 to 5,750,555, totaling 1,270 base pairs. The transcription start site is located between bases 5,750,382 and 5,750,409, a window of 28 base pairs.
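The base-pair totals quoted for the promoter and transcription start site follow from treating the coordinates as 1-based, inclusive intervals. The sketch below rechecks that arithmetic, along with the gene span given in the Location section. The inclusive convention is an assumption, although it is the one that reproduces the stated totals, and the helper name `span_length` is purely illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Number of bases in a closed interval [start, end] (1-based, inclusive)."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates on chromosome 20 as quoted in this article.
intervals = {
    "gene": (5_750_286, 5_864_407),
    "promoter": (5_749_286, 5_750_555),
    "transcription start site": (5_750_382, 5_750_409),
}

lengths = {name: span_length(start, end) for name, (start, end) in intervals.items()}

for name, n_bases in lengths.items():
    print(f"{name}: {n_bases:,} bp")

# Distance from the first promoter base to the first annotated gene base.
upstream = intervals["gene"][0] - intervals["promoter"][0]
print(f"promoter begins {upstream:,} bp upstream of the gene start")
```

With these coordinates the promoter window begins 1,000 bp upstream of the annotated gene start and extends past it, so it overlaps the transcription start site window.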

Protein

The most common transcript encodes a protein that is 205 amino acids long with a molecular mass of 23 kDa[10]. It has an isoelectric point of 4.72. Based on its amino-terminal residue, it is predicted to have a half-life of around 30 hours[11]. It is predicted to localize to the nucleus[7].
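As a rough plausibility check on the quoted mass, a common rule of thumb takes the average mass of an amino acid residue to be about 110 Da. The sketch below applies that rule to the 205-residue protein. The 110 Da figure is an assumed approximation rather than a value from the cited source, and a sequence-based calculation would be needed for an exact mass.

```python
# Assumed rule-of-thumb average residue mass in daltons (not from the cited source).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int, avg_residue_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from residue count alone."""
    return n_residues * avg_residue_da / 1000.0


mass_kda = approx_mass_kda(205)
print(f"approximate mass: {mass_kda:.2f} kDa")
```

The estimate rounds to the same 23 kDa reported above, which suggests the quoted mass is consistent with the protein length.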

Domains

C20orf196 contains one domain, DUF4521, which arose in amniotes. Proteins of this family are functionally uncharacterized. Several regions are conserved in mammals as well as amphibians and fish.

Post-Translational Modifications

Many phosphorylation sites are predicted, targeted by unspecified serine kinases.[12] C20orf196 is predicted to have one SUMOylation site at amino acid 203 and one N-glycosylation site at amino acid 69.[13][14] It is also predicted to have two ubiquitination sites, at amino acids 84 and 139.[15]

Secondary Structure

Several modeling programs predict a secondary structure containing alpha-helix, beta-strand, and coil regions.[16][17] CFSSP predicts that the C20orf196 secondary structure is 57.1% alpha helices, 48.8% beta strands, and 16.6% beta turns.[18] These figures sum to more than 100%, so the predicted categories presumably overlap.

Protein Interactions

Several databases, citing yeast two-hybrid screens, report that C20orf196 interacts with PRMT1, QARS, MAD2L2, and CUL3.[19][20][21][22] C20orf196 functionally interacts with REV7 (also known as MAD2L2), SHLD2, and SHLD3 in the shieldin complex within the DNA repair network.[6]

Homology and Evolution

C20orf196 gene homologs are found in mammals, birds, reptiles, and amphibians[23]. C20orf196 has distant orthologs in bony fish and cartilaginous fish[23]. There are no invertebrate orthologs. Orthologs are found in 163 organisms[1]. There are no paralogs in humans.

Table of Orthologs for C20orf196

| Class | Species | Common Name | Date of Divergence (MYA) | Accession Number | Sequence Identity (%) | Sequence Similarity (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Mammalia (Marsupialia) | Sarcophilus harrisii | Tasmanian devil | 159 | XP_012395605.1 | 55 | 68 |
| Mammalia (Marsupialia) | Phascolarctos cinereus | Koala | 159 | XP_020841153.1 | 54 | 67 |
| Aves | Gallus gallus | Red junglefowl | 312 | XP_015139412.1 | 33 | 49 |
| Aves | Aptenodytes forsteri | Emperor penguin | 312 | XP_009280865.1 | 35 | 47 |
| Reptilia | Crocodylus porosus | Saltwater crocodile | 312 | XP_019404613.1 | 36 | 50 |
| Reptilia | Pogona vitticeps | Central bearded dragon | 312 | XP_020649300.1 | 30 | 46 |
| Reptilia | Thamnophis sirtalis | Common garter snake | 312 | XP_013911941.1 | 33 | 51 |
| Amphibia | Nanorana parkeri | High Himalaya frog | 352 | XP_018422019.1 | 39 | 57 |
| Osteichthyes | Monopterus albus | Asian swamp eel | 435 | XP_020455013.1 | 46 | 73 |
| Chondrichthyes | Rhincodon typus | Whale shark | 473 | XP_020391945.1 | 30 | 55 |

Figure illustrating the evolution rate for C20orf196 in twenty orthologs as compared to the fast-evolving protein, fibrinogen, and slow-evolving protein, cytochrome C.

Rate of Evolution

C20orf196 is a fast-evolving protein with a high rate of sequence divergence. It evolves faster than fibrinogen, as shown in the accompanying figure.
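One way to see the trend behind this claim is to convert the sequence identities in the ortholog table into divergence values and group them by date of divergence. The sketch below does this with the table's ten orthologs only. It is not the method used to produce the figure, which is not described in the article. The Poisson correction `-ln(1 - p)` is a standard adjustment for multiple substitutions at the same site and is applied here as an assumption. No fibrinogen or cytochrome C values are included because the article does not give them.

```python
import math
from collections import defaultdict

# (common name, date of divergence in MYA, sequence identity in %) from the ortholog table.
orthologs = [
    ("Tasmanian devil", 159, 55),
    ("Koala", 159, 54),
    ("Red junglefowl", 312, 33),
    ("Emperor penguin", 312, 35),
    ("Saltwater crocodile", 312, 36),
    ("Central bearded dragon", 312, 30),
    ("Common garter snake", 312, 33),
    ("High Himalaya frog", 352, 39),
    ("Asian swamp eel", 435, 46),
    ("Whale shark", 473, 30),
]


def percent_divergence(identity_pct: float) -> float:
    """Uncorrected divergence: the share of aligned positions that differ."""
    return 100.0 - identity_pct


def poisson_corrected(identity_pct: float) -> float:
    """Poisson-corrected distance in substitutions per site (assumed model)."""
    p = percent_divergence(identity_pct) / 100.0
    return -math.log(1.0 - p)


# Group uncorrected divergence by divergence date and average within each group.
by_date = defaultdict(list)
for _name, mya, identity in orthologs:
    by_date[mya].append(percent_divergence(identity))

mean_divergence = {mya: sum(vals) / len(vals) for mya, vals in by_date.items()}

for mya in sorted(mean_divergence):
    print(f"{mya} MYA: mean divergence {mean_divergence[mya]:.1f}%")

for name, mya, identity in orthologs:
    print(f"{name} ({mya} MYA): corrected distance {poisson_corrected(identity):.2f}")
```

Plotting either measure against divergence date, next to the same values for a reference protein, would give a comparison of the kind the figure describes.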

Phenotype

Genome-wide association studies have identified SNPs in the C20orf196 gene that are associated with parental longevity, information processing speed, and breast carcinoma occurrence[24].

References

  1. "C20orf196 chromosome 20 open reading frame 196 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-05.
  2. GRCh38: Ensembl release 89: ENSG00000171984 - Ensembl, May 2017.
  3. GRCm38: Ensembl release 89: ENSMUSG00000044991 - Ensembl, May 2017.
  4. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  6. Gupta, Rajat; Somyajit, Kumar; Narita, Takeo; Maskey, Elina; Stanlie, Andre; Kremer, Magdalena; Typas, Dimitris; Lammers, Michael; Mailand, Niels (2018-05). "DNA Repair Network Analysis Reveals Shieldin as a Key Regulator of NHEJ and PARP Inhibitor Sensitivity". Cell. 173 (4): 972–988.e23. doi:10.1016/j.cell.2018.03.050. ISSN 0092-8674.
  7. Thierry-Mieg, Danielle; Thierry-Mieg, Jean. "AceView: Gene:C20orf196, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2018-02-05.
  8. Uhlén, Mathias; Fagerberg, Linn; Hallström, Björn M.; Lindskog, Cecilia; Oksvold, Per; Mardinoglu, Adil; Sivertsson, Åsa; Kampf, Caroline; Sjöstedt, Evelina (2015-01-23). "Tissue-based map of the human proteome". Science. 347 (6220): 1260419. doi:10.1126/science.1260419. ISSN 0036-8075. PMID 25613900.
  9. "The European Bioinformatics Institute < EMBL-EBI". 2018.
  10. Database, GeneCards Human Gene. "C20orf196 Gene - GeneCards | CT196 Protein | CT196 Antibody". www.genecards.org. Retrieved 2018-02-20.
  11. Bachmair, A; Finley, D; Varshavsky, A (October 10, 1986). "In vivo half-life of a protein is a function of its amino-terminal residue". Science. 234 (4773): 179–186.
  12. Blom, Nikolaj; Gammeltoft, Steen; Brunak, Søren (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–1362. doi:10.1006/jmbi.1999.3310. ISSN 0022-2836.
  13. Zhao, Qi; Xie, Yubin; Zheng, Yueyuan; Jiang, Shuai; Liu, Wenzhong; Mu, Weiping; Liu, Zexian; Zhao, Yong; Xue, Yu (2014-05-31). "GPS-SUMO: a tool for the prediction of sumoylation sites and SUMO-interaction motifs". Nucleic Acids Research. 42 (W1): W325–W330. doi:10.1093/nar/gku383. ISSN 1362-4962. PMC 4086084. PMID 24880689.
  14. Gupta, R; Jung, E; Brunak, Søren. "Prediction of N-glycosylation sites in human proteins". DTU Bioinformatics. 46: 203–206.
  15. Huang, Chien-Hsun; Su, Min-Gang; Kao, Hui-Ju; Jhong, Jhih-Hua; Weng, Shun-Long; Lee, Tzong-Yi (2016-01-11). "UbiSite: incorporating two-layered machine learning method with substrate motifs to predict ubiquitin-conjugation site on lysines". BMC Systems Biology. 10 (1): S6. doi:10.1186/s12918-015-0246-z. ISSN 1752-0509. PMC 4895383. PMID 26818456.
  16. Zhang, Yang (2008-01-23). "I-TASSER server for protein 3D structure prediction". BMC Bioinformatics. 9: 40. doi:10.1186/1471-2105-9-40. ISSN 1471-2105. PMC 2245901. PMID 18215316.
  17. Raghava, G. P. S. (2000). "APSSP: Advanced Protein Secondary Structure Prediction Server".
  18. Ashok Kumar, T. (2013-04-01). "CFSSP: Chou and Fasman Secondary Structure Prediction server". Zenodo. doi:10.5281/zenodo.50733.
  19. Szklarczyk, Damian; Franceschini, Andrea; Wyder, Stefan; Forslund, Kristoffer; Heller, Davide; Huerta-Cepas, Jaime; Simonovic, Milan; Roth, Alexander; Santos, Alberto (2014-10-28). "STRING v10: protein–protein interaction networks, integrated over the tree of life". Nucleic Acids Research. 43 (D1): D447–D452. doi:10.1093/nar/gku1003. ISSN 1362-4962. PMC 4383874. PMID 25352553.
  20. Licata, Luana; Briganti, Leonardo; Peluso, Daniele; Perfetto, Livia; Iannuccelli, Marta; Galeota, Eugenia; Sacco, Francesca; Palma, Anita; Nardozza, Aurelio Pio (2011-11-16). "MINT, the molecular interaction database: 2012 update". Nucleic Acids Research. 40 (D1): D857–D861. doi:10.1093/nar/gkr930. ISSN 1362-4962. PMC 3244991. PMID 22096227.
  21. Hermjakob, Henning; Montecchi‐Palazzi, Luisa; Lewington, Chris; Mudali, Sugath; Kerrien, Samuel; Orchard, Sandra; Vingron, Martin; Roechert, Bernd; Roepstorff, Peter (2004-01-01). "IntAct: an open source molecular interaction database". Nucleic Acids Research. 32 (suppl_1): D452–D455. doi:10.1093/nar/gkh052. ISSN 0305-1048. PMC 308786. PMID 14681455.
  22. Calderone, Alberto; Castagnoli, Luisa; Cesareni, Gianni (2013-08). "mentha: a resource for browsing integrated protein-interaction networks". Nature Methods. 10 (8): 690–691. doi:10.1038/nmeth.2561. ISSN 1548-7091.
  23. Altschul, Stephen F.; Gish, Warren; Miller, Webb; Myers, Eugene W.; Lipman, David J. (October 1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/s0022-2836(05)80360-2. ISSN 0022-2836.
  24. "GWAS Catalog". 2018.