C1orf185

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Zanimum (talk | contribs) at 01:25, 3 May 2020. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Chromosome 1 open reading frame 185, also known as C1orf185, is a protein that in humans is encoded by the C1orf185 gene. In humans, C1orf185 is expressed at low levels, and its expression has occasionally been detected in the circulatory system[1][2].

Gene

C1orf185 is located on chromosome 1 in humans, on the positive strand between bases 51,102,188 and 51,152,283[3]. The main splice isoform has 5 exons; however, the number and selection of exons vary by isoform[3].
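As a quick check on the coordinates above, the size of the locus follows from simple interval arithmetic. This is a minimal sketch that assumes the two positions are 1-based and inclusive, which is the usual convention for NCBI Gene coordinates; the variable and function names are illustrative only.

```python
# Coordinates quoted in the text (assumed here to be 1-based and inclusive).
LOCUS_START = 51_102_188
LOCUS_END = 51_152_283


def locus_length(start: int, end: int) -> int:
    """Return the number of bases covered by an inclusive [start, end] interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


span_bp = locus_length(LOCUS_START, LOCUS_END)
print(f"C1orf185 locus spans {span_bp:,} bp (~{span_bp / 1000:.1f} kb)")
```

Under that assumption the locus covers about 50 kb.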

C1orf185 locus within the human genome. Diagrams from NCBI Genome Viewer[4] (top) and the Integrative Genomics Viewer[5] (bottom).


mRNA and Protein Isoforms

C1orf185 has 5 different splice isoforms in humans[3].

C1orf185 Transcripts
Isoform | mRNA Accession | Protein Accession | Transcript Length (bp) | Protein Length (AA)
uncharacterized protein C1orf185 | NM_001136508.2 | NP_001129980.1 | 921 | 199
uncharacterized protein C1orf185 isoform X1 | XM_011541282.2 | XP_011539584.1 | 787 | 195
uncharacterized protein C1orf185 isoform X2 | XM_024446525.1 | XP_024302293.1 | 586 | 116
uncharacterized protein C1orf185 isoform X3 | XM_024446528.1 | XP_024302296.1 | 420 | 116
uncharacterized protein C1orf185 isoform X4 | XM_024446529.1 | XP_024302297.1 | 367 | 107
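The transcript and protein lengths in the table can be related by simple arithmetic: a protein of N residues needs 3 × (N + 1) nucleotides of coding sequence, counting the stop codon, and whatever remains of the transcript is untranslated. The sketch below uses only the table values. The combined 5' and 3' UTR lengths it prints are derived here and are not reported by the source.

```python
# (transcript length in bp, protein length in aa), copied from the table above.
ISOFORMS = {
    "main": (921, 199),
    "X1": (787, 195),
    "X2": (586, 116),
    "X3": (420, 116),
    "X4": (367, 107),
}


def cds_length(protein_aa: int) -> int:
    """Coding sequence length in nucleotides, including the stop codon."""
    return 3 * (protein_aa + 1)


def utr_length(transcript_bp: int, protein_aa: int) -> int:
    """Combined 5' + 3' UTR length implied by the transcript and protein lengths."""
    return transcript_bp - cds_length(protein_aa)


for name, (transcript_bp, protein_aa) in ISOFORMS.items():
    print(
        f"{name:>4}: CDS {cds_length(protein_aa)} nt, "
        f"UTRs {utr_length(transcript_bp, protein_aa)} nt"
    )
```

For the main isoform this gives a 600 nt coding sequence and 321 nt of untranslated sequence. Isoforms X2 and X3 encode proteins of equal length (116 aa) while their transcripts differ by 166 bp.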

Protein

C1orf185 is a member of the pfam15842 protein family, which contains a domain of unknown function, DUF4718[6]. Proteins in this family are between 130 and 224 amino acids long and are found only in eukaryotes.

The main splice isoform of C1orf185 has a molecular weight of 22.4 kDa[7] and an isoelectric point of 7.67[8]. It contains a transmembrane domain spanning positions 15 to 37[3]. There is also a conserved serine-rich region from S123 to S142, which may indicate a function as a "splicing activator"[9].

C1orf185 is predicted to contain 3 topological domains: an extracellular domain spanning positions 1 to 14, a transmembrane domain spanning positions 15 to 37, and a large intracellular domain spanning positions 38 to 199[10].
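The domain boundaries above can be checked for consistency with the 199-residue main isoform. This is a minimal sketch that assumes the positions are 1-based and inclusive; the names are illustrative.

```python
# Predicted topological domains as (name, first residue, last residue),
# taken from the text and assumed to be 1-based and inclusive.
DOMAINS = [
    ("extracellular", 1, 14),
    ("transmembrane", 15, 37),
    ("intracellular", 38, 199),
]
PROTEIN_LENGTH = 199  # main isoform, NP_001129980.1


def domain_lengths(domains):
    """Map each domain name to its length in residues."""
    return {name: end - start + 1 for name, start, end in domains}


lengths = domain_lengths(DOMAINS)
for name, length in lengths.items():
    share = 100 * length / PROTEIN_LENGTH
    print(f"{name:>13}: {length:3d} residues ({share:.1f}% of the protein)")

# The three domains should tile the protein with no gaps or overlaps.
print("covers full protein:", sum(lengths.values()) == PROTEIN_LENGTH)
```

With these boundaries the domains are 14, 23, and 162 residues long, so the intracellular portion accounts for roughly four fifths of the protein.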

Below are predicted secondary and tertiary structures of C1orf185, modeled using the Chou-Fasman[11] secondary structure prediction tool and the I-TASSER[12] protein structure and function prediction tool. Chou-Fasman predicts a mixture of α-helices, β-sheets, and other structural turns and coils, which can be seen modeled on the I-TASSER prediction.

Chou-Fasman Secondary Structure Prediction[11] (left) and I-TASSER Tertiary Structure Prediction[12] (right) for C1orf185.


Regulation of Expression

Gene Level Regulation

Below is a diagram showing the locations of predicted transcription factor binding sites in the C1orf185 promoter, along with a table describing the attributes of each individual binding site. The transcription factors were found and analyzed using the ElDorado tool from Genomatix[13].

Diagram of the C1orf185 promoter with transcription factor binding sites annotated.


Transcription Factor Binding Sites within the C1orf185 Promoter
Transcription Factor | Detailed matrix info | Matrix similarity | Sequence | +/-
VTATA.02 | Mammalian C-type LTR TATA box | 0.91 | tgtcaTAAAaacattcc | +
NKX25.05 | Homeodomain factor Nkx-2.5/Csx | 0.986 | tttttTGAGtgaagtcttg | -
CDX1.01 | Intestine specific homeodomain factor CDX-1 | 0.988 | ttgccctTTTAtgaaaaaa | +
VTATA.02 | Mammalian C-type LTR TATA box | 0.914 | tacttTAAAaataagca | -
ERG.02 | v-ets erythroblastosis virus E26 oncogene homolog | 0.942 | gtctcaaaGGAAaataaaaag | -
SPI1.02 | SPI-1 proto-oncogene; hematopoietic transcription factor PU.1 | 0.992 | attaaagaGGAAgtctcaaag | -
FHXB.01 | Fork head homologous X binds DNA with a dual sequence specificity (FHXA and FHXB) | 0.831 | ttctaaATAAcacattt | -
TGIF.01 | TG-interacting factor belonging to TALE class of homeodomain factors | 1 | tctataaatGTCAatta | +
ZNF219.01 | Kruppel-like zinc finger protein 219 | 0.913 | ctccaCCCCcgtcagcccaaagg | +
ZBP89.01 | Zinc finger transcription factor ZBP-89 | 0.956 | catctccaCCCCcgtcagcccaa | +
CREB.02 | cAMP-responsive element binding protein | 0.922 | cctttgggcTGACgggggtgg | -
FOXP1_ES.01 | Alternative splicing variant of FOXP1, activated in ESCs | 1 | tcataaaAACAttccag | -
VTATA.02 | Mammalian C-type LTR TATA box | 0.895 | tgtcaTAAAaacattcc | -
CREB1.02 | cAMP-responsive element binding protein 1 | 0.949 | tggaaGTGAtgtcataaaaac | -
SPI1.02 | SPI-1 proto-oncogene; hematopoietic transcription factor PU.1 | 0.979 | atttgagtGGAAgtgatgtca | -
NKX25.05 | Homeodomain factor Nkx-2.5/Csx | 0.994 | gaattTGAGtggaagtgat | -
MESP1_2.01 | Mesoderm posterior 1 and 2 | 0.917 | cagtCATAtggct | +
MESP1_2.01 | Mesoderm posterior 1 and 2 | 0.929 | aagcCATAtgact | -
DELTAEF1.01 | deltaEF1 | 0.99 | gcttcACCTaaag | +
ERG.02 | v-ets erythroblastosis virus E26 oncogene homolog | 0.93 | gaagaagaGGAAaatatattt | +

Matrix similarity reflects the confidence in the prediction for each individual binding site, with values closer to 1 indicating a closer match to the binding matrix. The +/- column indicates whether the site lies on the positive or negative strand. The transcription factors are listed in order of appearance from the beginning to the end of the promoter.
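Because sites are reported on both strands, two rows of the table can describe the same stretch of DNA read in opposite directions. The sketch below takes the reverse complement of the ZNF219.01 site (positive strand) and compares it with the CREB.02 site (negative strand). It assumes that negative-strand sequences in the table are written 5' to 3' as read on the negative strand; the source does not state this explicitly.

```python
# Complement table that preserves the upper/lower case used in the table,
# where upper case marks the core of each matrix match.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequences copied from the table rows for ZNF219.01 (+) and CREB.02 (-).
ZNF219_PLUS = "ctccaCCCCcgtcagcccaaagg"
CREB_MINUS = "cctttgggcTGACgggggtgg"

# Read the ZNF219.01 site as it would appear on the negative strand.
znf219_on_minus = reverse_complement(ZNF219_PLUS).lower()

# If the CREB.02 site is contained in it, both predictions cover the same bases.
overlaps = CREB_MINUS.lower() in znf219_on_minus
print("ZNF219.01 read on the minus strand:", znf219_on_minus)
print("CREB.02 site lies within it:", overlaps)
```

Under the stated assumption, the CREB.02 prediction falls entirely within the bases covered by the ZNF219.01 prediction, which suggests that several of the predicted sites in the table overlap rather than occupy separate positions.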

C1orf185 is expressed at very low levels, and the circulatory system is the only site in the body where expression has been detected. Two NCBI GEO profiles have shown that C1orf185 was consistently overexpressed in whole blood samples within a group of postmenopausal women[14], and somewhat overexpressed in the peripheral blood of Parkinson's patients compared to controls[15].

Transcript Level Regulation

Below is a figure produced by mfold[16] showing predicted mRNA structure of the 3' UTR of C1orf185.

Predicted mRNA secondary structure of the C1orf185 3' UTR, produced by mfold[16]. There are 3 main branches, each ending in 1-2 stem loops. The stem loop near the end of the sequence contains the poly-A signal, which marks where the transcript is cleaved and polyadenylated.


C1orf185 has one miRNA binding site of type 7mer-A1 that is conserved among several orthologs[17]. The presence of a 7mer-A1 binding site suggests that C1orf185 may be post-transcriptionally repressed[18].

Possible conserved C1orf185 miRNA binding site details found using TargetScan[17].
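For readers unfamiliar with the site type, a 7mer-A1 site is a match to miRNA nucleotides 2 to 7 (the seed) followed by an adenine opposite miRNA position 1. The sketch below derives that site for a given miRNA and locates it in a UTR. The text does not name the miRNA predicted to bind C1orf185, so hsa-let-7a-5p is used purely as an illustration, and the UTR string is a made-up toy sequence, not the C1orf185 3' UTR.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Reverse complement of an RNA sequence written 5'->3'."""
    return seq.translate(_RNA_COMPLEMENT)[::-1]


def site_7mer_a1(mirna: str) -> str:
    """Target-site sequence: match to miRNA nt 2-7, then an A opposite nt 1."""
    seed = mirna[1:7]  # 0-based slice covering miRNA positions 2-7
    return reverse_complement_rna(seed) + "A"


def classify_sites(utr: str, mirna: str):
    """List (index, type) for every 7mer-A1 match in the UTR.

    A match whose upstream base also pairs with miRNA position 8 is
    reported as '8mer', since that stronger site type contains a 7mer-A1.
    """
    site = site_7mer_a1(mirna)
    pos8_match = reverse_complement_rna(mirna[7])
    hits = []
    for i in range(len(utr) - len(site) + 1):
        if utr[i:i + len(site)] == site:
            is_8mer = i > 0 and utr[i - 1] == pos8_match
            hits.append((i, "8mer" if is_8mer else "7mer-A1"))
    return hits


# Illustration only: let-7a is NOT stated to target C1orf185.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"      # hsa-let-7a-5p
TOY_UTR = "AAGCUACCUCAGGAUACCUCAUU"   # hypothetical sequence

print(site_7mer_a1(LET7A))
print(classify_sites(TOY_UTR, LET7A))
```

In this toy example the first match is preceded by a base that also pairs with miRNA position 8, so it is reported as an 8mer, while the second match is a 7mer-A1 site.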


Protein Level Regulation

Below is a figure and table showing predicted post-translational modification sites for C1orf185.

Sequence showing predicted post-translational modifications on the C1orf185 protein.
Table of Post-Translational Modifications for C1orf185
Type of Modification | Tool | Positions in Homo sapiens
Phosphorylation | NetPhos[19] | S61, S69, S104, S130, S142, S147, S165, S186
Glycation | NetGlycate[20], NetNGlyc[21] | K5, K50, K98, K113
O-GlcNAc | YinOYang[22] | T121, S122, S130

The presence of multiple lysine glycation sites indicates that there may be ways to impair the function of the protein, as glycation has been associated with the loss of protein function in blood vessels[23].
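The predicted modification sites can be cross-referenced against the topological domains (positions 1 to 14, 15 to 37, and 38 to 199) and the serine-rich region (S123 to S142) described in the Protein section. This minimal sketch uses only the positions given in the article; the names are illustrative, and all of the underlying sites are computational predictions.

```python
# Domain boundaries and serine-rich region as given in the Protein section.
DOMAINS = [
    ("extracellular", 1, 14),
    ("transmembrane", 15, 37),
    ("intracellular", 38, 199),
]
SERINE_RICH = (123, 142)

# Predicted sites copied from the table of post-translational modifications.
PTM_SITES = {
    "phosphorylation": ["S61", "S69", "S104", "S130", "S142", "S147", "S165", "S186"],
    "glycation": ["K5", "K50", "K98", "K113"],
    "O-GlcNAc": ["T121", "S122", "S130"],
}


def position(site: str) -> int:
    """Residue number of a site label such as 'S130'."""
    return int(site[1:])


def domain_of(pos: int) -> str:
    """Name of the topological domain containing a residue position."""
    for name, start, end in DOMAINS:
        if start <= pos <= end:
            return name
    raise ValueError(f"position {pos} is outside the protein")


all_sites = {site for sites in PTM_SITES.values() for site in sites}
extracellular = sorted(s for s in all_sites if domain_of(position(s)) == "extracellular")
shared = sorted(set(PTM_SITES["phosphorylation"]) & set(PTM_SITES["O-GlcNAc"]))
in_serine_rich = sorted(
    (s for s in all_sites if SERINE_RICH[0] <= position(s) <= SERINE_RICH[1]),
    key=position,
)

print("extracellular sites:", extracellular)
print("predicted for both phosphorylation and O-GlcNAc:", shared)
print("within the serine-rich region:", in_serine_rich)
```

Based on these predictions, K5 is the only site outside the intracellular domain, S130 is predicted as both a phosphorylation and an O-GlcNAc site, and S130 and S142 fall within the serine-rich region.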

Clinical Significance

C1orf185 has been implicated in the circulatory system, possibly in a reactive role, as it is expressed at low levels across many species. It appears in genome-wide association studies of atrial fibrillation[2] and QRS duration[1], which suggests it may play a role in those cardiac conditions.

Homology

Below is a table showing C1orf185 orthologs across a variety of species. Orthologs were found using NCBI BLAST[24], the dates of divergence were found using TimeTree[25], and the global sequence identities and similarities were found using the Clustal Omega multiple sequence alignment tool[26].

Ortholog Table for C1orf185.
Genus and Species | Common Name | Taxonomic Group | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (Global) | Sequence Similarity (Global)
Homo sapiens | Human | Primates | 0 | NP_001129980.1 | 199 | 100% | 100%
Pongo abelii | Sumatran orangutan | Primates | 15.76 | PNJ53823.1 | 195 | 93.50% | 95.50%
Cebus capucinus imitator | Capuchin | Primates | 43.2 | XP_017404303.1 | 229 | 77.00% | 79.60%
Galeopterus variegatus | Sunda flying lemur | Dermoptera | 76 | XP_008578352.1 | 203 | 73.70% | 77.90%
Oryctolagus cuniculus | Rabbit | Lagomorpha | 90 | XP_008263491.1 | 225 | 69.90% | 76.40%
Dipodomys ordii | Ord's kangaroo rat | Rodentia | 90 | XP_012877642.1 | 188 | 52.20% | 59.40%
Mastomys coucha | Southern multimammate mouse | Rodentia | 90 | XP_031234037 | 263 | 51.50% | 61.50%
Mus musculus | House mouse | Rodentia | 90 | NP_001186019.1 | 226 | 47.40% | 59.50%
Peromyscus leucopus | White-footed mouse | Rodentia | 90 | XP_028745885.1 | 295 | 41.00% | 48.20%
Phyllostomus discolor | Pale spear-nosed bat | Chiroptera | 96 | XP_028367083.1 | 191 | 73.40% | 80.40%
Myotis davidii | David's myotis | Chiroptera | 96 | XP_006768446.1 | 196 | 71.40% | 78.40%
Equus caballus | Horse | Perissodactyla | 96 | XP_023485921.1 | 243 | 63.80% | 68.30%
Muntiacus muntjak | Indian muntjac | Artiodactyla | 96 | KAB0362285.1 | 200 | 59.40% | 65.90%
Hipposideros armiger | Great roundleaf bat | Chiroptera | 96 | XP_019487867.1 | 157 | 54.90% | 59.20%
Tursiops truncatus | Bottlenose dolphin | Artiodactyla | 96 | XP_033708766.1 | 189 | 54.10% | 59.00%
Sarcophilus harrisii | Tasmanian devil | Dasyuromorphia | 159 | XP_031825005.1 | 333 | 18.20% | 27.70%
Ornithorhynchus anatinus | Platypus | Monotremata | 180 | XP_028902271 | 309 | 26.80% | 37.40%
Pelodiscus sinensis | Chinese softshell turtle | Reptilia | 312 | XP_025042106.1 | 890 | 7.40% | 11.40%
Gopherus evgoodei | Sinaloan thornscrub tortoise | Reptilia | 312 | XP_030429802.1 | 777 | 4.00% | 6.30%
Chrysemys picta bellii | Western painted turtle | Reptilia | 312 | XP_023960730.1 | 748 | 3.70% | 5.80%

C1orf185 appears to be evolving relatively quickly: orthologs were found only in mammals and a few turtles, and sequence identity to the human protein falls off sharply in more distantly related mammals. Primates are the only taxonomic group in which the sequence is heavily conserved relative to the human sequence, while other mammals and turtles heavily conserve only the transmembrane domain (positions 15-37). As mammals are warm-blooded, this may further support a possible role in the circulatory system.
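The decline in conservation with evolutionary distance can be summarised directly from the ortholog table by averaging global sequence identity over species that share a divergence date. This is a minimal sketch using only the table values; the grouping by divergence date is an illustrative choice and not an analysis reported by the source.

```python
from collections import defaultdict
from statistics import mean

# (species, divergence from human in MYA, global identity in %),
# copied from the ortholog table; the human reference row is omitted.
ORTHOLOGS = [
    ("Pongo abelii", 15.76, 93.5),
    ("Cebus capucinus imitator", 43.2, 77.0),
    ("Galeopterus variegatus", 76, 73.7),
    ("Oryctolagus cuniculus", 90, 69.9),
    ("Dipodomys ordii", 90, 52.2),
    ("Mastomys coucha", 90, 51.5),
    ("Mus musculus", 90, 47.4),
    ("Peromyscus leucopus", 90, 41.0),
    ("Phyllostomus discolor", 96, 73.4),
    ("Myotis davidii", 96, 71.4),
    ("Equus caballus", 96, 63.8),
    ("Muntiacus muntjak", 96, 59.4),
    ("Hipposideros armiger", 96, 54.9),
    ("Tursiops truncatus", 96, 54.1),
    ("Sarcophilus harrisii", 159, 18.2),
    ("Ornithorhynchus anatinus", 180, 26.8),
    ("Pelodiscus sinensis", 312, 7.4),
    ("Gopherus evgoodei", 312, 4.0),
    ("Chrysemys picta bellii", 312, 3.7),
]


def mean_identity_by_divergence(rows):
    """Average global identity for each divergence date, oldest last."""
    groups = defaultdict(list)
    for _species, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: mean(values) for mya, values in sorted(groups.items())}


summary = mean_identity_by_divergence(ORTHOLOGS)
for mya, avg in summary.items():
    print(f"{mya:>6} MYA: mean identity {avg:.1f}%")
```

The trend is downward overall but not strictly monotonic: the species that diverged 90 MYA, mostly rodents, average about 52.4% identity, lower than the roughly 62.8% average of the group that diverged 96 MYA, while the turtles average about 5.0%.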

References

  1. ^ a b Sotoodehnia, Nona; Isaacs, Aaron; de Bakker, Paul I. W.; Dörr, Marcus; Newton-Cheh, Christopher; Nolte, Ilja M.; van der Harst, Pim; Müller, Martina; Eijgelsheim, Mark; Alonso, Alvaro; Hicks, Andrew A. (December 2010). "Common variants in 22 loci are associated with QRS duration and cardiac ventricular conduction". Nature Genetics. 42 (12): 1068–1076. doi:10.1038/ng.716. ISSN 1546-1718. PMC 3338195. PMID 21076409.
  2. ^ a b Roselli, Carolina; Chaffin, Mark D.; Weng, Lu-Chen; Aeschbacher, Stefanie; Ahlberg, Gustav; Albert, Christine M.; Almgren, Peter; Alonso, Alvaro; Anderson, Christopher D.; Aragam, Krishna G.; Arking, Dan E. (September 2018). "Multi-ethnic genome-wide association study for atrial fibrillation". Nature Genetics. 50 (9): 1225–1233. doi:10.1038/s41588-018-0133-9. ISSN 1546-1718.
  3. ^ a b c d "C1orf185 chromosome 1 open reading frame 185 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  4. ^ "Genome Data Viewer". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  5. ^ "Home | Integrative Genomics Viewer". software.broadinstitute.org. Retrieved 2020-05-01.
  6. ^ "CDD Conserved Protein Domain Family: DUF4718". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  7. ^ "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-05-01.
  8. ^ "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2020-05-01.
  9. ^ Graveley, B. R.; Maniatis, T. (April 1998). "Arginine/serine-rich domains of SR proteins can function as activators of pre-mRNA splicing". Molecular Cell. 1 (5): 765–771. doi:10.1016/s1097-2765(00)80076-3. ISSN 1097-2765. PMID 9660960.
  10. ^ "TMHMM Server, v. 2.0". www.cbs.dtu.dk. Retrieved 2020-05-01.
  11. ^ a b "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-05-01.
  12. ^ a b "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-01.
  13. ^ "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-05-01.
  14. ^ "13889230 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  15. ^ "129780050 - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  16. ^ a b "The Mfold Web Server | mfold.rit.albany.edu". unafold.rna.albany.edu. Retrieved 2020-05-01.
  17. ^ a b "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2020-05-01.
  18. ^ Grimson, Andrew; Farh, Kyle Kai-How; Johnston, Wendy K.; Garrett-Engele, Philip; Lim, Lee P.; Bartel, David P. (2007-07-06). "MicroRNA Targeting Specificity in Mammals: Determinants Beyond Seed Pairing". Molecular Cell. 27 (1): 91–105. doi:10.1016/j.molcel.2007.06.017. ISSN 1097-2765. PMC 3800283. PMID 17612493.
  19. ^ "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
  20. ^ "NetGlycate 1.0 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
  21. ^ "NetNGlyc 1.0 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
  22. ^ "YinOYang 1.2 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
  23. ^ "Korean Society for Exercise Nutrition". jenb.or.kr. doi:10.20463/jenb.2017.0027. PMC 5643203. PMID 29036767. Retrieved 2020-05-02.
  24. ^ "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  25. ^ "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-05-01.
  26. ^ "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-05-01.