C12orf50

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Heavy Grasshopper (talk | contribs) at 11:59, 1 August 2023 (Changing short description from "Protein encoding gene C12orf50" to "Protein-coding gene in humans"). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

C12orf50
Identifiers
Aliases: C12orf50, chromosome 12 open reading frame 50
External IDs: MGI: 1913855; HomoloGene: 45135; GeneCards: C12orf50; OMA: C12orf50 - orthologs

Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_152589, NM_001363616 | NM_001081246
RefSeq (protein) | NP_689802, NP_001350545 | n/a
Location (UCSC) | Chr 12: 87.98 – 88.03 Mb | Chr 10: 100.43 – 100.45 Mb
PubMed search | [3] | [4]

Chromosome 12 open reading frame 50 (C12orf50) is a protein-coding gene that in humans encodes the C12orf50 protein. The RefSeq mRNA accession for the gene is NM_152589. C12orf50 is located at 12q21.32 on the reverse strand, where it spans 55.42 kb, from 88429231 to 88373811 (NCBI 37, August 2010).[5] Neighboring genes include RPS4XP15, LOC107984542, and C12orf29.[6] RPS4XP15 lies upstream of C12orf50 on the same strand. LOC107984542 and C12orf29 both lie downstream; LOC107984542 is on the opposite strand, while C12orf29 is on the same strand. C12orf50 has six isoforms; this article focuses on isoform X1, whose mRNA is 1711 nucleotides long and encodes a protein of 414 amino acids (aa).
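
The 55.42 kb figure follows from the two coordinates quoted above. A minimal Python sketch of that arithmetic, assuming the span is taken as the plain difference between the two endpoints; the function name is illustrative only:

```python
def span_kb(start: int, end: int) -> float:
    """Return the distance between two genomic coordinates in kilobases.

    The order of the endpoints does not matter, so a reverse-strand gene
    (listed from the higher coordinate to the lower one) is handled too.
    """
    return abs(start - end) / 1000


# Coordinates quoted in the text (NCBI 37), listed high-to-low because
# the gene lies on the reverse strand.
length_kb = span_kb(88429231, 88373811)
print(f"{length_kb:.2f} kb")  # prints: 55.42 kb
```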

C12orf50 genomic location with neighboring genes

Function

Gene Ontology annotations indicate that C12orf50 enables mRNA binding and protein binding.[6] It is also involved in poly(A)+ mRNA export from the nucleus.

Isoforms

The C12orf50 gene has 6 isoforms.

Isoform | NCBI accession | mRNA length (nt) | Protein length (aa) | Features
X1 | NM_152589.3 | 1711 | 414 | Longest protein with a zinc finger
X2 | NM_001363616.2 | 1669 | 375 |
X3 | XM_017018887.1 | 1940 | 468 | Longest protein without a zinc finger
X4 | XM_017018888.2 | 1550 | 375 |
X5 | XM_011537985.1 | 3879 | 395 | Longest mRNA
X6 | XM_024448868.1 | 1577 | 374 |
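
The mRNA and protein lengths in the table can be related with simple arithmetic: each amino acid is encoded by three nucleotides, and the coding sequence ends with a stop codon. The sketch below is illustrative only. It assumes each listed mRNA contains the complete coding sequence and that the protein length counts every translated residue; the helper names are not taken from any tool.

```python
# (isoform, mRNA length in nt, protein length in aa), copied from the table.
ISOFORMS = [
    ("X1", 1711, 414),
    ("X2", 1669, 375),
    ("X3", 1940, 468),
    ("X4", 1550, 375),
    ("X5", 3879, 395),
    ("X6", 1577, 374),
]


def cds_length_nt(protein_aa: int) -> int:
    """Coding sequence length: three nucleotides per residue plus a stop codon."""
    return 3 * (protein_aa + 1)


def summarize(isoforms):
    """Return (isoform, CDS length, remaining untranslated length) per row."""
    rows = []
    for name, mrna_nt, protein_aa in isoforms:
        cds = cds_length_nt(protein_aa)
        rows.append((name, cds, mrna_nt - cds))
    return rows


# Cross-check the "Features" column against the numbers.
longest_mrna = max(ISOFORMS, key=lambda row: row[1])[0]
longest_protein = max(ISOFORMS, key=lambda row: row[2])[0]

for name, cds, untranslated in summarize(ISOFORMS):
    print(f"{name}: CDS {cds} nt, untranslated {untranslated} nt")
print("longest mRNA:", longest_mrna)
print("longest protein:", longest_protein)
```

Under these assumptions, isoform X1 has a 1245 nt coding sequence, leaving 466 nt of untranslated sequence in its 1711 nt mRNA.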

Gene expression

HPA RNA-seq on normal tissues to determine tissue-specificity of human C12orf50 gene

In a genome-wide analysis of human tissue-specific expression, RNA-seq was performed on tissue samples from 95 human individuals representing 27 different tissues to determine the tissue specificity of protein-coding genes. The study found that expression of C12orf50 is very low in most human tissues, with the exception of the testis.[7] C12orf50 expression is restricted to the testis.[8]

Protein

The uncharacterized protein C12orf50 is encoded in humans by the C12orf50 gene. Its UniProt accession is Q8NA57. The protein is 414 aa long, with a predicted mass of 47.2 kDa.[9] It contains a CCCH-type zinc finger domain[10] with a C-X8-C-X5-C-X3-H motif; the domain runs from the start of the protein to the 44th amino acid. The protein also has three disordered regions: amino acids 136 to 168 (33 aa), 297 to 333 (37 aa), and 346 to 414 (69 aa).[11] A separate prediction gives a molecular weight of 47.3 kDa and an isoelectric point of 8.79.
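
The motif notation C-X8-C-X5-C-X3-H reads as a cysteine, any eight residues, a cysteine, any five residues, a cysteine, any three residues, and a histidine. One way to make that concrete is a regular expression, sketched below in Python. The toy sequence is hypothetical and is not the C12orf50 sequence; the helper names are illustrative. The second helper reproduces the region lengths quoted above, which use inclusive residue coordinates.

```python
import re

# C-X8-C-X5-C-X3-H: "X" stands for any residue, the number for how many.
CCCH_PATTERN = re.compile(r"C.{8}C.{5}C.{3}H")


def find_ccch(sequence: str):
    """Return 1-based inclusive (start, end) positions of each motif match."""
    return [(m.start() + 1, m.end()) for m in CCCH_PATTERN.finditer(sequence)]


def region_length(start: int, end: int) -> int:
    """Length of a region given inclusive 1-based residue coordinates."""
    return end - start + 1


# Hypothetical toy sequence built only to contain the motif; it is NOT the
# C12orf50 sequence.
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"
print(find_ccch(toy))  # prints: [(3, 22)]

# Disordered regions quoted in the text.
for start, end in [(136, 168), (297, 333), (346, 414)]:
    print(f"{start}-{end}: {region_length(start, end)} aa")
```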

Structure

The predicted tertiary structure of C12orf50 has two beta-sheets near the start of the protein, within the zinc finger domain, and a helix spanning amino acids 106 to 124.[12] These elements are conserved across mammalian orthologs. The structure also contains a large number of coiled regions. A negatively charged cluster (acidic domain) spans amino acids 87 to 111, lying just before and at the beginning of the helix.[13] At the gene level, the promoter, 5’ UTR, and 3’ UTR are very well conserved.

Localization

Subcellular localization prediction gives C12orf50 a 47.8% probability of being in the nucleus and a 30.4% probability of being in the cytoplasm.[14] This is supported by immunohistochemistry and immunofluorescence data from Sigma-Aldrich, which show positivity in both the nucleus and the cytoplasm.[15] The protein has a nuclear localization signal and an acidic domain. Results for the orthologs are consistent with localization in the nucleus and cytoplasm.

Protein interactions

Two proteins, GAPDHS and GOLGA2, interact with C12orf50. Glyceraldehyde-3-phosphate dehydrogenase, spermatogenic (GAPDHS) is an enzyme that may play an important role in regulating the switch between different energy-producing pathways, and it is required for sperm motility and male fertility.[16]

Post-translational modifications

Cartoon of the C12orf50 protein with post-translational modifications

C12orf50 is predicted to undergo phosphorylation, C-mannosylation, and O-glycosylation. The predicted phosphorylation sites are amino acids 262, 349, and 370. The predicted O-glycosylation sites are amino acids 139, 238, and 374. The predicted C-mannosylation sites are amino acids 13, 102, 292, and 388.
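
The predicted sites can be cross-referenced against the residue ranges given in the Protein and Structure sections. The following Python sketch does only that bookkeeping. The feature labels and the helper name are illustrative, the zinc finger range of 1 to 44 is read from the statement that the domain runs from the start of the protein to the 44th amino acid, and the overlaps it reports say nothing about biological significance.

```python
# Predicted modification sites (residue numbers) as listed in the text.
PTM_SITES = {
    "phosphorylation": [262, 349, 370],
    "O-glycosylation": [139, 238, 374],
    "C-mannosylation": [13, 102, 292, 388],
}

# Features with inclusive residue ranges, taken from the Protein and
# Structure sections of this article. Labels are illustrative.
FEATURES = {
    "zinc finger domain": (1, 44),
    "acidic cluster": (87, 111),
    "helix": (106, 124),
    "disordered region 1": (136, 168),
    "disordered region 2": (297, 333),
    "disordered region 3": (346, 414),
}


def features_at(position: int):
    """Return the names of all features whose range contains the residue."""
    return [
        name
        for name, (start, end) in FEATURES.items()
        if start <= position <= end
    ]


for modification, sites in PTM_SITES.items():
    for site in sites:
        hits = features_at(site) or ["no annotated feature"]
        print(f"{modification} at {site}: {', '.join(hits)}")
```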

Evolution

C12orf50 evolves at a rate close to that of fibrinogen alpha, which is relatively fast. Orthologs of C12orf50 have been found in mammals, reptiles, birds, and caecilian amphibians. No orthologs were found in frogs, fish, invertebrates, or fungi. The mammalian orthologs are the most similar to the human protein, with the exception of the platypus, and diverged from humans between 6.4 and 180 million years ago. The reptilian orthologs are the next most similar and diverged around 318 million years ago, as did the birds. The caecilian orthologs are the least similar and diverged around 351.7 million years ago.

Homology

An unrooted phylogenetic tree of C12orf50 showing evolutionary descent from a common ancestor. Mammals have the shortest branches because their orthologs diverged from humans most recently. Reptiles and birds have medium-length branches, and caecilians have the longest branch because they diverged from humans earliest.

Orthologs

C12orf50 has orthologs in mammals, birds, reptiles, and caecilian amphibians. No orthologs were found in frogs, invertebrates, plants, fungi, or yeast. The table below lists some of the orthologs that can be found with BLAST.[17]

Species | Common name | NCBI accession | Sequence identity | Sequence similarity | Length (aa)
Homo sapiens | Human | NP_689802.1 | 100% | 100% | 414
Pan paniscus | Bonobo | XP_003828373.1 | 98.8% | 99.3% | 414
Phoca vitulina | Harbor seal | XP_032272023.1 | 87.7% | 93.0% | 415
Lipotes vexillifer | Baiji dolphin | XP_007449184.1 | 86.3% | 92.3% | 415
Gavialis gangeticus | Gharial | XP_019370370.1 | 47.1% | 63.5% | 378
Gopherus gangeticus | Tortoises | XP_030404453.1 | 47.1% | 61.3% | 414
Alligator sinensis | Chinese alligator | XP_006027976.1 | 44.2% | 60.1% | 363
Gallus gallus | Chicken | XP_040518234.1 | 38.5% | 52.3% | 405
Coturnix japonica | Japanese quail | XP_032299702.1 | 36.8% | 50.9% | 403
Geotrypetes seraphini | Gaboon caecilian | XP_033807710.1 | 39.7% | 55.3% | 409
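
Sequence identity in a table like this one is the share of aligned positions at which two sequences carry the same residue. A minimal Python sketch on a pre-aligned pair follows. It assumes gaps are written as "-" and that the denominator is the total number of alignment columns; BLAST and other tools may use a different denominator, so this will not necessarily reproduce the percentages above. The toy sequences are hypothetical and are not C12orf50 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences have the same residue.

    Both inputs must already be aligned (equal length, gaps as '-').
    Columns containing a gap never count as identical.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 5 of 7 columns are identical.
print(f"{percent_identity('MKCA-LH', 'MRCAGLH'):.1f}%")  # prints: 71.4%
```

Sequence similarity is computed the same way, except that substitutions between chemically similar residues also count as matches, which is why the similarity column is never lower than the identity column.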

Paralogs

C12orf50 has two paralogs: ZC3H11A and ZC3H11B. The zinc finger domain is conserved in both paralogs.

Multiple sequence alignment of human protein C12orf50 and its paralogs: ZC3H11A and ZC3H11B
Gene | NCBI accession | Sequence similarity | Length (aa)
C12orf50 | NP_689802.1 | 100% | 414
ZC3H11A | NP_001306167.1 | 16.2% | 810
ZC3H11B | NP_001342386.1 | 15.9% | 805

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000165805 - Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000056912 - Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ "AceView: Gene:C12orf50, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs - AceView". www.ncbi.nlm.nih.gov.
  6. ^ a b "C12orf50 chromosome 12 open reading frame 50 [Homo sapiens (human)]". www.ncbi.nlm.nih.gov. National Center for Biotechnology Information.
  7. ^ Fagerberg, Linn; Hallström, Björn M.; Oksvold, Per; Kampf, Caroline; Djureinovic, Dijana; Odeberg, Jacob; Habuka, Masato; Tahmasebpoor, Simin; Danielsson, Angelika; Edlund, Karolina; Asplund, Anna; Sjöstedt, Evelina; Lundberg, Emma; Szigyarto, Cristina Al-Khalili; Skogs, Marie; Takanen, Jenny Ottosson; Berling, Holger; Tegel, Hanna; Mulder, Jan; Nilsson, Peter; Schwenk, Jochen M.; Lindskog, Cecilia; Danielsson, Frida; Mardinoglu, Adil; Sivertsson, Åsa; von Feilitzen, Kalle; Forsberg, Mattias; Zwahlen, Martin; Olsson, IngMarie; Navani, Sanjay; Huss, Mikael; Nielsen, Jens; Ponten, Fredrik; Uhlén, Mathias (February 2014). "Analysis of the Human Tissue-specific Expression by Genome-wide Integration of Transcriptomics and Antibody-based Proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi:10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  8. ^ "AceView: Gene:C12orf50, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs - AceView". www.ncbi.nlm.nih.gov.
  9. ^ "RecName: Full=Uncharacterized Protein C12orf50". www.ncbi.nlm.nih.gov. National Center for Biotechnology Information.
  10. ^ "C12orf50 chromosome 12 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  11. ^ "C12orf50 - Uncharacterized protein C12orf50 - Homo sapiens (Human) - C12orf50 gene & protein". www.uniprot.org.
  12. ^ "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk.
  13. ^ "SAPS Results". www.ebi.ac.uk.
  14. ^ https://psort.hgc.jp/cgi-bin/runpsort.pl. [permanent dead link]
  15. ^ "C12orf50 antibody".
  16. ^ "GAPDHS glyceraldehyde-3-phosphate dehydrogenase, spermatogenic [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  17. ^ "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov.