C8orf58

Identifiers
Aliases: C8orf58, chromosome 8 open reading frame 58
External IDs: MGI: 2145726; HomoloGene: 19540; GeneCards: C8orf58

Orthologs
Species: Human; Mouse
Ensembl: ENSG00000241852 (human)[1]; ENSMUSG00000044551 (mouse)[2]
RefSeq (mRNA): NM_173686, NM_001013842, NM_001198827 (human); NM_001004155, NM_001112735 (mouse)
RefSeq (protein): NP_001013864, NP_001185756, NP_775957 (human); NP_001004155, NP_001106206 (mouse)
Location (UCSC): Chr 8: 22.6 – 22.6 Mb (human); Chr 14: 70.39 – 70.4 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Chromosome 8 open reading frame 58 is an uncharacterised protein that in humans is encoded by the C8orf58 gene.[5] The protein is predicted to be localized in the nucleus.

Gene

The C8orf58 gene is located on chromosome 8 at position 8p21.3. It spans 4,550 base pairs and contains seven exons. C8orf58 is flanked by the genes PDLIM2 and CCAR2.[6] It has no aliases and is classified as a protein-coding gene.[7]

mRNA

C8orf58 produces three transcript splice variants. Variant 1 is the longest transcript and encodes the largest protein; it is 2,062 base pairs long and contains seven exons. The two other variants arise from alternative splice sites.[8]

| Isoform | Exons | Length (base pairs) | Features |
|---|---|---|---|
| Transcript Variant 1 | 1, 2, 3, 4, 5, 6, 7 | 2062 | One upstream in-frame stop codon. |
| Transcript Variant 2 | 1, 2, 3, 4, 5, 6, 7 | 2038 | Alternate in-frame splice site in the 3' coding region. |
| Transcript Variant 3 | 1, 2, 3, 4, 5, 6 | 1955 | Lacks an alternate exon, resulting in a frameshift in the 3' coding region. |

The C8orf58 transcript has a relatively short 5' region and a moderate-length 3' region, both of which contain stem loops.[9] One predicted miRNA binding site is found in the 3' UTR of C8orf58.[10]

Protein

C8orf58 protein isoform 1 is 365 amino acids long. Isoforms 2 and 3 are 357 and 300 amino acids long, respectively. A Kozak consensus sequence is present, which is consistent with a protein-coding sequence.[11]
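The transcript and isoform lengths reported in this article can be cross-checked against each other. The following is a minimal sketch in Python, not part of the cited analyses; it assumes that the length difference between two transcript variants lies entirely within the coding region, which the sources do not state. The function name `compare_to_variant_1` is introduced here for illustration only.

```python
# Lengths taken from this article: transcript variants (bp) and isoforms (aa).
transcript_bp = {1: 2062, 2: 2038, 3: 1955}
isoform_aa = {1: 365, 2: 357, 3: 300}


def compare_to_variant_1(variant):
    """Return (bp difference, True if a multiple of 3, aa difference) vs. variant 1."""
    bp_diff = transcript_bp[1] - transcript_bp[variant]
    aa_diff = isoform_aa[1] - isoform_aa[variant]
    # A difference that is a multiple of 3 removes whole codons and keeps the
    # reading frame; any other difference shifts the frame.
    return bp_diff, bp_diff % 3 == 0, aa_diff


for variant in (2, 3):
    bp_diff, in_frame, aa_diff = compare_to_variant_1(variant)
    print(f"variant {variant}: {bp_diff} bp shorter, "
          f"multiple of 3: {in_frame}, isoform {aa_diff} aa shorter")
```

Under that assumption, variant 2 is 24 base pairs (8 codons) shorter than variant 1, matching the 8-residue difference between isoforms 1 and 2 and the in-frame splice site described for it. Variant 3 is 107 base pairs shorter, which is not a multiple of 3 and is consistent with the reported frameshift.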

C8orf58 isoform 1 has a molecular weight of 39.7 kDa and an isoelectric point of approximately 8.3. It is rich in proline and arginine and poor in isoleucine, asparagine, phenylalanine, and tyrosine.[12]

The predicted secondary structure of the C8orf58 protein includes multiple alpha helices and one beta strand.[12][13]

| Isoform | From mRNA Variant | Length (amino acids) | Molecular Weight (kDa) | Isoelectric Point |
|---|---|---|---|---|
| 1 | 1 | 365 | 39.7 | 8.30 |
| 2 | 2 | 357 | 38.6 | 8.30 |
| 3 | 3 | 300 | 32.0 | 5.82 |
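As a rough plausibility check on the values above, the molecular weight of each isoform can be divided by its length to give a mean mass per residue. This is an illustrative sketch only; the comparison value of roughly 110 Da per residue is a common rule of thumb for proteins and does not come from the cited sources, and `mean_residue_mass` is a name chosen for this example.

```python
# (length in amino acids, molecular weight in kDa) for each isoform,
# copied from the isoform table in this article.
isoforms = {1: (365, 39.7), 2: (357, 38.6), 3: (300, 32.0)}


def mean_residue_mass(length_aa, weight_kda):
    """Mean mass per residue in daltons."""
    return weight_kda * 1000 / length_aa


for number, (length_aa, weight_kda) in isoforms.items():
    mass = mean_residue_mass(length_aa, weight_kda)
    print(f"isoform {number}: {mass:.1f} Da per residue")
```

All three isoforms come out between about 106 and 109 Da per residue, close to the rule-of-thumb figure, so the reported lengths and weights are mutually consistent.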

Evolutionary history

C8orf58 is part of the DUF4657 family, a family of proteins found in eukaryotes. Proteins in this family are typically between 305 and 370 amino acids in length.[14] The Domain of Unknown Function (DUF) of C8orf58 spans amino acids 73 to 364.
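The share of the protein covered by this domain follows directly from the coordinates. The sketch below assumes that the coordinates are inclusive and refer to the 365-residue isoform 1, neither of which is stated explicitly in the source; `domain_coverage` is a hypothetical helper.

```python
def domain_coverage(start, end, protein_length):
    """Return (domain length, fraction of the protein), treating both ends as inclusive."""
    domain_length = end - start + 1
    return domain_length, domain_length / protein_length


# DUF4657 coordinates from the text; isoform 1 length assumed as the reference.
length, fraction = domain_coverage(73, 364, 365)
print(f"{length} residues, {fraction:.0%} of isoform 1")
```

Under these assumptions the domain covers 292 residues, or about four fifths of isoform 1.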

Expression

According to the NCBI GEO profiles, C8orf58 is a narrowly expressed protein found in spleen, lung, thymus, prostate, and spinal cord tissue. It is constitutively expressed in these tissues.[15]

Post-translational modification

Bioinformatic tools on Expasy were used to identify potential post-translational modification sites in the C8orf58 protein. They predict two phosphorylation sites and one sumoylation site.[16]

Subcellular localization

PSORT II predicts that C8orf58 localizes to the nucleus. This is consistent with the presence of a predicted sumoylation site, as sumoylation is involved in nucleocytoplasmic transport.

Interacting proteins

Two proteins have been found to interact with C8orf58: CENPH, identified by two-hybrid assay, and metG1, identified by the two-hybrid pooling approach.[17] CENPH (centromere protein H) plays a critical role in centromere structure, kinetochore formation, and sister chromatid separation.[18] MetG1 (methionine-tRNA ligase) is required for elongation during protein synthesis and for the initiation of all mRNA translation through aminoacylation of the initiator tRNA(fMet).[19]

Homology

An important paralog of this gene is ENSG00000248235.[20] Orthologs of the human C8orf58 gene are limited to vertebrates.

| Scientific Name | Common Name | NCBI Accession Number | Length (amino acids) | Date of Divergence (MYA) | Identity (%) | Similarity (%) |
|---|---|---|---|---|---|---|
| Homo sapiens | Human | NP_001013864.1 | 365 | - | - | - |
| Gorilla gorilla | Gorilla | XP_004046807.1 | 439 | 9.06 | 96 | 79.50 |
| Marmota marmota | Alpine Marmot | XP_015354979.1 | 369 | 90 | 68 | 75.7 |
| Oryctolagus cuniculus | European Rabbit | XP_008248092.1 | 371 | 90 | 66 | 72 |
| Nannospalax galili | Spalax | XP_008848689.1 | 362 | 90 | 65 | 74.7 |
| Ceratotherium simum simum | White Rhinoceros | XP_014652157.1 | 381 | 96 | 66 | 72.7 |
| Odobenus rosmarus divergens | Pacific walrus | XP_012418498.1 | 388 | 96 | 65 | 74.7 |
| Sus scrofa | Wild Boar | XP_005670472.1 | 382 | 96 | 65 | 73.3 |
| Hipposideros armiger | Great Roundleaf Bat | XP_019487131.1 | 387 | 96 | 62 | 71 |
| Eptesicus fuscus | Big Brown Bat | XP_008149784.1 | 377 | 96 | 62 | 70.1 |
| Loxodonta africana | African Bush Elephant | XP_003412428.1 | 372 | 105 | 71 | 77.2 |
| Orycteropus afer afer | Aardvark | XP_007949039.1 | 370 | 105 | 65 | 71.7 |
| Parus major | Great Tit | XP_015504136.1 | 320 | 312 | 32 | 35.6 |
| Anolis carolinensis | Carolina Anole | XP_008118367.1 | 453 | 312 | 28 | 38.9 |
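The relationship between divergence date and sequence identity in the ortholog table can be summarized by averaging identity within each divergence group. This is a minimal sketch using only the values listed above; it is not the method used to build the table, and `mean_identity_by_divergence` is a name introduced for this example.

```python
from collections import defaultdict

# (common name, date of divergence in MYA, identity %) copied from the table.
orthologs = [
    ("Gorilla", 9.06, 96),
    ("Alpine Marmot", 90, 68),
    ("European Rabbit", 90, 66),
    ("Spalax", 90, 65),
    ("White Rhinoceros", 96, 66),
    ("Pacific walrus", 96, 65),
    ("Wild Boar", 96, 65),
    ("Great Roundleaf Bat", 96, 62),
    ("Big Brown Bat", 96, 62),
    ("African Bush Elephant", 105, 71),
    ("Aardvark", 105, 65),
    ("Great Tit", 312, 32),
    ("Carolina Anole", 312, 28),
]


def mean_identity_by_divergence(rows):
    """Group rows by divergence date and return the mean identity per group."""
    groups = defaultdict(list)
    for _name, mya, identity in rows:
        groups[mya].append(identity)
    return {mya: sum(values) / len(values) for mya, values in sorted(groups.items())}


for mya, mean_identity in mean_identity_by_divergence(orthologs).items():
    print(f"{mya} MYA: mean identity {mean_identity:.1f}%")
```

Mean identity falls from 96% for the gorilla to about 64 to 68% among the other mammals and to 30% for the bird and reptile orthologs. The decline is not strictly monotonic: the 105 MYA group averages slightly higher than the 96 MYA group.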

References

  1. GRCh38: Ensembl release 89: ENSG00000241852. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000044551. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Entrez Gene: Chromosome 8 open reading frame 58". Retrieved 2017-11-22.
  6. NCBI Nucleotide. Homo sapiens chromosome 8 open reading frame 58 (C8orf58), transcript variant 1, mRNA.
  7. GeneCard. C8orf58 Gene (Protein Coding) Chromosome 8 Open Reading Frame 58.
  8. NCBI Gene. C8orf58 chromosome 8 open reading frame 58 [Homo sapiens (human)].
  9. RNA Folding Form
  10. TargetScan Human
  11. NCBI Protein. Uncharacterized protein C8orf58 isoform 1 [Homo sapiens].
  12. SDSC Biology Workbench
  13. Chou-Fasman Secondary Structure Prediction Server
  14. UniProtKB - Q8NAV2 (CH058_HUMAN). UniProt
  15. NCBI GEO Profiles
  16. Expasy Bioinformatics Resource Portal
  17. IntAct Molecular Interaction Database
  18. Centromere protein H
  19. Methionine--tRNA ligase
  20. GeneCard. C8orf58 Gene (Protein Coding) Chromosome 8 Open Reading Frame 58.