
C5orf46

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Aseleste (talk | contribs) at 02:12, 26 January 2021 (Importing Wikidata short description: "Protein-coding gene in the species Homo sapiens" (Shortdesc helper)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

C5orf46
Identifiers
Aliases: C5orf46, SSSP1, chromosome 5 open reading frame 46, AP-64
External IDs: MGI: 2684940; HomoloGene: 19192; GeneCards: C5orf46; OMA: C5orf46 - orthologs

Orthologs
| Species | Human | Mouse |
|---|---|---|
| Ensembl | ENSG00000178776[1] | ENSMUSG00000071858[2] |
| RefSeq (mRNA) | NM_206966 | NM_001033280 |
| RefSeq (protein) | NP_996849 | NP_001028452 |
| Location (UCSC) | Chr 5: 147.88 – 147.91 Mb | Chr 18: 43.91 – 43.93 Mb |
| PubMed search | [3] | [4] |

C5orf46 is a protein-coding gene located on chromosome 5 in humans. It is also known as SSSP1, or skin and saliva secreted protein 1. Two isoforms are known in humans; isoform 2, the longer of the two, is the one analyzed throughout this page. The encoded protein is predicted to have one transmembrane domain, a molecular weight of 9,692 Da, and a basal isoelectric point of 4.67.[5]

Gene

C5orf46 is found on the minus strand of chromosome 5. Isoform X2 is 4,679 nucleotides in length and has four exons.

Evolution and orthologs

C5orf46 orthologs are found only in Chordata. The most distantly related ortholog identified is in the platypus (Ornithorhynchus anatinus), a lineage that diverged from that of humans around 177 million years ago.[6] Highly conserved regions include the signal peptide sequence found towards the N-terminus of the protein. No paralogs are found in humans.

Table 1: C5orf46 orthologs

| Genus/species | Common name | Order | Accession number | Length (amino acids) | Sequence identity with human (%) | Sequence similarity with human (%) |
|---|---|---|---|---|---|---|
| Homo sapiens | Human | Primates | XP_005268503.2 | 102 | 100 | 100 |
| Pan paniscus | Bonobo | Primates | XP_003829110.1 | 87 | 98 | 98 |
| Octodon degus | Common degu | Rodentia | XP_004631400.1 | 73 | 68 | 82 |
| Mus musculus | House mouse | Rodentia | NP_001028452.1 | 93 | 68 | 77 |
| Urocitellus parryii | Arctic ground squirrel | Rodentia | XP_026240295.1 | 92 | 59 | 72 |
| Leptonychotes weddellii | Weddell seal | Carnivora | XP_006732700.1 | 124 | 78 | 94 |
| Acinonyx jubatus | Cheetah | Carnivora | XP_026897868.1 | 147 | 76 | 87 |
| Zalophus californianus | California sea lion | Carnivora | XP_027462230.1 | 84 | 75 | 86 |
| Sorex araneus | Common shrew | Eulipotyphla | XP_004618219.1 | 73 | 74 | 79 |
| Vicugna pacos | Alpaca | Artiodactyla | XP_006204536.1 | 88 | 73 | 81 |
| Delphinapterus leucas | Beluga whale | Artiodactyla | XP_030618631.1 | 88 | 73 | 81 |
| Camelus bactrianus | Bactrian camel | Artiodactyla | XP_010954370.1 | 83 | 73 | 79 |
| Orcinus orca | Killer whale | Artiodactyla | XP_004280428.1 | 88 | 71 | 83 |
| Sus scrofa | Wild boar | Artiodactyla | XP_003354397.2 | 90 | 67 | 82 |
| Manis javanica | Sunda pangolin | Pholidota | XP_017496222.1 | 155 | 62 | 79 |
| Myotis davidii | Bat | Chiroptera | XP_015428095.1 | 104 | 41 | 50 |
| Elephantulus edwardii | Cape elephant shrew | Macroscelidea | XP_006893786.1 | 76 | 78 | 80 |
| Dasypus novemcinctus | Nine-banded armadillo | Cingulata | XP_004447160.1 | 88 | 72 | 82 |
| Vombatus ursinus | Common wombat | Diprotodontia | XP_027697780.1 | 78 | 45 | 62 |
| Phascolarctos cinereus | Koala | Diprotodontia | XP_020854530.1 | 78 | 44 | 62 |
| Ornithorhynchus anatinus | Platypus | Monotremata | XP_028912384.1 | 80 | 43 | 61 |
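The divergence pattern in Table 1 can be summarized with a short script. The sketch below is illustrative only: it copies the identity and similarity values from the table (read here as percentages, which is an assumption), and the function names `mean_identity_by_order` and `least_identical` are hypothetical helpers, not part of any published analysis.

```python
from collections import defaultdict
from statistics import mean

# (common name, order, identity %, similarity %) copied from Table 1.
# The human reference row is left out because it is 100% by definition.
ORTHOLOGS = [
    ("Bonobo", "Primates", 98, 98),
    ("Common degu", "Rodentia", 68, 82),
    ("House mouse", "Rodentia", 68, 77),
    ("Arctic ground squirrel", "Rodentia", 59, 72),
    ("Weddell seal", "Carnivora", 78, 94),
    ("Cheetah", "Carnivora", 76, 87),
    ("California sea lion", "Carnivora", 75, 86),
    ("Common shrew", "Eulipotyphla", 74, 79),
    ("Alpaca", "Artiodactyla", 73, 81),
    ("Beluga whale", "Artiodactyla", 73, 81),
    ("Bactrian camel", "Artiodactyla", 73, 79),
    ("Killer whale", "Artiodactyla", 71, 83),
    ("Wild boar", "Artiodactyla", 67, 82),
    ("Sunda pangolin", "Pholidota", 62, 79),
    ("Bat", "Chiroptera", 41, 50),
    ("Cape elephant shrew", "Macroscelidea", 78, 80),
    ("Nine-banded armadillo", "Cingulata", 72, 82),
    ("Common wombat", "Diprotodontia", 45, 62),
    ("Koala", "Diprotodontia", 44, 62),
    ("Platypus", "Monotremata", 43, 61),
]


def mean_identity_by_order(rows):
    """Average sequence identity with human for each taxonomic order."""
    groups = defaultdict(list)
    for _name, order, identity, _similarity in rows:
        groups[order].append(identity)
    return {order: mean(values) for order, values in groups.items()}


def least_identical(rows, n=3):
    """The n orthologs with the lowest sequence identity with human."""
    return sorted(rows, key=lambda row: row[2])[:n]


if __name__ == "__main__":
    for order, value in sorted(mean_identity_by_order(ORTHOLOGS).items()):
        print(f"{order}: {value:.1f}")
    print([name for name, *_ in least_identical(ORTHOLOGS, 4)])
```

With these values, the four least identical sequences come from the bat, the platypus, and the two marsupials, which fits the text's description of the platypus as the most distant lineage, although the bat sequence scores slightly lower on identity.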

Promoters

A Genomatix ElDorado promoter database search predicted one promoter for C5orf46, with promoter ID GXP_123762 and transcript ID GXT_22785522. The promoter is located on the minus strand of chromosome 5 and is predicted to span nucleotides 147906451 to 147908007, making it 1557 nucleotides in length.
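The stated promoter length follows if both coordinates are treated as inclusive endpoints of a 1-based interval. The check below is a minimal sketch under that assumption; `inclusive_length` is an illustrative helper name, not a Genomatix function.

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based genomic interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates of the predicted promoter GXP_123762, as given in the text.
promoter_start = 147_906_451
promoter_end = 147_908_007

print(inclusive_length(promoter_start, promoter_end))  # 1557
```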

Transcription factors

A total of 428 transcription factor binding sites were predicted to be located within the predicted promoter sequence. The predictions included the following transcription factors:[7]

  • Sine oculis homeodomain factors (SIXF)
  • p53 tumor suppressor (P53F)
  • Vertebrate homologues of enhancer of split complex (HESF)
  • Histone nuclear factor P (HNFP)
  • NKX homeodomain factors (NKXH)
  • C/EBP homologous protein (CHOP)
  • X-box binding factors (XBBF)
  • TALE homeodomain class recognizing TG motifs (TALE)
  • Human and murine ETS1 factors (ETSF)
  • SWI/SNF related nucleophosphoproteins with a RING finger DNA binding motif (RUSH)
  • Fork head domain factors (FKHD)
  • RNA Polymerase II transcription factor II B (TF2B)
  • TGF-beta induced apoptosis proteins (TAIP)
  • GATA binding factors (GATA)
  • CCAAT/enhancer binding protein (CEBP)
  • cAMP-responsive element binding proteins (CREB)
  • Vertebrate TATA binding protein factor (VTBP)

Expression

C5orf46 is expressed mainly in salivary glands and skin tissue, though some expression is also observed in heart tissue, testis, and placenta.[8]

Figure: C5orf46 expression microarray data comparing healthy controls and psoriasis patients.[9]

Microarray data measuring C5orf46 expression in psoriasis patients showed that samples from lesional psoriasis patients had significantly lower C5orf46 expression than samples from non-lesional psoriasis patients and healthy controls.[9]

Protein

Primary structure

C5orf46 is 102 amino acids in length. The protein has a signal peptide sequence at its N-terminus, which is highly conserved in orthologs. The amino acid sequence also contains a repeated DDKPD motif and a region rich in aspartate and lysine.
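A repeated motif and a compositionally biased region of this kind can be located with a few lines of code. The sketch below is illustrative: `find_motif` and `residue_fraction` are hypothetical helper names, and `TOY_SEQUENCE` is a made-up placeholder string, not the real C5orf46 sequence, which is not reproduced on this page.

```python
import re


def find_motif(sequence: str, motif: str) -> list[int]:
    """Return 1-based start positions of every motif occurrence (overlaps allowed)."""
    # A zero-width lookahead lets finditer report overlapping matches too.
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(sequence)]


def residue_fraction(sequence: str, residues: str) -> float:
    """Fraction of positions occupied by any of the given one-letter residues."""
    return sum(sequence.count(residue) for residue in residues) / len(sequence)


# Hypothetical placeholder, NOT the real C5orf46 protein sequence.
TOY_SEQUENCE = "MAAADDKPDGGDDKPDKK"

print(find_motif(TOY_SEQUENCE, "DDKPD"))               # [5, 12]
print(round(residue_fraction(TOY_SEQUENCE, "DK"), 2))  # 0.56
```

Running the same functions on the actual isoform 2 sequence (for example, the one retrieved under accession XP_005268503.2) would give the real motif positions and the aspartate and lysine content.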

Secondary structure

Prediction software, including the Chou and Fasman Secondary Structure Prediction server and PRABI GOR IV analysis, predicts two alpha-helical segments.[10][11]

Tertiary structure

Predictive models made by Phyre2 and SWISS-MODEL show two alpha-helical domains with a bend between them.[12][13]

Protein regulation

C5orf46 has multiple predicted post-translational modification sites and one modification identified through mass spectrometry. Mass spectrometry analysis of extracts from an NCI-H2228 lung cancer cell line has identified an acetylation site at K42.[14] C5orf46 has predicted phosphorylation sites at T14, S52, S84, and S86.[15] Predicted sumoylation sites are present at K41, K44, K48, K54, and K57.[16] There are two predicted O-GlcNAcylation sites at S100 and S101.[17]
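The sites listed above can be kept in a small lookup structure, which makes it easy to check them against the 102-residue protein length and to see how the lysine modifications cluster. This is a sketch only: the dictionary layout and the helper names `parse_site`, `all_within_protein`, and `lysine_positions` are illustrative assumptions, not output from any of the cited tools.

```python
PROTEIN_LENGTH = 102  # isoform 2 length in amino acids, as stated on this page

# Site labels copied from the text; only K42 is experimentally observed.
PTM_SITES = {
    "acetylation (mass spectrometry)": ["K42"],
    "phosphorylation (predicted)": ["T14", "S52", "S84", "S86"],
    "sumoylation (predicted)": ["K41", "K44", "K48", "K54", "K57"],
    "O-GlcNAcylation (predicted)": ["S100", "S101"],
}


def parse_site(site: str) -> tuple[str, int]:
    """Split a label such as 'K42' into (residue letter, 1-based position)."""
    return site[0], int(site[1:])


def all_within_protein(sites_by_type: dict, length: int) -> bool:
    """True if every listed position falls inside the protein."""
    return all(
        1 <= parse_site(site)[1] <= length
        for sites in sites_by_type.values()
        for site in sites
    )


def lysine_positions(sites_by_type: dict) -> list[int]:
    """Sorted positions of every modification that targets a lysine."""
    positions = {
        parse_site(site)[1]
        for sites in sites_by_type.values()
        for site in sites
        if parse_site(site)[0] == "K"
    }
    return sorted(positions)


print(all_within_protein(PTM_SITES, PROTEIN_LENGTH))  # True
print(lysine_positions(PTM_SITES))  # [41, 42, 44, 48, 54, 57]
```

All lysine modifications fall between residues 41 and 57, so the observed acetylation site sits within the cluster of predicted sumoylation sites.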

Localization

An analysis of the C5orf46 amino acid sequence indicated that the protein is likely to be secreted.[18] Further sequence analyses predicted that the protein has one transmembrane domain, with an intracellular N-terminal domain.[19][20]

Interactions

Affinity purification-mass spectrometry experiments suggest that C5orf46 interacts with phosphopantothenoylcysteine synthetase (PPCS) and transmembrane BAX inhibitor motif containing 6 (TMBIM6).[21]

Clinical significance

C5orf46 has been reported as a prognostic marker in renal and cervical cancer, with high expression linked to unfavorable outcomes. These conclusions are based on Human Protein Atlas pathology gene expression analyses and the survival outcomes of 651 patients with renal cancer and 291 patients with cervical cancer.[22] In these analyses, patients classified as having high C5orf46 expression had a 50% lower survival rate after 10 years than patients with low expression.

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000178776. Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000071858. Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ "C5orf46 (human)". www.phosphosite.org. Retrieved 2020-03-02.
  6. ^ "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2020-03-02.
  7. ^ "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-05-03.
  8. ^ "C5orf46 chromosome 5 open reading frame 46 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  9. ^ a b "GDS4602 / 1554195_a_at". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  10. ^ "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-05-03.
  11. ^ "NPS@ : GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2020-05-03.
  12. ^ "PHYRE Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2020-05-03.
  13. ^ "SWISS-MODEL". swissmodel.expasy.org. Retrieved 2020-05-03.
  14. ^ "Lys42". www.phosphosite.org. Retrieved 2020-05-03.
  15. ^ "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2020-05-03.
  16. ^ "SUMOplot™ Analysis Program | Abcepta". www.abcepta.com. Retrieved 2020-05-03.
  17. ^ "YinOYang 1.2 Server". www.cbs.dtu.dk. Retrieved 2020-05-03.
  18. ^ "Welcome to psort.org!!". www.psort.org. Retrieved 2020-05-03.
  19. ^ "TMHMM Server, v. 2.0". www.cbs.dtu.dk. Retrieved 2020-05-03.
  20. ^ "Nagahama Institute of Bio-Science and Technology: on-campus access". ripple.nagahama-i-bio.ac.jp. Retrieved 2020-05-03.
  21. ^ "C5orf46 (UNQ472/PRO839) Result Summary | BioGRID". thebiogrid.org. Retrieved 2020-05-03.
  22. ^ "Expression of C5orf46 in cancer - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-03-02.