C5orf34
C5orf34 (chromosome 5 open reading frame 34) is a protein that in humans is encoded by the C5orf34 gene (5p12).[1][2]
C5orf34 is conserved in mammals, birds and reptiles, with the most distant ortholog found in the Burmese python, Python bivittatus. The C5orf34 protein contains two domains that are conserved in mammals: DUF 4520 and DUF 4524. The protein is also predicted to have a polo-box domain (PBD) of polo-like kinase 4 (plk4), which is predicted to be conserved in distant orthologs from the clade Aves.[3][4]
Gene
C5orf34 is located on the negative DNA strand of the short arm of chromosome 5 at locus 5p12. The gene is 28,744 base pairs long, spanning base pairs 43,486,701 to 43,515,445. It produces a single transcript 2,540 base pairs long, which encodes a protein of 638 amino acids.[1][2][6]
Gene neighborhood
The gene PAIP1 is found on the negative strand just upstream of C5orf34 and is a member of the polyadenylate-binding family. PAIP1 extends from base pairs 43,526,267 to 43,557,419.[7] CCL28 is found downstream on the negative strand and extends from base pairs 43,378,052 to 43,413,837.[8]
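The coordinates quoted in this section can be checked with simple arithmetic. The following is a minimal sketch rather than the output of any cited tool: the `Gene` class and `relative_position` helper are illustrative names, and the code assumes the usual convention that "upstream" means toward the 5′ end of the reference gene, so that for a negative-strand gene higher coordinates lie upstream.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # lower genomic coordinate (bp)
    end: int     # higher genomic coordinate (bp)
    strand: str  # "+" or "-"

    @property
    def span(self) -> int:
        # Difference between the two quoted endpoints.
        return self.end - self.start


def relative_position(ref: Gene, other: Gene) -> str:
    """Position of `other` relative to `ref`, read 5'->3' along ref's strand."""
    if other.start > ref.end:
        return "downstream" if ref.strand == "+" else "upstream"
    if other.end < ref.start:
        return "upstream" if ref.strand == "+" else "downstream"
    return "overlapping"


# Coordinates as quoted in the text above.
c5orf34 = Gene("C5orf34", 43_486_701, 43_515_445, "-")
paip1 = Gene("PAIP1", 43_526_267, 43_557_419, "-")
ccl28 = Gene("CCL28", 43_378_052, 43_413_837, "-")

print(c5orf34.span)                        # 28744
print(relative_position(c5orf34, paip1))   # upstream
print(relative_position(c5orf34, ccl28))   # downstream
```

Only the strand of the reference gene matters for this classification; the strands of the neighbouring genes are recorded but not used.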
Gene expression
Multiple sources suggest that, in humans, the C5orf34 protein is expressed non-ubiquitously, in select tissues at low to moderate levels, with the most abundant expression in the stomach, small intestine, testis, skeletal muscle and heart muscle.[9][10] A study of the effect of a Rho kinase inhibitor on primary cell lines also showed that C5orf34 is expressed in dermal fibroblasts from normal human tissue samples.[11]
Promoter
The promoter region for C5orf34 is predicted to lie between base pairs 43,515,079 and 43,515,773, spanning 695 base pairs.[12]
Protein
In humans, C5orf34 consists of 638 amino acids, has a molecular weight of 72.7 kDa and has an isoelectric point of 7.77.[1][13][14]
Function
Although the precise function of C5orf34 in humans remains unknown, structural predictions suggest that it is involved in kinase-related cellular functions.[15] In addition, C5orf34 is predicted to be nuclear, so it may be involved in gene regulation and cell proliferation, two primary signal transduction pathways that involve nuclear kinase proteins.[16][17]
Structure
In humans, C5orf34 contains two domains of unknown function, DUF 4520 (pfam 15016) and DUF 4524 (pfam 150125), found between residues 6–153 and 444–539, respectively. The protein is rich in serine and threonine. Its charge is evenly distributed: there are no positive or negative charge clusters within the protein.[13]
The predicted secondary structure of the human protein was assessed with multiple bioinformatic tools. All of the programs predicted the structure to consist of alpha helices, extended strands, random coils and beta turns. The Phyre2 server produced a structural model of the human protein that matched the polo-box domain of the serine/threonine-protein kinase plk4. The model covered 20% of the protein (130 residues) with 96.8% confidence. The covered region included residues of the conserved polo-box domain and of the two DUF domains. The protein is predicted to be predominantly soluble, with an average hydrophobicity of -0.478.[15][18][19]
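An average hydrophobicity of this kind is typically obtained by averaging a per-residue hydropathy scale over the sequence, with negative values indicating a hydrophilic protein. The sketch below assumes the Kyte-Doolittle scale, which the cited sources do not name, and uses a made-up five-residue peptide rather than the C5orf34 sequence; `gravy` is an illustrative function name. It also reproduces the coverage arithmetic for 130 of 638 residues.

```python
# Kyte-Doolittle hydropathy values (assumed scale; not stated in the sources).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of a protein sequence (grand average of hydropathy)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence.upper()) / len(sequence)


# Hypothetical toy peptide, for illustration only.
toy_peptide = "MSTLK"
print(round(gravy(toy_peptide), 3))   # 0.06

# Coverage of the Phyre2 model: 130 modelled residues out of 638.
coverage_pct = 130 / 638 * 100
print(round(coverage_pct, 1))         # 20.4
```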
Post-translational modifications
C5orf34 is predicted to be extensively phosphorylated, with 32 phosphoserines and 7 phosphothreonines conserved in orthologs of the human protein. This analysis indicates that C5orf34 is a phosphoprotein and supports the structural predictions that it is a kinase protein. The protein contains only one predicted nuclear export signal (NES) residue, a leucine at position 481; however, its NES score was low, at 0.515. Structural analysis of the protein indicated that it is localized to the nucleus with 87% probability.[17][20][21]
Interacting proteins
Databases of protein interactions (MINT, STRING, IntAct, and BioGRID) have not identified any interactions with C5orf34.
Homology and evolution
C5orf34 is highly conserved in primates and other mammals and moderately conserved in reptiles. The most distant conserved ortholog is found in Python bivittatus, the Burmese python. Below is a selected list of orthologs illustrating the homology of this gene relative to the reference sequence in Homo sapiens.
Orthologous space
151 organisms are predicted to have orthologs of C5orf34.[2] The most distant ortholog is found in the Burmese python, which diverged from humans 324.5 million years ago, indicating that C5orf34 had arisen by the time the lineage of reptiles and birds diverged from that of mammals.[3][22]
Table of C5orf34 orthologs
Scientific Name | Common Name | Date of Divergence from Humans (MYA)[23] | NCBI Protein Accession # | Protein Length (amino acids) | Sequence Similarity (%) |
---|---|---|---|---|---|
Homo sapiens | Human | 0 | NP_001076895.1 | 638 | 100 |
Gorilla gorilla | Gorilla | 8.8 | XP_004058945.1 | 636 | 92 |
Camelus ferus | Bactrian Camel | 97.4 | XP_006191979.1 | 640 | 84 |
Panthera tigris altaica | Siberian Tiger | 97.4 | XP_007095478.1 | 638 | 83 |
Sus scrofa | Wild Boar | 97.4 | XP_003133971.3 | 441 | 80 |
Bos taurus | Cattle | 97.4 | NP_001076895.1 | 638 | 80 |
Erinaceus europaeus | European Hedgehog | 97.4 | XP_007517686.1 | 632 | 69 |
Mus musculus | House Mouse | 91 | BAE28742.1 | 382 | 75 |
Monodelphis domestica | Gray Short-tailed Opossum | 176.1 | XP_007487459.1 | 512 | 62 |
Chelonia mydas | Green Turtle | 324.5 | XP_007052886.1 | 638 | 51 |
Aptenodytes forsteri | Emperor Penguin | 324.5 | XP_009272830.1 | 647 | 48 |
Gallus gallus | Chicken | 324.5 | XP_424782.3 | 669 | 48 |
Python bivittatus | Burmese python | 324.5 | XP_007430528.1 | 649 | 46 |
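The relationship between divergence time and sequence similarity in the table can be summarised by grouping the orthologs by their date of divergence. The following is a small illustrative sketch using only the values listed above; the variable names are arbitrary, and the grouping is a convenience for reading the table, not an analysis taken from the cited sources.

```python
from collections import defaultdict

# (common name, divergence from humans in MYA, sequence similarity in %),
# copied from the table above.
orthologs = [
    ("Human", 0, 100),
    ("Gorilla", 8.8, 92),
    ("Bactrian Camel", 97.4, 84),
    ("Siberian Tiger", 97.4, 83),
    ("Wild Boar", 97.4, 80),
    ("Cattle", 97.4, 80),
    ("European Hedgehog", 97.4, 69),
    ("House Mouse", 91, 75),
    ("Gray Short-tailed Opossum", 176.1, 62),
    ("Green Turtle", 324.5, 51),
    ("Emperor Penguin", 324.5, 48),
    ("Chicken", 324.5, 48),
    ("Burmese python", 324.5, 46),
]

# Collect similarity values for each divergence date.
by_divergence = defaultdict(list)
for _name, mya, similarity in orthologs:
    by_divergence[mya].append(similarity)

# Mean similarity per divergence date, ordered from most recent to oldest.
mean_similarity = {
    mya: sum(values) / len(values)
    for mya, values in sorted(by_divergence.items())
}

for mya, mean in mean_similarity.items():
    print(f"{mya:>6} MYA: {mean:.2f}% mean similarity")
```

In this selection the mean similarity falls from 100% for the human reference to 48.25% for the four species that diverged 324.5 million years ago, although the trend is not strictly monotonic (75% at 91 MYA against 79.2% at 97.4 MYA).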
Paralogous space
There are no predicted paralogs of C5orf34 in either humans or mice.[3]
Conserved regions
Multiple sequence alignments indicated amino acid residue conservation throughout the C5orf34 protein in an array of orthologs, with the most highly conserved regions at the N-terminus and C-terminus, where the DUF domains are located. DUF 4520 (pfam 15016) was found to be conserved in the N-terminus and DUF 4524 (pfam 150125) in the C-terminus, consistent with their positions at residues 6–153 and 444–539. The polo-box domain of plk4 was also found to be conserved in the C-terminus in multiple sequence alignments of both strict and distant orthologs.[22]
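Conserved regions are commonly identified from a multiple sequence alignment by scoring each column for residue agreement. The sketch below shows one simple way to do this, assuming a score defined as the fraction of sequences that share the most common residue in a column, with gaps counted as non-matching. The sequences are invented for illustration and are not C5orf34 orthologs, and `column_conservation` is a hypothetical helper, not a function of CLUSTALW.

```python
from collections import Counter


def column_conservation(alignment: list[str]) -> list[float]:
    """Fraction of sequences sharing the most common residue in each column.

    Gap characters ("-") never count towards the most common residue.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("aligned sequences must all have the same length")

    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # column consists only of gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical toy alignment (four sequences, seven columns).
toy_alignment = [
    "MKT-LSE",
    "MKTALSD",
    "MRT-LSE",
    "MKSALTE",
]

scores = column_conservation(toy_alignment)
print(scores)  # [1.0, 0.75, 0.75, 0.5, 1.0, 0.75, 0.75]

# Columns where every sequence carries the same residue.
fully_conserved = [i for i, s in enumerate(scores) if s == 1.0]
print(fully_conserved)  # [0, 4]
```

Runs of high-scoring columns at either end of a real alignment would correspond to the conserved N-terminal and C-terminal regions described above.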
References
1. "NCBI Protein". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
2. "NCBI Gene". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
3. "NCBI Blast". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
4. Sillibourne, James E.; Bornens, Michel (2010-09-29). "Polo-like kinase 4: the odd one out of the family". Cell Division. 5 (1): 25. doi:10.1186/1747-1028-5-25. ISSN 1747-1028. PMC 2955731. PMID 20920249.
5. Castro, Edouard. "PROSITE". prosite.expasy.org. Retrieved 2015-05-10.
6. "Ensembl Genome Browser". www.ensembl.org. Retrieved 2015-05-09.
7. "NCBI Gene". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
8. "NCBI Gene". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
9. "Tissue expression of C5orf34 - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2015-05-09.
10. "NCBI GeoProfile". www.ncbi.nlm.nih.gov. Retrieved 2015-05-09.
11. Boerma, Marjan; Fu, Qiang; Wang, Junru; Loose, David S.; Bartolozzi, Alessandra; Ellis, James L.; McGonigle, Sharon; Paradise, Elsa; Sweetnam, Paul; Fink, Louis M.; Vozenin-Brotons, Marie-Catherine; Hauer-Jensen, Martin (2008). "Comparative gene expression profiling in three primary human cell lines after treatment with a novel inhibitor of Rho kinase or atorvastatin". Blood Coagulation & Fibrinolysis. 19 (7): 709–718. doi:10.1097/MBC.0b013e32830b2891. PMC 2713681. PMID 18832915.
12. "Genomatix: Annotation & Analysis". www.genomatix.de. Retrieved 2015-05-09.
13. Subramaniam, Shankar. "Statistical Analysis of PS (SAPS)". Biology Workbench. Retrieved 2015-05-05.[permanent dead link]
14. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2015-05-09.
15. "Phyre Investigator output for C5orf34__ with c1umwB_". www.sbg.bio.ic.ac.uk. Retrieved 2015-05-09.[permanent dead link]
16. Matthews, Harry R.; Huebner, Verena D. (1984-03-01). "Nuclear protein kinases". Molecular and Cellular Biochemistry. 59 (1–2): 81–99. doi:10.1007/BF00231306. ISSN 0300-8177. PMID 6323962. S2CID 25765323.
17. "PSORT II server". www.genscript.com. Archived from the original on 2021-07-09. Retrieved 2015-05-09.
18. UCBL, Institut. "NPS@ : SOPMA secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2015-05-09.
19. Sobhani, Armin. "PELE - Protein Energy Landscape Exploration - Web Server". pele.bsc.es. Retrieved 2015-05-09.
20. "NetPhos 2.0 Server". www.cbs.dtu.dk. Retrieved 2015-05-09.
21. "NetNES 1.1 Server". www.cbs.dtu.dk. Retrieved 2015-05-09.
22. Subramaniam, Shankar. "CLUSTALW". SDSC. 2015-05-05.[permanent dead link]
23. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2015-05-10.