= Chromosome 5 open reading frame 15 =

Chromosome 5 open reading frame 15 (C5orf15) is a protein that in humans is encoded by the C5orf15 gene. It is also known as keratinocyte-associated transmembrane protein 2 precursor (KCT2) or HTGN29, but is most commonly referred to as C5orf15. The C5orf15 protein is a single-pass membrane protein.

== Gene ==
In the human genome, C5orf15 is located on the negative strand of chromosome 5 at 5q31.1. The gene spans 13,116 base pairs, includes three exons, and has one protein-coding sequence.

== mRNA ==
C5orf15 mRNA has a single sequence variant of 2,188 nucleotides. The 5' untranslated region (UTR) is 44 nucleotides long, which is short compared with the coding sequence and much shorter than the 3' UTR, which is 1,352 nucleotides long.

=== Conceptual translation ===
The conceptual translation shows the coding sequence and protein sequence of C5orf15, annotated with key regions, post-translational modifications, conserved amino acids, and variants.

== Protein ==
The C5orf15 protein (also referred to as keratinocyte-associated transmembrane protein 2 precursor) is composed of 265 amino acids and has one isoform in humans (Ref: NP_064584.1). Its function is not yet well understood. The protein contains phosphorylation sites, glycosylation sites, a transmembrane region, and two disordered regions.

=== Composition ===
The theoretical isoelectric point of C5orf15 is 4.93. Compositional analysis showed a lower proportion of glycine (G) in C5orf15 than in typical human proteins. A high-scoring transmembrane region was identified at residues 199–215, with a score of 54.0. A hydrophobic region was identified along with the transmembrane segments, with a score of 26.0. The C5orf15 protein has a molecular weight of approximately 29 kDa.
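As a rough check on the stated molecular weight, the mass of a protein can be estimated from its length. The sketch below assumes a rule-of-thumb average residue mass of about 110 Da; this value and the function name are illustrative assumptions, not the method used to obtain the figure above, and sequence-based tools give more precise values.

```python
# Rule-of-thumb estimate of protein mass from chain length.
# ASSUMPTION: an average amino acid residue weighs about 110 Da.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int,
                      avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Return an approximate protein mass in kilodaltons."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    # 265 is the C5orf15 protein length given in the article.
    print(f"{estimate_mass_kda(265):.2f} kDa")
```

For 265 residues this estimate comes out close to the approximately 29 kDa reported here.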

=== Structure ===
The predicted structure of C5orf15 is composed of many random coil regions and some alpha helices. The AlphaFold structure did not separate the extracellular and cytoplasmic regions of the protein, while I-TASSER predictions did.

=== Post-translational modifications ===
C5orf15 undergoes multiple post-translational modifications, including phosphorylation and glycosylation. Features predicted with Expasy tools include 16 O-linked glycosylation sites, 2 tyrosine sulfation sites, a signal peptide cleavage site between residues 36 and 37, and 15 functional motifs.

| Type | Possible C5orf15 Significance |
| O-linked glycosylation | Suggests highly glycosylated regions may be extracellular and could be important for protein interactions. |
| N-linked glycosylation | Possibly assists in the folding of the N-terminus region and allows for secretion. |
| Signal peptide cleavage site | The protein is likely to be secreted and end up in the extracellular region or outside another organelle. |
| O-GalNAc (mucin type) glycosylation | A dense cluster at residues 53–109 possibly indicates importance in cell signaling or a structural role. |
| Tyrosine sulfation | The two sulfated residues may function in signaling or adhesion. |

=== Protein interactions ===
C5orf15 has interactions with multiple human and viral proteins, listed below.

==== Human proteins ====
- EFCAB8 (EF-hand calcium-binding domain-containing protein 8)
- ZCCHC9 (Zinc finger CCHC domain-containing protein 9)
- FXYD5 (FXYD domain-containing ion transport regulator 5)
- TXNDC12 (Thioredoxin domain-containing protein 12)
- TMEM123 (Porimin)
- ZNF17 (Zinc finger protein 17)
- PHYKPL (5-phosphohydroxy-L-lysine phospho-lyase)
- TMEM159 (Promethin)
- KCNE3 (Potassium Voltage-Gated Channel Subfamily E Regulatory Subunit 3)
- KRAS (V-Ki-Ras2 Kirsten Rat Sarcoma Viral Oncogene Homolog)

==== Viral proteins ====
- E5 Human Papillomavirus type 18
- Ap3a_sars2 SARS-CoV-2
- LMP2 Epstein-Barr virus (strain B95-8)
- NS3 Human coronavirus NL63

== Expression and localization ==

=== Tissue expression ===
The C5orf15 gene is ubiquitously expressed at high levels in human tissues, with a four-fold difference between the lowest and highest RPKM expression values. Its expression has been classified under Cluster 1 (Liver and Kidney, Metabolism). The highest expression was observed in the kidney, urinary bladder, adrenal gland, and placenta.

=== Localization ===
The C5orf15 protein is predicted to localize to the Golgi apparatus, and fluorescent antibody staining of C5orf15 showed clear localization there. Antibodies against C5orf15 could therefore serve as a possible stain for the Golgi apparatus. The Golgi apparatus receives proteins from the endoplasmic reticulum and modifies them for further secretion, acting as a central processing region within the cell.

== Homology ==

=== Paralogs ===
The C5orf15 protein has one identified paralog, trans-Golgi network integral membrane protein 2 (TGOLN2). Six TGOLN2 isoforms appeared when the C5orf15 protein sequence was searched against Homo sapiens sequences using BLAST. TGOLN2 is involved in regulating membrane traffic to and from the trans-Golgi network.

| Paralog | Accession number | Chromosomal location | Protein length | Identity to C5orf15 | E-value |
| TGOLN2 Isoform 1 | NP_006455.2 | 2p11.2 | 437 aa | 43% | 1.0 × 10^{-9} |

=== Orthologs ===
The human C5orf15 protein has orthologs in vertebrates as distant as jawless fish, but no orthologs were found in invertebrates such as sea urchins, lancelets, and tunicates. Orthologs were present in Mammalia, Aves, Reptilia, and Amphibia, as well as in bony, cartilaginous, and jawless fish.

=== Evolutionary divergence ===
The C5orf15 protein evolution scatter plot displays the relationship between the approximate divergence time (in millions of years ago) and the corrected percent divergence (m) of each species' orthologous protein compared with the human protein. Reference lines for fibrinogen alpha, hemoglobin subunit alpha, and cytochrome c show the relative evolutionary speed of C5orf15. The C5orf15 protein evolves faster than all three reference sequences.
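The article does not state how the corrected percent divergence (m) was calculated. One commonly used correction for multiple substitutions at the same site is m = −ln(1 − n) × 100, where n is the observed fraction of differing residues. The sketch below assumes that formula; the function name and the example identity value are hypothetical and are not taken from the ortholog data.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Corrected percent divergence m from pairwise percent identity.

    ASSUMPTION: m = -ln(1 - n) * 100, where n is the observed fraction
    of differing residues. The article does not confirm this formula.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent_identity must be in (0, 100]")
    n = 1.0 - percent_identity / 100.0  # observed fraction of differences
    return -math.log(1.0 - n) * 100.0


if __name__ == "__main__":
    # Hypothetical ortholog sharing 50% identity with the human protein.
    print(round(corrected_divergence(50.0), 2))
```

Under this correction, m grows faster than the raw percent difference as sequences diverge, which is why it is plotted against divergence time instead of uncorrected identity.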

=== Multiple sequence alignment ===
Two multiple sequence alignments show high conservation of the C-terminal region of the C5orf15 protein.

== Clinical significance ==
C5orf15 shows a two-fold increase in activity when the β-catenin gene is disrupted in BxPC-3 cells. This is thought to result from reduced cell-to-cell signaling in BxPC-3 cells when the WNT signaling pathway is fully deficient in β-catenin.

C5orf15 was associated with one of three structural aberrations found in 8% of the genomic profiles of acral, mucosal, and vulvovaginal melanomas. Specifically, a fusion between C5orf15 and a Ras gene was observed in acral melanomas.

Analysis with GEPIA, a genomic analysis tool, identified elevated expression of C5orf15 in tumor tissues compared with normal tissues. Higher expression of C5orf15 was associated with worse survival outcomes in patients with head and neck squamous cell carcinoma. The C5orf15 gene has not been found to be significantly related to risk-prone immune cell types.
