= Glutamate-rich protein 4 =

Glutamate-rich protein 4 is encoded by the gene ERICH4, also known as chromosome 19 open reading frame 69 (C19orf69). ERICH4 is highly conserved in mammals and is most highly expressed in the kidneys, terminal ileum, and duodenum. The function of ERICH4 is not yet well understood, but it is suggested to contribute to immune and inflammatory responses.

==Gene==
ERICH4 is located on the sense strand of 19q13.2 in humans, spans 2,340 base pairs, and contains 2 exons. It lies within DMAC2 and next to PCAT19 and B3GNT8, all of which are on the antisense strand.

===Promoter & Predicted Transcription Factors (TF)===
The promoter is predicted to begin 1,806 bp upstream of the 5' UTR and to span 1,819 bp, overlapping the coding sequence by 13 bp.

| Matrix ID | TF Name | Genomic Position within Human ERICH4 Promoter | Strand of Chromosome 19 | Matrix Similarity | Literature-Supported Function |
| V$CEBPA.01 | CCAAT/Enhancer-binding Protein Alpha | 41,441,752-41,441,766 | Sense (+) | 0.962 | Recruit co-activators that in turn can open up chromatin structure or recruit basal transcription factors. |
| O$VTATA.01 | Vertebrate TATA-binding Factor | 41,442,466-41,442,482 | Antisense (-) | 0.915 | Required for initiation of transcription and is associated with a variety of different transcription factors. |
| V$SOX1.04 | SRY (Sex Determining Region)-Box 1 | 41,442,480-41,442,502 | Sense (+) | 0.801 | Involved in the regulation of embryonic development and in the determination of the cell fate. |
| V$HSF1.04 | Heat Shock Factor 1 | 41,442,536-41,442,560 | Sense (+) | 0.769 | Activation in cellular stress. |
| V$BTEB3.01 | Krueppel-like Factor 13 (KLF13) | 41,442,243-41,442,261 | Sense (+) | 0.934 | KLF13 knock-out mice show a defect in lymphocyte survival as KLF13 is a regulator of Bcl-xL expression. |
| V$PAX5.01 | B-cell Specific Activator Protein | 41,442,955-41,442,983 | Sense (+) | 0.796 | Key role in B-lymphocyte development. |
| V$GLI3.02 | GLI-Kruppel Family Member GLI3 | 41,443,115-41,443,131 | Sense (+) | 0.915 | Thought to play a role during embryogenesis. |
| V$NR2F6.01 | Nuclear Receptor subfamily 2 group F member 6 (NR2F6) | 41,442,507-41,442,531 | Sense (+) | 0.851 | Transcriptional repressor of IL17 expression in Th17-differentiated CD4-positive T cells in vitro and in vivo. |
| V$MAFB.01 | MAFB/Leucine Zipper Transcription Factor | 41,442,601-41,442,625 | Sense (+) | 0.923 | Regulation of lineage-specific hematopoiesis. Represses ETS1-mediated transcription of erythroid-specific genes in myeloid cells. |
| V$AP4.03 | Activating Enhancer Binding Protein 4 (TFAP4) | 41,443,003-41,443,019 | Sense (+) | 0.993 | Regulates the expression of genes involved in the regulation of cellular proliferation, stemness, and epithelial-mesenchymal transition. |
| V$EVI1.05 | Ecotropic Viral Integration Site 1 (EVI1) Encoded Factor | 41,442,583-41,442,599 | Sense (+) | 0.821 | Regulation of hematopoietic stem cell renewal. Controls several aspects of embryonic development. |
| V$MRE.01 | Mineralocorticoid Receptor Response Element | 41,442,844-41,442,862 | Sense (+) | 0.939 | Involved in water electrolyte homeostasis, blood pressure regulation, inflammation, and fibrosis in the renocardiovascular system. |
| V$ARE.03 | Androgen Receptor Binding Site, IR3 Sites | 41,442,844-41,442,862 | Antisense (-) | 0.946 | Ligand-dependent transcription factor that controls the expression of specific genes. The binding of the AR to its native ligands 5α-dihydrotestosterone (DHT) and testosterone initiates male sexual development and differentiation. |
| V$ZNF217.01 | Zinc Finger Protein 217 | 41,443,023-41,443,035 | Sense (+) | 0.911 | Promotes cell proliferation and antagonizes cell death. |
| V$RORA.02 | RAR-related Orphan Receptor Alpha, Homodimer DR5 Binding Site | 41,442,783-41,442,807 | Sense (+) | 0.831 | Possible role in lymphocyte development. May negatively regulate inflammation, based on a reported positive relationship with expression of IκBα, a negative regulator of the NF-κB signaling pathway. |
| V$STAT6.01 | Signal Transducer and Activator of Transcription 6 (STAT6) | 41,443,042-41,443,060 | Sense (+) | 0.961 | Plays a central role in exerting IL4-mediated responses. |
| V$ZF5.01 | Zinc Finger/POZ Transcription Factor | 41,442,874-41,442,888 | Sense (+) | 0.957 | Role in development, oncogenesis, apoptosis, and transcription repression. |
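Several predicted sites in the table share genomic coordinates. The following minimal Python sketch shows one way to compute site lengths and list overlapping pairs for a subset of rows. It assumes the listed positions are inclusive (closed) intervals, which the table does not state, and the short TF labels are illustrative abbreviations of the matrix names.

```python
from itertools import combinations

# (label, start, end, strand) copied from a subset of the table rows.
# Assumption: coordinates are inclusive on both ends.
SITES = [
    ("CEBPA", 41_441_752, 41_441_766, "+"),
    ("VTATA", 41_442_466, 41_442_482, "-"),
    ("SOX1", 41_442_480, 41_442_502, "+"),
    ("NR2F6", 41_442_507, 41_442_531, "+"),
    ("HSF1", 41_442_536, 41_442_560, "+"),
    ("MRE", 41_442_844, 41_442_862, "+"),
    ("ARE", 41_442_844, 41_442_862, "-"),
]


def site_length(start: int, end: int) -> int:
    """Length in bp of the closed interval [start, end]."""
    return end - start + 1


def overlapping_pairs(sites):
    """Return label pairs whose closed intervals share at least one base."""
    pairs = []
    for (n1, s1, e1, _), (n2, s2, e2, _) in combinations(sites, 2):
        if s1 <= e2 and s2 <= e1:
            pairs.append((n1, n2))
    return pairs


if __name__ == "__main__":
    for label, start, end, strand in sorted(SITES, key=lambda s: s[1]):
        print(f"{label:6s} {strand} {site_length(start, end)} bp")
    print(overlapping_pairs(SITES))
```

Overlap here ignores strand, so two sites on opposite strands that cover the same coordinates are still reported as a pair.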

==mRNA==

The ERICH4 mRNA sequence is 955 nucleotides in length, with a predicted folding energy of -139.80 kcal/mol, reported as -0.258 energy/base.

===Alternative Splicing===
In addition to the primary transcript, ERICH4 has one other protein-encoding transcript variant, or isoform.

| Name | mRNA Length (bp) | Protein Length (aa) | Mass (Da) |
| Glutamate-rich protein 4 | 955 | 130 | 14,447 |
| Glutamate-rich protein 4 isoform X1 | 1,741 | 155 | N/A |

==Protein==
===General Properties===
The primary encoded protein consists of 130 amino acids and has a predicted molecular mass of 14.5 kDa and an isoelectric point (pI) of 4. As the name glutamate-rich protein 4 suggests, glutamic acid is the most abundant residue at 17.7% of the protein's composition, followed by leucine at 14.6% and proline at 9.2%. ERICH4 has no positive or negative charge clusters. The human protein has one identifiable mixed charge cluster from amino acid 91 to 116, with 3 positively charged, 15 negatively charged, and 8 neutral amino acids. The corresponding region is frequently a negative charge cluster in ERICH4's orthologous proteins. The protein contains no significant hydrophobic or transmembrane segments, a finding supported by comparison with five of ERICH4's orthologs (gray mouse lemur, sheep, house mouse, African elephant, and opossum).
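The composition percentages and cluster counts quoted above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it uses the figures from the text, and the rounding to whole residues and the "net charge" tally (positives minus negatives) are assumptions of this example, not outputs of the original analysis.

```python
PROTEIN_LENGTH = 130  # residues, from the text

# Percent composition quoted in the text.
COMPOSITION_PCT = {"Glu": 17.7, "Leu": 14.6, "Pro": 9.2}

# Mixed charge cluster, residues 91-116 inclusive, as quoted in the text.
CLUSTER = {"start": 91, "end": 116, "positive": 3, "negative": 15, "neutral": 8}


def residue_count(percent: float, length: int = PROTEIN_LENGTH) -> int:
    """Convert a percent composition to the nearest whole residue count."""
    return round(percent / 100 * length)


def cluster_span(cluster: dict) -> int:
    """Number of residues in the inclusive range start..end."""
    return cluster["end"] - cluster["start"] + 1


def cluster_net_charge(cluster: dict) -> int:
    """Simple tally: positive residues minus negative residues."""
    return cluster["positive"] - cluster["negative"]


if __name__ == "__main__":
    for residue, pct in COMPOSITION_PCT.items():
        print(residue, residue_count(pct))
    print("cluster span:", cluster_span(CLUSTER))
    print("cluster net charge:", cluster_net_charge(CLUSTER))
```

The three residue classes in the cluster sum to the span of residues 91-116, so the quoted counts are internally consistent.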

===Domains===
ERICH4 has one identified domain of unknown function, DUF4530, which is found in eukaryotes. Proteins in this family are typically 140 amino acids in length and ERICH4 is a known human member of this family.

===Secondary Structure===
A cross-program analysis predicts that the ERICH4 protein is composed of five separate alpha helices and five interspersed coils. The alpha helix segments span amino acids 2-9, 21-24, 47-58, 61-94, and 104-111 of the protein sequence. ERICH4 is not predicted to contain beta-sheets.
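To put the helix ranges in context, the short sketch below totals the helical residues and expresses them as a fraction of the 130-residue protein. It assumes the ranges are inclusive, which the text implies but does not state.

```python
PROTEIN_LENGTH = 130  # residues, from the General Properties section

# Predicted alpha helix segments (start, end), assumed inclusive.
HELICES = [(2, 9), (21, 24), (47, 58), (61, 94), (104, 111)]


def helix_lengths(helices):
    """Length of each helix segment in residues."""
    return [end - start + 1 for start, end in helices]


def helical_fraction(helices, length: int = PROTEIN_LENGTH) -> float:
    """Fraction of the protein predicted to be in alpha helices."""
    return sum(helix_lengths(helices)) / length


if __name__ == "__main__":
    print(helix_lengths(HELICES))
    print(f"{helical_fraction(HELICES):.1%} of residues predicted helical")
```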

===Tertiary Structure===
SWISS-MODEL proposes a tertiary structure for ERICH4 by modeling the protein on an NLRP6 template, with a sequence identity of 25.79%, sequence similarity of 0.30, and coverage of 0.43, spanning amino acids 43-92 of ERICH4.

===Post-translational Regulation===
ERICH4 is predicted to be phosphorylated at serines 28 and 96 by casein kinase II and at threonine 36 by protein kinase C. ERICH4 is not predicted to undergo methionine cleavage or acetylation.

===Localization===
This protein is predicted to be intracellular, with no transmembrane regions. It is predicted to localize mostly to the cytoplasm, with a reliability score of 70.6 by Reinhardt's method. No significant O-GlcNAc or N-myristoylation sites are predicted.

==Tissue Expression==
In humans, ERICH4 is most highly expressed in the duodenum and small intestine, followed by the kidneys. Notably, expression in the small intestine is highest in the twentieth week of fetal development. Within a representative set of mouse (Mus musculus) tissues, Erich4 is most highly expressed in the kidneys, followed in decreasing order by the large intestine, adult duodenum, and adult small intestine. Staining with the rabbit-derived Sigma-Aldrich antibody HPA042632 shows strong granular cytoplasmic positivity in glandular cells (goblet cells) of the rectum.

===Abnormal Tissue Expression===
ERICH4 has high expression in normal tissue and low-to-medium expression in renal cell carcinoma tissue.

An analysis examined ERICH4 expression in ileum and colon tissue that was either normal or affected by Crohn's disease or ulcerative colitis. ERICH4 had high (~90%) expression in the ileum in all states (normal/control, Crohn's disease, and ulcerative colitis). ERICH4 expression was also higher in Crohn's disease than in either normal tissue or ulcerative colitis.

==Function==
The function of ERICH4 is not yet well understood and requires further research.

===Interactions===
According to STRING analysis, ERICH4 has multiple predicted interactions with other proteins, identified by text mining, including proteins associated with immune function and with expression in the gastrointestinal tract or testes. No protein interactions have yet been confirmed experimentally.

| Predicted Partner Protein | Score | Associated Functions |
| Tetratricopeptide Repeat Domain 29 (TTC29) | 0.680 | Shown to be significantly upregulated during wound healing of human masticatory mucosa. |
| Transmembrane Protein 184A (TMEM184A) | 0.552 | Functions as a heparin receptor and mediates anti-inflammatory responses of endothelial cells (ECs) involving decreased JNK and p38 activity. |
| Insulin-like Growth Factor binding protein Acid Labile Subunit (IGFALS) | 0.509 | Serum protein that binds insulin-like growth factors, increasing their half-life and their vascular localization. |
| Serine Peptidase Inhibitor, Kazal type 4 (SPINK4) | 0.500 | Shows differential gene expression related to celiac disease pathology, likely derived from altered goblet cell activity. |
| Protein Disulfide-Isomerase-Like protein of the Testis (PDILT) | 0.497 | Catalyzes protein folding and thiol-disulfide interchange reactions. This protein lacks oxidoreductase activity in vitro and is suspected to function as a chaperone. |
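The scores in the table can be read against STRING's conventional confidence cutoffs (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The sketch below bins each predicted partner accordingly. The cutoffs are an assumption of this example, since the text does not say which threshold was applied, and the band labels are informal.

```python
# Combined scores copied from the table of predicted partners.
PREDICTED_PARTNERS = {
    "TTC29": 0.680,
    "TMEM184A": 0.552,
    "IGFALS": 0.509,
    "SPINK4": 0.500,
    "PDILT": 0.497,
}


def confidence_band(score: float) -> str:
    """Map a STRING combined score (0-1) to a conventional confidence band."""
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below low-confidence cutoff"


if __name__ == "__main__":
    for partner, score in sorted(PREDICTED_PARTNERS.items(), key=lambda kv: -kv[1]):
        print(f"{partner:9s} {score:.3f} {confidence_band(score)}")
```

Under these cutoffs every listed partner falls in the medium band, which fits the statement that none of the interactions has been experimentally confirmed.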

==Homology==
===Paralogs===
No human paralogs were found for the gene.

===Orthologs===
Orthologs have been identified in most mammals for which complete genome data are available. Notably, ERICH4 orthologs are present only in placental and marsupial mammals and are absent in monotremes. The most distant ortholog was identified in the gray short-tailed opossum, a marsupial.

No significant similarities were found in other vertebrate groups (Aves, Reptilia, Amphibia, Chondrichthyes, Osteichthyes, or Agnatha). BLAST and BLAT searches excluding vertebrates returned no significant orthologs in invertebrates, fungi, or bacteria.

| Species | Common name | NCBI Accession Number | Sequence length (AA) | Millions of Years since LCA | % Identity | % Similarity | Taxonomic group |
| Homo sapiens | Human | NP_001123986.1 | 130 | --- | 100 | 100 | Primates |
| Microcebus murinus | Gray mouse lemur | XP_012616209.1 | 179 | 73 | 72 | 80 | Primates |
| Tupaia chinensis | Northern treeshrew | XP_006165343.1 | 137 | 85 | 63 | 72 | Scandentia |
| Mus pahari | Gairdner's shrewmouse | XP_021075502.1 | 141 | 88 | 57 | 63 | Rodentia |
| Meriones unguiculatus | Mongolian gerbil | XP_021519873.1 | 141 | 88 | 60 | 64 | Rodentia |
| Rattus norvegicus | Brown rat | NP_001102923.1 | 147 | 88 | 61 | 65 | Rodentia |
| Mus musculus | House mouse | NP_001034332.2 | 140 | 88 | 62 | 71 | Rodentia |
| Microtus ochrogaster | Prairie vole | XP_005361243.1 | 140 | 88 | 62 | 71 | Rodentia |
| Erinaceus europaeus | European hedgehog | XP_007536664.1 | 129 | 94 | 61 | 68 | Eulipotyphla |
| Orcinus orca | Killer whale | XP_004271419.2 | 121 | 94 | 62 | 72 | Cetacea |
| Physeter catodon | Sperm whale | XP_007128192.1 | 121 | 94 | 64 | 72 | Cetacea |
| Desmodus rotundus | Common vampire bat | XP_024433457.1 | 180 | 94 | 66 | 73 | Chiroptera |
| Ovis aries | Sheep | XP_012045823.2 | 131 | 94 | 69 | 75 | Artiodactyla |
| Bos taurus | Cattle | XP_002695042.1 | 131 | 94 | 69 | 75 | Artiodactyla |
| Pteropus alecto | Black flying fox | XP_006910763.1 | 133 | 94 | 71 | 78 | Chiroptera |
| Hipposideros armiger | Great roundleaf bat | XP_019488166.1 | 134 | 94 | 71 | 79 | Chiroptera |
| Loxodonta africana | African bush elephant | XP_003420798.1 | 127 | 102 | 58 | 66 | Proboscidea |
| Monodelphis domestica | Gray short-tailed opossum | XP_007492011.1 | 106 | 160 | 45 | 61 | Didelphimorphia |
| Phascolarctos cinereus | Koala | XP_020834126.1 | 109 | 160 | 48 | 63 | Diprotodontia |
| Vombatus ursinus | Common wombat | XP_027701859.1 | 109 | 160 | 49 | 64 | Diprotodontia |

===Molecular Evolution===
The m value, the corrected number of amino acid changes per 100 residues, was calculated for ERICH4 and plotted against species divergence time in millions of years. M values were derived from the percent identity of each ortholog's protein sequence relative to the human sequence, using the formula derived from the molecular clock hypothesis. Compared with hemoglobin, fibrinogen alpha chain, and cytochrome c, ERICH4 most closely tracks fibrinogen alpha chain, suggesting a relatively rapid pace of evolution.
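The exact formula is not given in the text. A commonly used correction for multiple substitutions at the same site is m = -100 × ln(1 - n/100), where n is the observed number of differences per 100 residues (here taken as 100 minus percent identity). The sketch below applies that assumed formula to a few rows of the ortholog table. The function name and the choice of rows are illustrative, and the resulting values are not the ones plotted in the original analysis.

```python
import math

# (common name, millions of years since LCA, % identity to human)
# copied from a subset of the ortholog table.
ORTHOLOGS = [
    ("Gray mouse lemur", 73, 72),
    ("House mouse", 88, 62),
    ("Sheep", 94, 69),
    ("African bush elephant", 102, 58),
    ("Gray short-tailed opossum", 160, 45),
]


def corrected_changes(percent_identity: float) -> float:
    """Corrected amino acid changes per 100 residues (assumed formula).

    Equivalent to -100 * ln(1 - n/100) with n = 100 - percent_identity.
    """
    if not 0 < percent_identity <= 100:
        raise ValueError("percent identity must be in (0, 100]")
    return 100.0 * math.log(100.0 / percent_identity)


if __name__ == "__main__":
    for name, mya, identity in ORTHOLOGS:
        m = corrected_changes(identity)
        print(f"{name:26s} {mya:4d} MYA  m = {m:6.2f}")
```

Because the correction is logarithmic, m grows faster than the raw difference count as identity falls, which is why distant orthologs such as the opossum sit well above a straight line drawn through the raw differences.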
