User:Amyqdinh/sandbox

From Wikipedia, the free encyclopedia

FAM89A

FAM89A (Family with Sequence Similarity 89 Member A) is a protein (accession number NP_940954.1) encoded by the human FAM89A gene.[1] Its alias, C1orf153, reflects the gene's position on the human genome: chromosome 1, open reading frame 153.[1] FAM89A expression is highest in the placenta and adipose tissue.[1][2] The function of FAM89A has yet to be determined, but its expression has been linked to atherosclerosis and glioma, it has been studied as a marker for diagnosing bacterial infections, and it is notable for its response to interleukin exposure.[3][4][5]

Gene

The most common alias of FAM89A, C1orf153, reflects the gene's position on the human genome: chromosome 1, open reading frame 153.[1] The gene also has the less commonly used aliases MCG15887 and RP11-423F24.2 LOC37061.2.[6] FAM89A lies on the minus strand of chromosome 1 at map position 1q42.2, starting at 231,018,958 bp and ending at 231,040,254 bp, which makes the gene 21,297 base pairs long.[6] FAM89A has two exons separated by one large intron,[7][8] and the primary transcript is 1,503 base pairs long.[6] FAM89A does not have any transcript variants.
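The gene length quoted above follows from counting both endpoint coordinates. A minimal Python sketch of that arithmetic is given here; the function name `span_length` is illustrative and does not come from any of the cited tools.

```python
def span_length(start_bp: int, end_bp: int) -> int:
    """Length of a genomic interval when both endpoints are counted (inclusive)."""
    if end_bp < start_bp:
        raise ValueError("end coordinate must not precede start coordinate")
    return end_bp - start_bp + 1


# Coordinates quoted in the text for FAM89A on chromosome 1.
fam89a_length = span_length(231_018_958, 231_040_254)
print(fam89a_length)  # 21297
```

The `+ 1` is what distinguishes an inclusive span from a simple difference of the two coordinates.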

Neighboring Genes

TRIM67 (Tripartite Motif Containing 67) is located downstream of FAM89A on chromosome 1, while ARV1 (Acyl-CoA acyltransferase-related enzyme 2 required for viability) is located upstream of FAM89A.[9][10] Both are on the plus strand of chromosome 1.

Protein

General Properties

FAM89A protein (NP_940954.1) is 184 amino acids long.[1] Its predicted molecular mass is 18.6 kDa and its predicted isoelectric point (pI) is 5.64.[11] Two short sequences, GARAA and ASGG, each occur twice within the protein.[12] The composition of FAM89A is notable for its abundance of four amino acids: leucine (14.1%), glycine (12.0%), alanine (11.4%), and serine (11.4%).[12] Composition analysis also gives a value of -3 for basic minus acidic residues (KR-ED), meaning the protein has three more acidic residues than basic ones, which is consistent with the slightly acidic isoelectric point.[12]
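The composition percentages and the KR-ED value are simple counts over the sequence. The sketch below shows the calculation in Python on a short, hypothetical peptide; it is not the FAM89A sequence and does not reproduce the SAPS program itself, only the two statistics discussed above.

```python
from collections import Counter


def composition_percent(seq: str) -> dict[str, float]:
    """Percentage of each amino acid (one-letter code) in the sequence."""
    counts = Counter(seq)
    return {aa: round(100 * n / len(seq), 1) for aa, n in counts.items()}


def kr_minus_ed(seq: str) -> int:
    """Basic residues (K, R) minus acidic residues (E, D)."""
    basic = seq.count("K") + seq.count("R")
    acidic = seq.count("E") + seq.count("D")
    return basic - acidic


# Hypothetical 20-residue peptide used only for illustration.
toy_peptide = "MSGLAALEDGKLSGAREDLL"

print(composition_percent(toy_peptide)["L"])  # 25.0
print(kr_minus_ed(toy_peptide))               # -2
```

A negative KR-ED value, as in the toy peptide, indicates more acidic than basic residues.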

Proposed tertiary structure of FAM89A, with a confidence score (C-score) of 2.23. The C-score is calculated from the significance of threading template alignments. From the I-TASSER 3D structure prediction software.[13]

Conserved Domains & Motifs

Five leucine residues recurring at every seventh position between amino acids 81 and 115 are characteristic of a leucine zipper structural motif.[14][15] The MotifFinder tool identified residues 84-122 as a leucine-rich adapter protein (LURAP) domain, PF14854.[15] Proteins of the LURAP superfamily activate the canonical NF-kappa-B pathway, promote proinflammatory cytokine production, and promote the antigen-presenting and priming functions of dendritic cells.[15]
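The heptad spacing that characterizes a leucine zipper can be checked with a simple scan. The following Python sketch looks for runs of leucines spaced seven positions apart; the function name, the `min_repeats` threshold, and the toy sequence are all assumptions made for illustration, and this is not the method used by PSORT II or MotifFinder.

```python
def heptad_leucine_runs(seq: str, min_repeats: int = 4) -> list[tuple[int, int]]:
    """Return (start, repeats) for each run of leucines spaced 7 residues apart.

    `start` is the 1-based position of the first leucine in the run.
    """
    runs = []
    for i, aa in enumerate(seq):
        if aa != "L":
            continue
        # Skip leucines that continue a run already counted from an earlier start.
        if i >= 7 and seq[i - 7] == "L":
            continue
        repeats = 1
        j = i + 7
        while j < len(seq) and seq[j] == "L":
            repeats += 1
            j += 7
        if repeats >= min_repeats:
            runs.append((i + 1, repeats))
    return runs


# Hypothetical sequence: five leucines at positions 4, 11, 18, 25, and 32.
toy_sequence = "AAA" + "LAAAAAA" * 5 + "GG"

print(heptad_leucine_runs(toy_sequence))  # [(4, 5)]
```

A run of five leucines, as described for FAM89A, would be reported as a single tuple with a repeat count of 5.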

Structure

Secondary Structure

FAM89A protein is predicted to consist of approximately 40% alpha helices, 11% extended strands, and 49% random coils.[16][17][18][19] The protein has no predicted transmembrane helices or N-terminal signal peptide.[14] The LURAP domain is predicted to form an alpha helix.

Tertiary Structure

The tertiary structure of FAM89A protein is not yet well understood because it has not been determined by X-ray crystallography. I-TASSER software predicts that the protein forms a dimer of alpha-helical monomers, which is consistent with the leucine zipper motif.[20][21][22]

Interacting Proteins

The STRING interaction network reports an experimentally observed interaction between FAM89A and UBXN2B (UBX domain-containing protein 2B).[23] UBXN2B is an adapter protein that is required for Golgi and endoplasmic reticulum biogenesis.[24] It is involved in Golgi and endoplasmic reticulum maintenance during interphase of the cell cycle and in reassembly at the end of mitosis.[23][24]

Microarray hybridization data suggest that FAM89A expression responds to interleukins. When airway epithelial cells were exposed to interleukin 13 (IL-13) in vitro, FAM89A expression decreased.[25] FAM89A expression also decreased when CD8+ T lymphocytes were exposed to interleukin 10.[1]

FAM89A gene expression was found to decrease in a melanoma (skin cancer that arises in melanocytes) cell line in response to BRAF inhibition with vemurafenib.[26]

FAM89A promoter sequence annotated with predicted transcription factor binding sites.

Gene Level Regulation

Promoter & Transcription Factor Binding Sites

The Genomatix ElDorado genome annotation tool identifies the FAM89A promoter as 1,104 base pairs long.[27] Binding sites for various transcription factors were found within it, including TFIIB (RNA polymerase II transcription factor IIB), MZF1 (myeloid zinc finger 1 factors), and SP1 (GC-box factors SP1/GC).[8][27]

Expression Pattern

Tissue Expression

The human tissues with the highest levels of FAM89A expression are the placenta and adipose tissue.[1][2] Moderate levels of expression are found in the adrenal gland, lungs, and breasts.[1] Microarray hybridization patterns further support high FAM89A expression in the placenta and additionally show moderate expression in the lungs, skin, spleen, spinal cord, pancreas, and retina.[28][29]

Protein Level Regulation

Immunofluorescent staining of human cell line RH-30 shows FAM89A localization to the nucleoplasm, Golgi apparatus, and vesicles of the cell. Image from HPA (Human Protein Atlas).[30]

Protein Localization

Immunofluorescent staining of the human cell line RH-30 from the Human Protein Atlas (HPA) shows localization of FAM89A to the nucleoplasm, Golgi apparatus, and vesicles of the cell.[30] Reinhardt's method for cytoplasmic/nuclear discrimination in the PSORT II search results predicts nuclear localization with a reliability score of 89.[14] The predicted localization of FAM89A is most likely the nucleus (52.2%), followed by the mitochondria (34.8%), the cytoskeleton (8.7%), and the cytoplasm (4.3%).[14] The PredictProtein tool also predicts subcellular localization in the nucleus.[31]

Post-Translational Modifications

Phosphorylation

According to NetPhos, FAM89A may be phosphorylated at 13 serine residues. These phosphorylations are predicted to occur at positions 30, 32, 37, 58, 65, 106, 117, 129, 148, 150, 168, 173, and 175 by the kinases PKA, CDC2, CKI, CKII, PKC, P38MAPK, SRC, EGFR, and DNAPK.[32] The GPS program, which pairs possible phosphorylation sites with their cognate protein kinases, produced 1,175 hits for FAM89A and therefore appears to over-predict phosphorylation sites.[33] SIB MyHits Motif Scan predicts casein kinase II phosphorylation at six sites, but it gives only vague residue ranges and marks them with a question mark (?), signifying questionable or weak matches.[34]

GalNAc O-glycosylation & O-linked β-N-acetylglucosamine

NetOGlyc analysis searched for mammalian mucin-type GalNAc O-glycosylation sites and predicted five positive results at amino acids 2, 39, 154, 168, and 173.[35] Mucins are a group of heavily O-glycosylated proteins that line the gastrointestinal and respiratory tracts; they protect these tracts from infection by lubricating them and preventing bacteria from binding.[36] O-GalNAc modifications may also compete with phosphorylation for control of a protein's activation site.

The YinOYang tool predicted five possible O-beta-GlcNAc attachment sites in FAM89A protein, at serine residues 2 and 172 (+++ confidence; +0.45 potential) and at 129, 168, and 173 (++ confidence; +0.6 potential).[37]
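Because glycosylation may compete with phosphorylation at the same residue, it can be useful to see which predicted sites coincide. The Python sketch below intersects the position lists quoted in this article for NetPhos, NetOGlyc, and YinOYang; the variable names are illustrative, and the comparison is a plain set intersection rather than anything those tools compute themselves.

```python
# Predicted positions as quoted in the text (residue numbers in FAM89A).
netphos_sites = {30, 32, 37, 58, 65, 106, 117, 129, 148, 150, 168, 173, 175}
netoglyc_sites = {2, 39, 154, 168, 173}
yinoyang_sites = {2, 129, 168, 172, 173}

# Residues predicted both as phosphorylation sites and as glycosylation sites.
phospho_and_galnac = sorted(netphos_sites & netoglyc_sites)
phospho_and_glcnac = sorted(netphos_sites & yinoyang_sites)

print(phospho_and_galnac)  # [168, 173]
print(phospho_and_glcnac)  # [129, 168, 173]
```

On these predictions, residues 168 and 173 appear in all three lists, which makes them candidates for competition between the two kinds of modification.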

Glycation

FAM89A protein was analyzed for glycation of the epsilon amino groups of lysine, and three sites were predicted for the attachment of monosaccharides: lysine 57, lysine 82, and lysine 95.[38] These residues are conserved in distant orthologs. Glycation of lysine residues has been linked to atherosclerosis because it produces advanced glycation end products (AGEs), which are engulfed by macrophages and taken into the arterial wall.[39]

SUMOylation

The SUMOplot analysis program predicts a SUMO (Small Ubiquitin-like Modifier) attachment site at position 83. The residue is conserved in distant orthologs.

Homology/Evolution

Phylogenetic tree outline created with GeneBee to represent the relatedness between orthologous species. Species names, photos, relatedness, and families are featured.

Paralogs

FAM89A is known to have two paralogs: FAM89B and TRANK1.[1] FAM89B is located on human chromosome 11 at map position 11q13.1 and has the common aliases Leucine Repeat Adaptor Protein 25 (LRAP25) and Mammary Tumor Virus Receptor Homolog 1 (MTVR1).[9][40] TRANK1 (Tetratricopeptide Repeat and Ankyrin Repeat Containing 1) also goes by the alias LBA1 and is located on human chromosome 3 at map position 3p22.2.[41] FAM89B is more closely related to FAM89A, with 92.31% similarity, while TRANK1 is only distantly related, with 3.00% similarity. The paralogs of FAM89A likely diverged around 740 million years ago.[42]

Orthologs

FAM89A orthologs can be found in mammals, amphibians, reptiles, birds, fish, and various insects.[1] FAM89A is conserved as far back as cartilaginous fish, which diverged from the human lineage 465 million years ago.

FAM89A Chart of Orthologs[1]
DoD = median date of divergence from the human lineage, in millions of years ago (MYA). % Identity is with Homo sapiens.

Genus & Species             Common Name                  DoD (MYA)  Accession #      % Identity
Homo sapiens                Human                        0          NP_940954.1      100
Cricetulus griseus          Chinese Hamster              89         XP_016832266.1   69.9
Microtus ochrogaster        Prairie Vole                 89         XP_005345937.1   82.1
Manis javanica              Sunda Pangolin               94         XP_017514775.1   53.1
Lagenorhynchus obliquidens  Pacific White-Sided Dolphin  94         XP_026949833.1   71.8
Rousettus aegyptiacus       Egyptian Fruit Bat           94         XP_016018927.1   75.5
Sus scrofa                  Wild Boar                    94         XP_001924566.2   83.7
Ovis aries                  Sheep                        94         XP_004021399.2   86.4
Zalophus californianus      California Sea Lion          94         XP_027452929.1   90.22
Loxodonta africana          African Bush Elephant        102        XP_023404912.1   58.4
Elephantulus edwardii       Cape Elephant Shrew          102        XP_006894661.1   75.17
Sarcophilus harrisii        Tasmanian Devil              160        XP_031822746.1   79.8
Ornithorhynchus anatinus    Platypus                     180        XP_028903236.1   60.7
Rhinatrema bivittatum       Two-Lined Caecilian          351.7      XP_029450056.1   58.9
Nanorana parkeri            High Himalaya Frog           351.7      XP_018425297.1   64.6
Danio rerio                 Zebrafish                    433        XP_017208201.2   35.1
Seriola dumerili            Greater Amberjack            433        XP_022625834.1   46.2
Chanos chanos               Milkfish                     433        XP_030638319.1   47.7
Lepisosteus oculatus        Spotted Gar                  433        XP_006625723.1   48.2
Callorhinchus milii         Elephant Shark               465        XP_007893339.1   54.4

Evolutionary Divergence

Date of Divergence vs Corrected % Divergence graph. Data for FAM89A protein (blue circles), Cytochrome C (orange triangles), and Fibrinogen Alpha Chain (green squares) were gathered from Homo sapiens (human), Cricetulus griseus (rodent), Nanorana parkeri (frog), and Danio rerio (fish) and used to create a scatter plot. A line of best fit is included for each data set.

On a graph of date of divergence versus m (corrected amino acid changes per 100 residues), the line of best fit for FAM89A falls closer to that of Fibrinogen Alpha Chain, a rapidly evolving protein, than to that of Cytochrome C, a slowly evolving protein. The slope of the FAM89A line is almost identical to that of Fibrinogen Alpha Chain. These results suggest that FAM89A is diverging at a rate similar to Fibrinogen Alpha Chain and is therefore evolving rapidly.
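A corrected divergence value can be derived from percent identity and plotted against divergence date. The Python sketch below assumes the common Poisson correction, m = -100 × ln(1 - n/100), where n is the observed percent difference; the article does not state which correction was used, so this formula is an assumption. It applies the correction to the four species named in the graph caption, using the percent identities from the ortholog chart, and fits a least-squares line. The helper names are illustrative.

```python
import math


def corrected_divergence(percent_identity: float) -> float:
    """Poisson-corrected changes per 100 residues (assumed correction)."""
    if not 0 < percent_identity <= 100:
        raise ValueError("percent identity must be in (0, 100]")
    observed_difference = 100.0 - percent_identity
    return -100.0 * math.log(1.0 - observed_difference / 100.0)


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (divergence date in MYA, % identity with human) from the ortholog chart:
# human, Chinese hamster, High Himalaya frog, zebrafish.
fam89a_points = [(0.0, 100.0), (89.0, 69.9), (351.7, 64.6), (433.0, 35.1)]

dates = [date for date, _ in fam89a_points]
m_values = [corrected_divergence(identity) for _, identity in fam89a_points]
slope, intercept = fit_line(dates, m_values)
print(f"slope = {slope:.3f} corrected changes per 100 residues per million years")
```

Repeating the same calculation for Cytochrome C and Fibrinogen Alpha Chain orthologs would give the two reference slopes against which the FAM89A slope is compared.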

Clinical Significance

Pathology & Disease Association

Research on FAM89A is limited because the function of the protein is unknown, but current findings suggest that FAM89A may be linked to atherosclerosis,[3] methylation sites associated with gliomas,[4] and the diagnosis of bacterial infections.[5] Filling this gap in knowledge could clarify how FAM89A expression contributes to disease and support further studies in various fields of science and health.

In 2014, a study was published on gene variants that modify the effect of smoking on atherosclerosis in a Hispanic population. FAM89A was identified as a gene near an SNP (single nucleotide polymorphism) that showed an interaction with smoking on carotid plaque area in a discovery sample. Of the 11 SNPs identified in the study, one, rs6700792, is located within the FAM89A gene. The authors conclude that more studies are needed to clarify the role of the protein, since there is no information regarding the function of the FAM89A gene in humans.[3]

A 2019 study screened for genes with methylation sites associated with gliomas. The researchers found that abnormal expression of FAM89A was consistent with findings from glioma gene expression profiling studies.[4]

Another study published in 2019 examined whether FAM89A and the gene IFI44L can together help differentiate viral from bacterial infections in febrile children. The researchers found that IFI44L expression was elevated in febrile children with viral infections, while FAM89A expression was elevated in febrile children with bacterial infections.[5]

References

  1. "FAM89A family with sequence similarity 89 member A [ Homo sapiens (human) ]". NCBI Gene.
  2. "Tissue expression of FAM89A - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-05-03.
  3. Della-Morte, David; Wang, Liyong; Beecham, Ashley; Blanton, Susan H.; Zhao, Hongyu; Sacco, Ralph L.; Rundek, Tatjana; Dong, Chuanhui (2014-09). "Novel genetic variants modify the effect of smoking on carotid plaque burden in Hispanics". Journal of the Neurological Sciences. 344 (1–2): 27–31. doi:10.1016/j.jns.2014.06.006. PMC 4143440. PMID 24954085.
  4. Pan, XiaoYong; Zeng, Tao; Yuan, Fei; Zhang, Yu-Hang; Chen, Lei; Zhu, LiuCun; Wan, SiBao; Huang, Tao; Cai, Yu-Dong (2019-11-14). "Screening of Methylation Signature and Gene Functions Associated With the Subtypes of Isocitrate Dehydrogenase-Mutation Gliomas". Frontiers in Bioengineering and Biotechnology. 7: 339. doi:10.3389/fbioe.2019.00339. ISSN 2296-4185. PMC 6871504. PMID 31803734.
  5. Gómez-Carballa, Alberto; Cebey-López, Miriam; Pardo-Seco, Jacobo; Barral-Arca, Ruth; Rivero-Calle, Irene; Pischedda, Sara; Currás-Tuala, María José; Gómez-Rial, José; Barros, Francisco; Martinón-Torres, Federico; Salas, Antonio (2019-08-13). "A qPCR expression assay of IFI44L gene differentiates viral from bacterial infections in febrile children". Scientific Reports. 9 (1): 1–12. doi:10.1038/s41598-019-48162-9. ISSN 2045-2322. PMC 6692396. PMID 31409879.
  6. "Homo sapiens gene FAM89A, encoding family with sequence similarity 89, member A." NCBI AceView.
  7. "Human hg38 chr1:231,018,958-231,040,254 UCSC Genome Browser v397". genome.ucsc.edu. Retrieved 2020-05-03.
  8. "Human hg38 chr1:231,018,958-231,040,254 UCSC Genome Browser v397". genome.ucsc.edu. Retrieved 2020-05-03.
  9. "TRIM67 Gene - GeneCards | TRI67 Protein | TRI67 Antibody". www.genecards.org. Retrieved 2020-05-03.
  10. "ARV1", Wikipedia, 2020-04-14, retrieved 2020-05-03
  11. "ExPASy Bioinformatics Resource Portal Compute pI/MW". ExPASy.
  12. "Statistical Analysis of Protein Sequences (SAPS)".
  13. "I-TASSER results". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-03.
  14. "PSORT II Protein Subcellular Location Prediction".
  15. "MotifFinder".
  16. "PredictProtein Prediction of Physico-Chemical Protein Properties".
  17. "GOR4 Network Protein Sequence Analysis".
  18. "SOPMA Network Protein Sequence Analysis".
  19. "JPred Protein Secondary Structure Prediction Server".
  20. "I-TASSER results". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-03.
  21. Zhang, Yang (2009). "I-TASSER: Fully automated protein structure prediction in CASP8". Proteins: Structure, Function, and Bioinformatics. 77 (S9): 100–113. doi:10.1002/prot.22588. ISSN 1097-0134. PMC 2782770. PMID 19768687.
  22. Roy, Ambrish; Yang, Jianyi; Zhang, Yang (2012-07-01). "COFACTOR: an accurate comparative algorithm for structure-based protein function annotation". Nucleic Acids Research. 40 (W1): W471–W477. doi:10.1093/nar/gks372. ISSN 0305-1048. PMC 3394312. PMID 22570420.
  23. "FAM89A protein (human) - STRING interaction network". string-db.org. Retrieved 2020-05-03.
  24. "UBXN2B Gene - GeneCards | UBX2B Protein | UBX2B Antibody". www.genecards.org. Retrieved 2020-05-03.
  25. "GDS4981 / ILMN_2285817". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  26. "GDS5085 / 7925028". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  27. "Genomatix: Login Page". www.genomatix.de. Retrieved 2020-05-03.
  28. "NCBI GEO GS3113/211045".
  29. "NCBI GEO GDS423".
  30. "FAM89A - Antibodies - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-05-03.
  31. "PredictProtein - Protein Sequence Analysis, Prediction of Structural and Functional Features". www.predictprotein.org. Retrieved 2020-05-03.
  32. "NetPhos Prediction of Phosphorylation in Eukaryotes". ExPASy.
  33. "GPS Computational Prediction of Phosphorylation Sites with their Cognate Protein Kinases". ExPASy.
  34. "SIB myHits Motif Scan". ExPASy.
  35. "NetOGlyc Prediction of Mammalian Mucin Type GalNAc O-Glycosylation Sites". ExPASy.
  36. Arike, Liisa; Hansson, Gunnar C. (2016-08). "The Densely O-Glycosylated MUC2 Mucin Protects the Intestine and Provides Food for the Commensal Bacteria". Journal of Molecular Biology. 428 (16): 3221–3229. doi:10.1016/j.jmb.2016.02.010.
  37. "YinOYang O-beta-GlcNAc Attachment Site Prediction in Eukaryotes". ExPASy.
  38. "NetGlycate Prediction of Glycation of Epsilon Amino Groups of Lysine". ExPASy.
  39. Seetharaman, Shyam (2016), "The Influences of Dietary Sugar and Related Metabolic Disorders on Cognitive Aging and Dementia", Molecular Basis of Nutrition and Aging, Elsevier, pp. 331–344, doi:10.1016/b978-0-12-801816-3.00024-8, ISBN 978-0-12-801816-3, retrieved 2020-05-03
  40. "FAM89B Gene - GeneCards | LRA25 Protein | LRA25 Antibody". www.genecards.org. Retrieved 2020-05-03.
  41. "TRANK1 tetratricopeptide repeat and ankyrin repeat containing 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  42. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-05-03.