= FAM200C =

Family with Sequence Similarity 200 Member C (FAM200C) is a protein that in humans is encoded by the FAM200C gene. The primary aliases of the gene are ZBED8, C5orf54, and Buster3.

== Gene ==
In the human genome, FAM200C is located on the minus strand of chromosome 5, at 5q33.3. FAM200C can be transcribed into 2 different transcript variants, which contain 3 and 2 exons, respectively.

=== Expression ===
FAM200C is expressed ubiquitously and variably in human tissues, with a 10-fold difference between the lowest and highest expression values (~0.23-2.3). FAM200C has the highest tissue expression in the ovaries, followed by the endometrium, thyroid, testis, and prostate.

== mRNA ==
FAM200C mRNA has 2 transcript variants. FAM200C Variant 1 is the longer of the two, spanning 2,882 nucleotides.
**FAM200C Transcript Variants**

| Transcript Variant | Variant Length (nt) | Accession Number | Protein | Protein Length (aa) |
| FAM200C Variant 1 | 2,882 | NM_001303251.2 | NP_001290180.1 | 594 |
| FAM200C Variant 2 | 2,808 | NM_022090.5 | NP_071373.2 | 594 |

== Protein ==
The Family with Sequence Similarity 200 Member C protein in Homo sapiens is encoded by the FAM200C gene. The FAM200C protein has a predicted molecular weight of 68,326.71 Da and a theoretical isoelectric point of 5.98. The protein is localized to the nucleoplasm.

=== Promoter ===
A candidate promoter region was identified using the UCSC Genome Browser. The most likely promoter region for FAM200C spans positions 160,399,954 to 160,400,554, a length of 601 base pairs.
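The 601 bp figure follows from counting both endpoints of the interval. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the helper name `inclusive_length` is hypothetical):

```python
def inclusive_length(start: int, end: int) -> int:
    """Length of a 1-based, fully closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are counted, hence the +1.
    return end - start + 1


# Promoter coordinates as quoted in the text (assumed 1-based, inclusive).
promoter_start = 160_399_954
promoter_end = 160_400_554

promoter_length = inclusive_length(promoter_start, promoter_end)
print(promoter_length)  # 601
```

With half-open coordinates (as used in BED files) the same region would be written with a start one lower, so the convention should be checked before comparing coordinates across tools.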

=== Protein interactions ===
**FAM200C Predicted Functional Partners from STRING**

| Protein | Full Name | Description | Score |
| METTL21A | Methyltransferase 21A, HSPA Lysine | Enables ATPase binding activity, Hsp70 protein binding activity, and protein-lysine N-methyltransferase activity | 0.732 |
| PGBD2 | PiggyBac transposable element derived 2 | Protein-coding gene, that interacts directly with DNA. | 0.692 |
| THAP9 | THAP domain containing 9 | Enables sequence-specific DNA binding activity and transposase activity. | 0.586 |
| GIN1 | Gypsy retrotransposon integrase 1 | Predicted to enable nucleic acid binding activity. | 0.541 |
| CRLF3 | Cytokine receptor like factor 3 | Encodes a cytokine receptor-like factor that may negatively regulate cell cycle progression at the G0/G1 phase. | 0.505 |
| ZNF862 | Zinc finger protein 862 | Predicted to enable protein dimerization activity and zinc ion binding activity. | 0.501 |
| POGK | Pogo transposable element derived with KRAB domain | Contains a KRAB domain at the N-terminus and a transposase domain at the C-terminus. | 0.499 |
| PGBD1 | PiggyBac transposable element derived 1 | Belongs to the subfamily of piggyBac transposable element derived genes; expressed in the brain. | 0.489 |
| PGBD5 | PiggyBac transposable element derived 5 | Belongs to the subfamily of piggyBac transposable element derived genes. | 0.477 |
| NAIF1 | Nuclear apoptosis inducing factor 1 | Predicted to be involved in negative regulation of cell growth and regulation of mitochondrial membrane permeability involved in apoptotic process. | 0.460 |
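To put the scores in context, the sketch below bins them using the confidence cutoffs that STRING offers in its interface (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The scores are copied from the table above; the function and variable names are hypothetical.

```python
# Combined scores copied from the table above.
partners = {
    "METTL21A": 0.732, "PGBD2": 0.692, "THAP9": 0.586, "GIN1": 0.541,
    "CRLF3": 0.505, "ZNF862": 0.501, "POGK": 0.499, "PGBD1": 0.489,
    "PGBD5": 0.477, "NAIF1": 0.460,
}


def confidence_bin(score: float) -> str:
    """Map a STRING combined score to its confidence bin."""
    if score >= 0.9:
        return "highest"
    if score >= 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    if score >= 0.15:
        return "low"
    return "below low-confidence cutoff"


binned = {name: confidence_bin(score) for name, score in partners.items()}
high_confidence = [n for n, b in binned.items() if b in ("high", "highest")]

for name, label in binned.items():
    print(f"{name:<9} {partners[name]:.3f}  {label}")
```

Under these cutoffs only METTL21A falls in the high-confidence bin; the remaining nine partners are medium-confidence predictions.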

=== Structure ===

==== Secondary structure ====
The figures "FAM200C Predicted Secondary Structure" 1 and 2 show a Phyre2.2 model prediction of FAM200C's secondary structure. The model predicts a secondary structure composed of 45% alpha helices, 8% beta strands, and 9% disordered regions.

==== Tertiary structure ====
Tertiary structure predictions for FAM200C are available through Phyre2.2 and AlphaFold.

== Gene ontology ==
The table titled "FAM200C Mature miRNA Sequences by Target Score" lists the 8 mature miRNA sequences for FAM200C available through text mining.
**FAM200C Mature miRNA Sequences by Target Score**

| miRNA Name | Target Score | Seed Location | miRNA Sequence |
| hsa-miR-767-5p | 76 | 291 | UGCACCAUGGUUGUCUGAGCAUG |
| hsa-miR-627-3p | 73 | 171, 321 | UCUUUUCUUUGAGACUCACU |
| hsa-miR-550a-3p | 70 | 454 | UGUCUUACUCCCUCAGGCACAU |
| hsa-miR-4524a-3p | 65 | 64 | UGAGACAGGCUUAUGCUGCUAU |
| hsa-miR-942-5p | 62 | 326 | UCUUCUCUGUUUUGGCCAUGUG |
| hsa-miR-449c-5p | 60 | 188 | UAGGCAGUGUAUUGCUAGCGGCUGU |
| hsa-miR-34b-5p | 60 | 188 | UAGGCAGUGUCAUUAGCUGAUUG |
| hsa-miR-376c-3p | 60 | 95 | AACAUAGAGGAAAUUCCACGU |
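Two rows in the table, hsa-miR-449c-5p and hsa-miR-34b-5p, share the same seed location (188). A short sketch of why that can happen: the seed is commonly defined as nucleotides 2 to 8 of the mature miRNA, and miRNAs with identical seeds are expected to pair with the same target site. The sequences are copied from the table; the 2 to 8 window and the helper names are assumptions of this example, not details given by the source.

```python
from collections import defaultdict

# Mature miRNA sequences copied from the table above.
mirnas = {
    "hsa-miR-767-5p": "UGCACCAUGGUUGUCUGAGCAUG",
    "hsa-miR-627-3p": "UCUUUUCUUUGAGACUCACU",
    "hsa-miR-550a-3p": "UGUCUUACUCCCUCAGGCACAU",
    "hsa-miR-4524a-3p": "UGAGACAGGCUUAUGCUGCUAU",
    "hsa-miR-942-5p": "UCUUCUCUGUUUUGGCCAUGUG",
    "hsa-miR-449c-5p": "UAGGCAGUGUAUUGCUAGCGGCUGU",
    "hsa-miR-34b-5p": "UAGGCAGUGUCAUUAGCUGAUUG",
    "hsa-miR-376c-3p": "AACAUAGAGGAAAUUCCACGU",
}


def seed(sequence: str, start: int = 2, end: int = 8) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a mature miRNA."""
    return sequence[start - 1:end]


# Group miRNA names by their seed sequence.
by_seed = defaultdict(list)
for name, sequence in mirnas.items():
    by_seed[seed(sequence)].append(name)

# Keep only seeds shared by more than one miRNA.
shared = {s: names for s, names in by_seed.items() if len(names) > 1}
print(shared)
```

Here the only shared seed is AGGCAGU, carried by hsa-miR-449c-5p and hsa-miR-34b-5p, which is consistent with the two rows reporting the same seed location.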

== Evolutionary history ==
FAM200C first arose around 94 million years ago. FAM200C is part of the Ribonuclease H-like superfamily. A research team at the Swedish University of Agricultural Sciences led by Dr. Alexander Hayward and Dr. Awaisa Ghazal, studying the evolutionary origins of the ZBED genes, published a paper in PLOS One reporting that: "ZBED proteins, such as C5ORF54, or ZBED8, originated from domesticated hAT DNA transposons and encode regulatory proteins with diverse, fundamental functions in vertebrates."

=== Orthologs ===
FAM200C orthologs were found exclusively in mammals. The most distantly related ortholog, Talpa occidentalis, has two transcript variants.
**Orthologs of FAM200C protein in order of increasing divergence from Homo sapiens**

| Taxonomic Class | Taxonomic Order | Genus and Species | Common Name | Date of Divergence (MYA) | Accession Number | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%) |
| Mammalia | Primates | Homo sapiens | Human | 0 | NP_001290180.1 | 594 | 100 | 100 |
| Mammalia | Primates | Gorilla gorilla gorilla | Gorilla | 8.6 | XP_004042980.2 | 594 | 99 | 100 |
| Mammalia | Primates | Macaca nemestrina | Pig-tailed macaque | 28.8 | XP_001084430.1 | 593 | 98 | 99 |
| Mammalia | Primates | Trachypithecus francoisi | Francois' langur | 28.8 | XP_033036286.1 | 593 | 98 | 99 |
| Mammalia | Primates | Cebus imitator | Panamanian white-faced capuchin | 43 | XP_017357604.1 | 594 | 97 | 100 |
| Mammalia | Primates | Saimiri boliviensis | Bolivian squirrel monkey | 43 | XP_074246649.1 | 633 | 97 | 100 |
| Mammalia | Primates | Otolemur garnettii | Small-eared galago | 74 | XP_012664265.1 | 594 | 94 | 100 |
| Mammalia | Artiodactyla | Camelus dromedarius | Arabian camel | 94 | XP_010991120.3 | 594 | 94 | 100 |
| Mammalia | Perissodactyla | Equus quagga | Plains zebra | 94 | XP_046524538.1 | 641 | 94 | 100 |
| Mammalia | Carnivora | Leopardus geoffroyi | Geoffroy's cat | 94 | XP_045358866.1 | 594 | 94 | 100 |
| Mammalia | Carnivora | Mirounga leonina | Southern elephant seal | 94 | XP_034869533.1 | 594 | 94 | 100 |
| Mammalia | Carnivora | Zalophus californianus | California sea lion | 94 | XP_027462964.1 | 639 | 94 | 100 |
| Mammalia | Carnivora | Vulpes lagopus | Arctic fox | 94 | XP_041602806.1 | 593 | 94 | 100 |
| Mammalia | Carnivora | Neogale vison | American mink | 94 | XP_044086173.1 | 594 | 93 | 100 |
| Mammalia | Carnivora | Mustela lutreola | European mink | 94 | XP_059030193.1 | 549 | 93 | 100 |
| Mammalia | Chiroptera | Pteronotus mesoamericanus | Pteronotus parnellii mesoamericanus | 94 | XP_054423860.1 | 594 | 93 | 100 |
| Mammalia | Chiroptera | Molossus molossus | Pallas' mastiff bat | 94 | XP_036098001.1 | 594 | 92 | 100 |
| Mammalia | Eulipotyphla | Talpa occidentalis | Iberian mole | 94 | XP_036098001.1 | 594 | 92 | 100 |

=== Paralogs ===
FAM200C has 18 paralogs. Based on target % identity to FAM200C (>20%), the three most significant paralogs are FAM200A, FAM200B, and ZBED5.
**FAM200C Paralogs with >20% Target Identity**

| Name | Full Name | Target % Identity | Sequence Length (aa) | Accession Number | Location |
| FAM200A | Family with Sequence Similarity 200 Member A | 29.49 | 573 | ENSG00000221909 | 7:99,546,300-99,559,392:-1 |
| FAM200B | Family with Sequence Similarity 200 Member B | 29.70 | 657 | ENSG00000237765 | 4:15,681,506-15,690,447:1 |
| ZBED5 | Zinc Finger BED-type Containing 5 | 27.13 | 693 | ENSG00000236287 | 11:10,812,074-10,858,796:-1 |
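The Location column uses the form `chromosome:start-end:strand`, where a strand of 1 is the forward strand and -1 the reverse strand. A minimal sketch of parsing those strings and computing the genomic span of each paralog follows; the `Location` type and `parse_location` helper are hypothetical names, and the span assumes 1-based inclusive coordinates.

```python
from typing import NamedTuple


class Location(NamedTuple):
    chromosome: str
    start: int
    end: int
    strand: int  # 1 = forward strand, -1 = reverse strand


def parse_location(text: str) -> Location:
    """Parse 'chromosome:start-end:strand', tolerating thousands separators."""
    chromosome, span, strand = text.split(":")
    start, end = (int(part.replace(",", "")) for part in span.split("-"))
    return Location(chromosome, start, end, int(strand))


# Location strings copied from the table above.
paralogs = {
    "FAM200A": "7:99,546,300-99,559,392:-1",
    "FAM200B": "4:15,681,506-15,690,447:1",
    "ZBED5": "11:10,812,074-10,858,796:-1",
}

parsed = {name: parse_location(text) for name, text in paralogs.items()}
# Inclusive span of each locus in base pairs.
spans = {name: loc.end - loc.start + 1 for name, loc in parsed.items()}

for name, loc in parsed.items():
    print(name, loc, f"{spans[name]:,} bp")
```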

=== Multiple sequence alignment ===
The figure titled "Snippet of FAM200C Orthologs Multiple Sequence Alignment" shows a portion of the multiple sequence alignment for FAM200C orthologs. Most of the sequence in this portion is conserved across the orthologs, illustrating the high conservation of FAM200C.

=== Protein divergence ===
As shown in the figure titled "Informational context of FAM200C human protein...", the human FAM200C protein is evolving slowly in comparison to both fibrinogen alpha and cytochrome c.
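A rough sketch of how divergence values of this kind can be derived from the ortholog table: the observed divergence n is 100 minus the percent identity, and a commonly used correction for multiple substitutions at the same site is m = -100 ln(1 - n/100). The source does not state how the figure was produced, so the correction, the selection of rows, and the function name below are assumptions of this example. No values for fibrinogen alpha or cytochrome c are given in the text, so none are included.

```python
import math

# (species, date of divergence in MYA, sequence identity in %) taken from
# a subset of rows in the ortholog table.
orthologs = [
    ("Gorilla gorilla gorilla", 8.6, 99),
    ("Macaca nemestrina", 28.8, 98),
    ("Cebus imitator", 43, 97),
    ("Otolemur garnettii", 74, 94),
    ("Camelus dromedarius", 94, 94),
    ("Talpa occidentalis", 94, 92),
]


def corrected_divergence(identity_percent: float) -> float:
    """Corrected divergence m = -100 * ln(1 - n/100), with n = 100 - identity."""
    observed = 100.0 - identity_percent
    return -100.0 * math.log(1.0 - observed / 100.0)


for species, mya, identity in orthologs:
    n = 100 - identity
    m = corrected_divergence(identity)
    print(f"{species:<26} {mya:>5} MYA  n={n:>2}%  m={m:5.2f}")
```

Plotting m against the date of divergence for FAM200C and for reference proteins would give a comparison of relative rates; for these rows m stays below 9 even at 94 million years of divergence.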

== Conceptual translation ==
The figure titled "Conceptual Translation for FAM200C" shows the conceptual translation (full mRNA and amino acid sequence) of human FAM200C transcript variant 1, annotated with exon boundaries, domains and motifs, polyadenylation sites, and phosphorylation sites, along with a legend.

== Single nucleotide polymorphisms ==
There are 3,734 single nucleotide polymorphisms catalogued in NCBI's Variation Viewer, only one of which has a clinical significance record. That variant, rs61740683, is a synonymous single nucleotide variant classified as benign.

== Clinical significance ==
The FAM200C promoter may have a dual function: it initiates transcription of the FAM200C gene and may also act as an enhancer for the gene miR-146a. In a 2023 study published in Arthritis & Rheumatology, a journal of the American College of Rheumatology, Xinyi Zhu et al. found that when the FAM200C gene was knocked down, the expression levels of miR-146a were unaffected. This suggests that promoter regions can also function as enhancers and regulate the expression of genes in close proximity.

FAM200C is also a potential biomarker for sarcoidosis. In a 2020 study published in Medical Science Monitor, Min Zhao et al. identified FAM200C as a "SARC-only DEG" (differentially expressed gene). The researchers found that FAM200C is up-regulated in sarcoidosis, meaning that its expression is higher in patients with sarcoidosis than in patients with pulmonary tuberculosis or in healthy controls.
