= Fiber-seq =

Fiber-seq is a multiplexed sequencing assay in molecular biology that captures genomic and epigenomic information from individual chromatin fibers. It combines enzymatic labelling of accessible DNA with long-read sequencing to map cis-regulatory elements onto the underlying DNA template.

== Background ==

Fiber-seq is a method in chromatin biology that maps chromatin fibers onto their underlying DNA template, capturing cis-regulatory architectures and revealing chromatin structure at near single-nucleotide resolution. Fiber-seq uses the non-specific DNA N6-adenine methyltransferase Hia5 to stencil the structure of chromatin fibers onto their DNA templates. Adenines that are not protected by a bound protein are methylated, creating a DNA stencil of protein occupancy. These DNA stencils are then sequenced by single-molecule long-read circular consensus sequencing. Stretches of DNA with non-methylated adenines indicate bound proteins, and the type of protein can be inferred from the size of the unmethylated 'footprint' and from the underlying DNA sequence.

The readout of Fiber-seq includes the DNA sequence, endogenous DNA methylation, and protein occupancy footprints, revealing the primary architecture of multi-kilobase segments of individual chromatin fibers. Publicly available Fiber-seq analysis pipelines, such as Fibertools, perform quality control and infer regulatory elements.

The Fiber-seq workflow involves the following steps:

1. Treatment of extracted cell nuclei with the non-specific m6A-MTase Hia5 and the methyl donor S-adenosylmethionine (SAM)
2. PCR-free library construction on high-molecular weight DNA extracted from nuclei
3. Single-molecule circular consensus sequencing
4. Data processing and analysis
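
The data-processing step can be illustrated with a minimal sketch of footprint calling on a single read. It assumes the m6A calls are already available as 0-based positions along the read. The function names `call_footprints` and `classify_footprint`, the 20 bp minimum length, and the 120 bp size cut-off are hypothetical choices made for illustration. They are not taken from Fibertools or from the original publication. A realistic caller would also need to account for stretches that contain few adenines and for incomplete methylation.

```python
from typing import List, Tuple


def call_footprints(m6a_positions: List[int], read_length: int,
                    min_len: int = 20) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals that contain no m6A call.

    Only gaps of at least `min_len` bases are reported (assumed threshold).
    """
    footprints = []
    prev = -1  # position of the previous m6A call; -1 marks the read start
    # Appending read_length closes the final gap at the end of the read.
    for pos in sorted(set(m6a_positions)) + [read_length]:
        start, end = prev + 1, pos
        if end - start >= min_len:
            footprints.append((start, end))
        prev = pos
    return footprints


def classify_footprint(length: int, nucleosome_min: int = 120) -> str:
    """Label a footprint by size alone (the cut-off is an assumption)."""
    return "nucleosome-sized" if length >= nucleosome_min else "sub-nucleosomal"


# Toy read of 240 bases with seven m6A calls.
calls = [5, 12, 30, 180, 185, 190, 230]
footprints = call_footprints(calls, read_length=240)
labels = [classify_footprint(end - start) for start, end in footprints]
print(footprints)  # [(31, 180), (191, 230)]
print(labels)      # ['nucleosome-sized', 'sub-nucleosomal']
```

In this toy read, the 149-base gap is consistent with a nucleosome, while the 39-base gap is the size that might be expected for a smaller DNA-binding protein.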

=== Development and history ===
Fiber-seq was first developed and published by Stergachis et al. in the 2020 paper "Single-molecule regulatory architectures captured by chromatin fiber sequencing." Its key advance is coupling non-specific DNA N6-adenine methyltransferases, which stencil the chromatin fiber architecture onto the underlying DNA, with long-read single-molecule sequencing. Together, these generate multiple layers of genomic data, including the DNA sequence, endogenous DNA methylation states, and a protein occupancy map. Both major components of the method, exogenous DNA methylation to tag accessible chromatin and single-molecule long-read sequencing, existed before Fiber-seq; its novelty lies in integrating them.

=== Distinction from existing methods ===
Many chromatin profiling methods fragment chromatin prior to sequencing through enzymatic digestion, transposition, or sonication. Short-read chromatin sequencing methods that fragment DNA prior to sequencing prevent recovery of long-range regulatory architectures present on individual chromatin fibers. For example, ChIP-seq, CUT&RUN, and CUT&Tag rely on antibodies and generate short fragments, while ATAC-seq, DNase-seq, and MNase-seq depend on enzymatic cleavage of accessible DNA. NOMe-seq similarly uses an exogenous methyltransferase to label accessible chromatin, but relies on bisulfite conversion and short-read sequencing, limiting long-range single-molecule resolution. Fiber-seq is distinct in its use of a non-specific adenine methyltransferase (m6A-MTase) to methylate accessible adenine nucleotides coupled with long-read sequencing. Fiber-seq does not require immunoprecipitation, targeted cleavage, or intentional fragmentation prior to sequencing, enabling direct reconstruction of multi-kilobase regulatory architectures on individual DNA molecules.

== Scientific principles ==
- Chromatin architecture and gene expression: In every eukaryotic cell, DNA must be highly compacted to fit into the nucleus. This high level of compaction means that at any given time, only a small portion of the genome is accessible to the transcription machinery. Gene accessibility is dynamic and influenced by multiple factors, including the structure of chromatin, which has several levels of organization. The fundamental repeating unit is the nucleosome, comprising about 147 base pairs of DNA wrapped around a histone octamer; a chain of nucleosomes is often referred to as the nucleosome array or 11 nm fiber. Gene expression can be dynamically modulated by histone modifications, DNA methylation, histone variants, transcription factor binding, and nucleosome positioning. Depending on how accessible the DNA is, regions of chromatin can be classified as euchromatin or heterochromatin. This complex primary chromatin architecture of cis-regulatory elements can be captured by Fiber-seq at single-molecule resolution.
- Hia5 enzyme: A key part of the Fiber-seq workflow is incubating nuclei with the enzyme Hia5 and the methyl donor SAM. Hia5 is a non-sequence-specific adenine methyltransferase that converts adenine residues to N6-methyladenine. In animal genomes, adenines occur at an average frequency approaching one in every two base pairs and are typically devoid of methylation. Hia5 can therefore accurately mark regions of DNA that are not protected by protein, using methyl groups donated by SAM to catalyze exogenous methylation of accessible adenines.
- Long-read sequencing and CCS: Long-read sequencing describes modern DNA sequencing methods that allow single-molecule resolution of long DNA fragments, typically in the range of tens of kilobases, with some ultralong sequencing platforms able to sequence up to 1 Mb. Multiple methods of long-read sequencing exist, but the foundational Fiber-seq experiments used circular consensus sequencing (CCS). CCS involves circularizing long fragments of native DNA, with epigenetic modifications still intact, and sequencing them several times to generate a consensus sequence with extremely high accuracy. The long-read aspect of Fiber-seq is what reveals the cis-regulatory architecture along individual chromatin fibers that underlies gene activation or repression.
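
The idea of building a consensus from repeated passes over the same circular molecule can be illustrated with a toy model. The sketch below assumes the passes are already aligned and of equal length, and it takes a simple per-position majority vote. This is a deliberate simplification for illustration and is not the algorithm used by any sequencing vendor; the function name `majority_consensus` is hypothetical.

```python
from collections import Counter
from typing import List


def majority_consensus(passes: List[str]) -> str:
    """Per-position majority vote over pre-aligned, equal-length passes."""
    if not passes:
        raise ValueError("at least one pass is required")
    if len({len(p) for p in passes}) != 1:
        raise ValueError("passes must be aligned to the same length")
    # zip(*passes) yields one column of bases per template position.
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*passes))


# Five noisy passes over the same 6-base template; three carry one error each.
passes = ["ACGTAC", "ACGAAC", "ACGTAC", "TCGTAC", "ACGTCC"]
print(majority_consensus(passes))  # ACGTAC
```

Because the errors in individual passes fall at different positions, each is outvoted by the other passes, which is the intuition behind the high accuracy of the consensus read.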

== Properties ==
One of the main properties of Fiber-seq is that it does not use nucleases, unlike other common methods for mapping chromatin such as DNase-seq, which uses the nuclease DNase I. Using Hia5, a non-specific N6-adenine DNA methyltransferase, Fiber-seq marks accessible DNA regions without fragmenting the DNA molecule or compromising its structure.
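
The labelling principle can be sketched as an idealized simulation. The example below assumes that every unprotected adenine is methylated with perfect efficiency, which real experiments do not achieve, and it treats a `T` on the forward strand as an adenine on the reverse strand. The function name `stencil` and the toy sequence are hypothetical.

```python
from typing import List, Tuple


def stencil(sequence: str, protected: List[Tuple[int, int]]) -> List[int]:
    """Return 0-based positions where an adenine on either strand is methylated.

    `protected` holds half-open (start, end) intervals occupied by protein.
    Idealized assumption: every accessible adenine is methylated.
    """
    def is_protected(i: int) -> bool:
        return any(start <= i < end for start, end in protected)

    # 'A' is an adenine on the forward strand; 'T' pairs with an adenine
    # on the reverse strand, so both positions can carry m6A.
    return [i for i, base in enumerate(sequence.upper())
            if base in "AT" and not is_protected(i)]


# A protein occupies positions 3 to 8 of a toy 14-base sequence.
marks = stencil("GATTACAGGCATTA", protected=[(3, 9)])
print(marks)  # [1, 2, 10, 11, 12, 13]
```

The DNA molecule itself is left intact: the output is only a set of marked positions, and the gap between positions 2 and 10 is the footprint left by the bound protein.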

To complement this, Fiber-seq uses CCS, a high-fidelity long-read sequencing method with single-nucleotide resolution. CCS can distinguish methylated from unmethylated adenine residues based on the kinetics of the DNA polymerase used during sequencing. As a result, Fiber-seq provides high base-calling accuracy across chromatin fibers averaging 10 to 15 kilobases in length.

Another feature of Fiber-seq is single-molecule resolution. The method maps the primary chromatin architecture of individual DNA fibers, rather than reporting an average over the large populations of chromatin fibers that other methods require.
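
The difference between a population average and a single-molecule readout can be shown with a small worked example. The sketch below uses made-up accessibility values (1 for accessible, 0 for protected) at two regulatory sites on four fibers; the function names and the data are hypothetical.

```python
from typing import List


def bulk_accessibility(fibers: List[List[int]]) -> List[float]:
    """Average accessibility at each site across all fibers (bulk-style view)."""
    return [sum(column) / len(fibers) for column in zip(*fibers)]


def co_accessible_fraction(fibers: List[List[int]], site_a: int,
                           site_b: int) -> float:
    """Fraction of fibers on which both sites are accessible together."""
    both = sum(1 for fiber in fibers if fiber[site_a] and fiber[site_b])
    return both / len(fibers)


# Population 1: the two sites are never open on the same fiber.
exclusive = [[1, 0], [0, 1], [1, 0], [0, 1]]
# Population 2: the two sites are always open or closed together.
coupled = [[1, 1], [0, 0], [1, 1], [0, 0]]

print(bulk_accessibility(exclusive), bulk_accessibility(coupled))
# [0.5, 0.5] [0.5, 0.5]
print(co_accessible_fraction(exclusive, 0, 1),
      co_accessible_fraction(coupled, 0, 1))
# 0.0 0.5
```

Both populations give the same averaged profile, yet the per-fiber view separates mutually exclusive sites from coordinated ones. This is the kind of information that is lost when only a population average is available.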

By integrating non-destructive enzymatic labeling with high-fidelity long-read sequencing, Fiber-seq provides a multi-layered and continuous map of the genome that resolves individual chromatin architectures. One limitation is that, although Fiber-seq can observe regions of protein binding, it cannot directly identify which protein was bound at a given region; protein identity can only be inferred from the size of the footprint and the underlying DNA sequence.

== Applications ==
=== Transcription factor footprinting ===
Because a bound transcription factor (TF) shields the underlying DNA from methylation, Fiber-seq can detect where TFs are bound. This makes it possible to investigate specific TF binding events and long-range chromatin interactions, and to study how transient TF binding affects the accessibility of regulatory DNA. For example, Grasberger et al. used Fiber-seq to study thyroid specimens from participants with resistance to thyrotropin and found that mutated short tandem repeats were associated with increased TF footprinting at a thyroid-specific enhancer cluster.

=== Chromatin mapping of complex tissues ===
In complex tissues, other chromatin mapping methods such as DNase-seq and ATAC-seq suffer from cleavage biases, reduced resolution, and overestimation of the size of nucleosome-depleted sequences. To overcome these challenges, Peter et al. applied Fiber-seq to neuronal and non-neuronal cells from human brain tissue and identified 20,000 accessible chromatin regions that could not be mapped using short-read epigenomic sequencing. The authors wrote: "Fiber-seq allows resolution of individual nucleosomes (147 bp) and internucleosomal linker DNA on the single chromatin fibers, which enabled us to confirm principles of nucleosomal organization hitherto elusive to assess in complex tissues and established only in simple eukaryotes and cells in culture."

=== Chromatin mapping of highly repetitive sequences ===
Because it uses long-read sequencing, Fiber-seq maps repetitive genomic regions more accurately than short-read methods. In a study of the maize genome, researchers using Fiber-seq identified and mapped a significantly greater number of repetitive transposable elements (TEs) than with ATAC-seq. In another study on centromeres, which are flanked by highly repetitive regions, Dubocanin et al. used Fiber-seq to identify chromatin architectures within centromeres at single-molecule and near-single-nucleotide resolution.
