FAIRE-Seq


FAIRE-Seq (Formaldehyde-Assisted Isolation of Regulatory Elements) is a method in molecular biology used for determining the sequences of DNA regions in the genome associated with regulatory activity.[1] The technique was developed in the laboratory of Jason D. Lieb at the University of North Carolina, Chapel Hill. In contrast to DNase-seq, the FAIRE-seq protocol does not require the permeabilization of cells or the isolation of nuclei, and can be applied to any cell type. In a study of seven diverse human cell types, DNase-seq and FAIRE-seq strongly cross-validated each other, with open chromatin covering 1-2% of the human genome in each cell type.

Workflow

The protocol relies on the fact that formaldehyde cross-linking is more efficient in nucleosome-bound DNA than in nucleosome-depleted regions of the genome. The method separates out the non-cross-linked DNA, which is usually found in open chromatin, and this DNA is then sequenced. The protocol consists of cross-linking, phenol extraction, and sequencing of the DNA in the aqueous phase.

FAIRE

FAIRE uses the biochemical properties of protein-bound DNA to separate nucleosome-depleted regions of the genome. Cells are first subjected to cross-linking, which fixes the interactions between nucleosomes and DNA. After sonication, the fragmented and fixed DNA is separated using a phenol-chloroform extraction. This extraction creates two phases, an organic phase and an aqueous phase. Because of their biochemical properties, DNA fragments cross-linked to nucleosomes preferentially sit in the organic phase, whereas nucleosome-depleted or 'open' regions are found in the aqueous phase. By specifically extracting the aqueous phase, only nucleosome-depleted regions are purified and enriched.[1]

Sequencing

FAIRE-extracted DNA fragments can be analyzed in a high-throughput way using next-generation sequencing techniques. In general, libraries are prepared by ligating specific adapters to the DNA fragments. The adapters allow the fragments to cluster on the sequencing platform and be amplified, after which the sequences of millions of fragments are read in parallel.

Depending on the size of the genome on which FAIRE-seq is performed, a minimum number of reads is required to achieve adequate coverage, ensuring that a proper signal can be detected.[2][3] In addition, a reference or input sample that has not been cross-linked is often sequenced alongside to determine the level of background noise.
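
The dependence of read number on genome size can be illustrated with the standard average-coverage estimate C = N × L / G (number of reads times read length, divided by genome size). The Python sketch below is only a rough guide: the genome size and read length are assumed illustrative values, and because FAIRE enriches a subset of the genome, actual coverage is far from uniform.

```python
import math


def expected_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average per-base coverage, C = N * L / G."""
    return n_reads * read_length / genome_size


def reads_for_coverage(target: float, read_length: int, genome_size: int) -> int:
    """Smallest number of reads whose average coverage reaches the target."""
    return math.ceil(target * genome_size / read_length)


# Assumed, illustrative values (not taken from any FAIRE-seq protocol):
# a genome of about 3.1 Gb and 50 bp single-end reads.
GENOME_SIZE = 3_100_000_000
READ_LENGTH = 50

print(expected_coverage(30_000_000, READ_LENGTH, GENOME_SIZE))
print(reads_for_coverage(1.0, READ_LENGTH, GENOME_SIZE))
```

The same function applied to a much smaller genome returns a proportionally smaller read requirement, which is why the minimum depth is usually stated per organism.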

Note that the extracted FAIRE fragments can alternatively be quantified by quantitative PCR. However, this method does not allow genome-wide, high-throughput quantification of the extracted fragments.

Sensitivity

Several aspects of FAIRE-seq require attention when analysing and interpreting the data. For one, FAIRE-seq has been reported to give higher coverage at enhancer regions than at promoter regions.[4] This is in contrast to the alternative method, DNase-seq, which is known to show a higher sensitivity towards promoter regions. In addition, FAIRE-seq has been reported to show a preference for internal introns and exons.[5] In general, FAIRE-seq data are also believed to display a higher background level, making it a less sensitive method.[6]

Computational analysis

First, FAIRE-seq reads are mapped to the reference genome of the model organism used.

Next, genomic regions with open chromatin are identified using a peak-calling algorithm. Several tools are available for this, such as ChIPOTle,[7] ZINBA[8] and MACS2.[9] ChIPOTle uses a sliding window of 300 bp to identify statistically significant signals. In contrast, MACS2 identifies enriched signal through its callpeak subcommand, combined with options such as --broad, --broad-cutoff, --nomodel or --shift. ZINBA is a generic algorithm for the detection of enrichment in short-read datasets.[10] It thus helps in the accurate detection of signal in complex datasets with a low signal-to-noise ratio.
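
The sliding-window idea can be sketched in a few lines of Python. This is a simplified illustration and not ChIPOTle's actual statistical model: it assumes a uniform Poisson background, and the function names, step size and significance threshold are hypothetical choices made for the example.

```python
import math
from bisect import bisect_left
from typing import List, Tuple


def poisson_sf(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed from the cumulative sum."""
    if k <= 0:
        return 1.0
    cdf = 0.0
    term = math.exp(-lam)  # P(X = 0)
    for i in range(k):
        cdf += term
        term *= lam / (i + 1)  # P(X = i + 1) from P(X = i)
    return max(0.0, 1.0 - cdf)


def enriched_windows(
    read_starts: List[int],
    chrom_length: int,
    window: int = 300,   # window width in bp, as mentioned for ChIPOTle
    step: int = 100,     # assumed step size
    alpha: float = 1e-5, # assumed significance threshold
) -> List[Tuple[int, int, int, float]]:
    """Return (start, end, read_count, p_value) for significantly enriched windows."""
    starts = sorted(read_starts)
    # Expected reads per window if reads were spread uniformly (assumption).
    lam = len(starts) * window / chrom_length
    hits = []
    for w_start in range(0, chrom_length - window + 1, step):
        w_end = w_start + window
        # Count read starts in the half-open interval [w_start, w_end).
        count = bisect_left(starts, w_end) - bisect_left(starts, w_start)
        p_value = poisson_sf(count, lam)
        if p_value < alpha:
            hits.append((w_start, w_end, count, p_value))
    return hits


# Toy data: sparse background reads plus a cluster of reads near position 5000.
toy_reads = list(range(0, 10_000, 500)) + list(range(5000, 5030))
for hit in enriched_windows(toy_reads, chrom_length=10_000):
    print(hit)
```

Real peak callers additionally correct for multiple testing and model local background, often using the input control, which this sketch omits.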

BEDTools[11] is used to merge enriched regions that lie close to each other into COREs (clusters of open regulatory elements). This helps to identify chromatin-accessible regions and gene regulation patterns that would otherwise be undetectable, given the lower resolution that FAIRE-seq often brings with it.
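
The merging step can be illustrated with a short Python sketch that joins peaks separated by no more than a chosen gap, similar in spirit to what an interval-merging tool does. The function name and the gap threshold are hypothetical; the distance actually used to define COREs is not specified here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open coordinates


def merge_nearby(intervals: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge intervals on the same chromosome whose gap is at most max_gap bp."""
    by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged: List[Interval] = []
    for chrom in sorted(by_chrom):
        regions = sorted(by_chrom[chrom])
        cur_start, cur_end = regions[0]
        for start, end in regions[1:]:
            if start - cur_end <= max_gap:
                # Close enough: extend the current cluster.
                cur_end = max(cur_end, end)
            else:
                # Too far apart: emit the cluster and start a new one.
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged


# Toy peaks; the 100 bp gap is an arbitrary value chosen for illustration.
peaks = [
    ("chr1", 100, 400),
    ("chr1", 450, 700),
    ("chr1", 5000, 5300),
    ("chr2", 100, 400),
]
print(merge_nearby(peaks, max_gap=100))
```

With a gap of 0, only overlapping or directly adjacent peaks are joined, so the choice of gap determines how broad the resulting clusters become.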

Data is typically visualized as tracks (e.g. bigWig) and can be uploaded to the UCSC genome browser.[12]

The major limitation of this method, i.e. the low signal-to-noise ratio compared to other chromatin accessibility assays, makes the computational interpretation of these data very difficult.[13]

Alternative methods

Several methods can be used as alternatives to FAIRE-seq. DNase-seq uses the ability of the DNase I enzyme to cleave free (open, accessible) DNA to identify and sequence open chromatin.[14][15] The more recently developed ATAC-seq employs the Tn5 transposase, which inserts specified fragments or transposons into accessible regions of the genome to identify and sequence open chromatin.[16]

References

  1. Giresi, PG; Kim, J; McDaniell, RM; Iyer, VR; Lieb, JD (Jun 2007). "FAIRE (Formaldehyde-Assisted Isolation of Regulatory Elements) isolates active regulatory elements from human chromatin". Genome Research. 17 (6): 877–85. doi:10.1101/gr.5533506. PMC 1891346. PMID 17179217.
  2. Landt, Stephen G.; Marinov, Georgi K.; Kundaje, Anshul; Kheradpour, Pouya; Pauli, Florencia; Batzoglou, Serafim; Bernstein, Bradley E.; Bickel, Peter; Brown, James B. (2012-09-01). "ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia". Genome Research. 22 (9): 1813–1831. doi:10.1101/gr.136184.111. ISSN 1549-5469. PMC 3431496. PMID 22955991.
  3. Sims, David; Sudbery, Ian; Ilott, Nicholas E.; Heger, Andreas; Ponting, Chris P. "Sequencing depth and coverage: key considerations in genomic analyses". Nature Reviews Genetics. 15 (2): 121–132. doi:10.1038/nrg3642. PMID 24434847.
  4. Kumar, Vibhor; Muratani, Masafumi; Rayan, Nirmala Arul; Kraus, Petra; Lufkin, Thomas; Ng, Huck Hui; Prabhakar, Shyam (2013-07-01). "Uniform, optimal signal processing of mapped deep-sequencing data". Nature Biotechnology. 31 (7): 615–622. doi:10.1038/nbt.2596. ISSN 1546-1696. PMID 23770639.
  5. Song, Lingyun; Zhang, Zhancheng; Grasfeder, Linda L.; Boyle, Alan P.; Giresi, Paul G.; Lee, Bum-Kyu; Sheffield, Nathan C.; Gräf, Stefan; Huss, Mikael (2011-10-01). "Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity". Genome Research. 21 (10): 1757–1767. doi:10.1101/gr.121541.111. ISSN 1088-9051. PMC 3202292. PMID 21750106.
  6. Tsompana, Maria; Buck, Michael J (2014-11-20). "Chromatin accessibility: a window into the genome". Epigenetics & Chromatin. 7 (1): 33. doi:10.1186/1756-8935-7-33. PMC 4253006. PMID 25473421.
  7. Buck, Michael J; Nobel, Andrew B; Lieb, Jason D (2005-01-01). "ChIPOTle: a user-friendly tool for the analysis of ChIP-chip data". Genome Biology. 6 (11): R97. doi:10.1186/gb-2005-6-11-r97. ISSN 1465-6906. PMC 1297653. PMID 16277752.
  8. Rashid, Naim U.; Giresi, Paul G.; Ibrahim, Joseph G.; Sun, Wei; Lieb, Jason D. (2011-01-01). "ZINBA integrates local covariates with DNA-seq data to identify broad and narrow regions of enrichment, even within amplified genomic regions". Genome Biology. 12 (7): R67. doi:10.1186/gb-2011-12-7-r67. ISSN 1474-760X. PMC 3218829. PMID 21787385.
  9. Zhang, Yong; Liu, Tao; Meyer, Clifford A.; Eeckhoute, Jérôme; Johnson, David S.; Bernstein, Bradley E.; Nusbaum, Chad; Myers, Richard M.; Brown, Myles (2008-01-01). "Model-based analysis of ChIP-Seq (MACS)". Genome Biology. 9 (9): R137. doi:10.1186/gb-2008-9-9-r137. ISSN 1474-760X. PMC 2592715. PMID 18798982.
  10. Koohy, Hashem; Down, Thomas A.; Spivakov, Mikhail; Hubbard, Tim. "A Comparison of Peak Callers Used for DNase-Seq Data". PLoS ONE. 9 (5): e96303. doi:10.1371/journal.pone.0096303. PMC 4014496. PMID 24810143.
  11. Quinlan, Aaron R.; Hall, Ira M. (2010-03-15). "BEDTools: a flexible suite of utilities for comparing genomic features". Bioinformatics. 26 (6): 841–842. doi:10.1093/bioinformatics/btq033. ISSN 1367-4803. PMC 2832824. PMID 20110278.
  12. Hinrichs, A. S.; Karolchik, D.; Baertsch, R.; Barber, G. P.; Bejerano, G.; Clawson, H.; Diekhans, M.; Furey, T. S.; Harte, R. A. (2006-01-01). "The UCSC Genome Browser Database: update 2006". Nucleic Acids Research. 34 (suppl 1): D590–D598. doi:10.1093/nar/gkj144. ISSN 0305-1048. PMC 1347506. PMID 16381938.
  13. Tsompana, Maria; Buck, Michael J (2014-11-20). "Chromatin accessibility: a window into the genome". Epigenetics & Chromatin. 7 (1): 33. doi:10.1186/1756-8935-7-33. PMC 4253006. PMID 25473421.
  14. Boyle, Alan P.; Davis, Sean; Shulha, Hennady P.; Meltzer, Paul; Margulies, Elliott H.; Weng, Zhiping; Furey, Terrence S.; Crawford, Gregory E. (2008-01-25). "High-resolution mapping and characterization of open chromatin across the genome". Cell. 132 (2): 311–322. doi:10.1016/j.cell.2007.12.014. ISSN 1097-4172. PMC 2669738. PMID 18243105.
  15. Crawford, Gregory E.; Holt, Ingeborg E.; Whittle, James; Webb, Bryn D.; Tai, Denise; Davis, Sean; Margulies, Elliott H.; Chen, YiDong; Bernat, John A. (2006-01-01). "Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS)". Genome Research. 16 (1): 123–131. doi:10.1101/gr.4074106. ISSN 1088-9051. PMC 1356136. PMID 16344561.
  16. Buenrostro, Jason D.; Giresi, Paul G.; Zaba, Lisa C.; Chang, Howard Y.; Greenleaf, William J. (2013-12-01). "Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position". Nature Methods. 10 (12): 1213–1218. doi:10.1038/nmeth.2688. ISSN 1548-7105. PMC 3959825. PMID 24097267.