Active regulatory genomic regions are characterized by accessible chromatin: at these sites the tightly packed genome is unwound and free to interact with transcription factors (TFs), which modulates the expression of nearby genes. However, disentangling which regions are involved in gene regulation in which cell type is challenging, because this requires measuring chromatin accessibility at single cell resolution rather than at the tissue or organism level. We and others have previously developed techniques to measure chromatin accessibility at single cell resolution in up to 10^4 cells. Here we introduce a new method (‘sci-ATAC-seq3’) that uses three levels of combinatorial indexing to label each cell with a distinct set of barcodes, which allows us to profile up to 1 million single cells per experiment.
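
To illustrate why adding indexing levels scales the number of cells that can be profiled, the sketch below multiplies the number of barcodes used in each round and estimates how often two cells would receive the same barcode set. The round sizes (96, 384, 384) are hypothetical placeholders, not the layout used in the study, and the collision estimate assumes every cell draws its barcode combination uniformly at random, which simplifies the real protocol.

```python
from math import prod


def barcode_combinations(barcodes_per_round):
    """Number of distinct barcode sets from independent indexing rounds."""
    return prod(barcodes_per_round)


def expected_collision_fraction(n_cells, n_combinations):
    """Probability that a given cell shares its barcode set with at least
    one other cell, assuming uniform random assignment (a simplification)."""
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


# Hypothetical round sizes, chosen only for illustration.
rounds = [96, 384, 384]
combos = barcode_combinations(rounds)
collisions = expected_collision_fraction(1_000_000, combos)
print(f"{combos} barcode sets, collision fraction ~{collisions:.3f}")
```

Under this simplified model, each extra indexing round multiplies the available barcode space, so the same number of cells collides far less often than with two rounds.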

Figure: Mapping chromatin accessibility and gene expression at single cell resolution in the human body provides a reference map of cell types, candidate regulatory regions, cell fate regulators etc. for studying human gene regulation in vivo. Artist rendition by Dani Bergey.
We applied this method to 15 human fetal tissues, collecting ~800,000 high-quality single cell chromatin accessibility profiles. After grouping cells by similarity of their chromatin profiles into distinct clusters or cell types, we annotated them using single cell gene expression data we generated for the same tissue samples (see companion paper by Cao et al.). We asked which motifs (short DNA sequence ‘words’) found in the accessible chromatin of each cell best explain its cell type affiliation, and nominated the key TFs that recognize these motifs for each cell type. This revealed both known and potentially novel regulators of cell fate specification and/or maintenance. Depending on whether TF expression and the accessibility of the cognate motif were positively or negatively correlated across cell types, each TF could be putatively assigned as an activator or a repressor of gene expression.
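
The activator/repressor assignment described above can be sketched as a sign test on a correlation coefficient. This is a minimal illustration, assuming one expression value and one motif accessibility score per cell type. The function names, the use of Pearson correlation, and the `min_abs_r` cutoff are assumptions made for the example, not the study's actual procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def classify_tf(expression, motif_accessibility, min_abs_r=0.3):
    """Label a TF from the sign of the correlation across cell types.

    min_abs_r is a hypothetical cutoff below which no label is assigned.
    """
    r = pearson(expression, motif_accessibility)
    if r >= min_abs_r:
        return "putative activator", r
    if r <= -min_abs_r:
        return "putative repressor", r
    return "unassigned", r


# Toy values for one TF across four cell types (invented for illustration).
label, r = classify_tf([0.1, 0.5, 2.0, 3.2], [0.9, 0.7, 0.3, 0.1])
print(label, round(r, 2))
```

A positive correlation means the motif tends to be open where the TF is expressed, consistent with an activating role. A negative correlation means the motif tends to be closed where the TF is expressed, consistent with repression.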
Since we profiled several tissues in the same experiment, we could also compare chromatin accessibility from the same cell type across multiple tissues of origin. This revealed that whereas the chromatin landscapes of e.g. blood cell types are highly similar across organs, endothelial cells exhibit organ-specific chromatin accessibility, in line with their more specialized functions. This organ-specific accessibility appears to be controlled combinatorially by several TFs with overlapping expression patterns.
In total we detected 1.05 million accessible sites, spanning 532 Mb or 17% of the reference human genome. Measuring co-accessibility of these sites within single cells allowed us to score cell type–specific links between candidate enhancers and genes. A majority of genetic variants associated with common diseases fall into non-coding regions of the genome, making them difficult to interpret. We were able to detect enrichment of heritability for specific common human diseases within cell type–specific accessible sites, revealing which cell types might be contributing to disease progression. In addition, comparisons with existing chromatin accessibility data in corresponding adult tissues allowed us to identify fetal-specific cell subtypes and, for example, nominate the TF POU2F1 as a potential regulator of excitatory neuron development.
Taken together, this study provides the largest single cell chromatin accessibility dataset published to date. This resource is a step towards the field’s long-term goal of establishing a deep, predictive understanding of gene regulation. The freely available data can be explored by tissue, cell type, locus, or motif using an interactive website (descartes.brotmanbaty.org).