New Publication: Conserved Facultative Heterochromatin Identifies Disease Regulatory Sequences

[Figure: graphical abstract]
We’re excited to share our latest research published in Nucleic Acids Research, led by Enakshi Sinniah from the Palpant Lab at the University of Queensland’s Institute for Molecular Bioscience. This work introduces a novel approach to identifying regulatory sequences critical for cell identity and disease by analysing conserved patterns of facultative heterochromatin across diverse cell types.

The study demonstrates that regions of the genome marked by H3K27me3 (a histone modification associated with gene silencing) in a cell-type-specific manner can reveal functionally important regulatory elements. By developing a “cellular constraint score” (CCS) that quantifies how consistently genomic regions are marked across 833 EpiMap biosamples, the team identified sequences under strong selective pressure that govern cell identity and harbor disease-causing variants.
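To give a rough feel for what "how consistently a region is marked across biosamples" can mean, here is a toy sketch. It is not the published CCS calculation, which is described in the paper itself. It only computes, for each position, the fraction of samples in which that position carries an H3K27me3 call. The function name `marking_fraction` and the small 0/1 matrix are hypothetical and chosen purely for illustration.

```python
from typing import List


def marking_fraction(marks: List[List[int]]) -> List[float]:
    """Return, for each position, the fraction of samples carrying the mark.

    marks[s][p] is 1 if sample s has an H3K27me3 call at position p, else 0.
    This is an illustrative stand-in, not the CCS formula from the paper.
    """
    if not marks:
        raise ValueError("need at least one sample")
    n_samples = len(marks)
    n_positions = len(marks[0])
    if any(len(row) != n_positions for row in marks):
        raise ValueError("all samples must cover the same positions")
    # Column-wise mean over samples
    return [sum(row[p] for row in marks) / n_samples for p in range(n_positions)]


# Hypothetical toy data: 4 samples (rows) x 5 genomic positions (columns)
toy_marks = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 0],
]

scores = marking_fraction(toy_marks)
print(scores)  # [1.0, 0.5, 0.0, 0.25, 0.75]
```

In this toy example the first position is marked in every sample and would count as the most consistently marked, while the third is never marked. The study applies the same general idea at genome scale across 833 EpiMap biosamples.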

Abstract

Identifying functional regulatory elements across the genome remains challenging, particularly for sequences governing cell-type-specific processes. Here we show that broad domains of facultative heterochromatin, marked by H3K27me3 and deposited by Polycomb Repressive Complex 2, are highly conserved across cell types and demarcate genomic regions critical for development, cell identity, and disease. We analyzed H3K27me3 across 833 human biosamples to develop cellular constraint scores (CCS) that quantify cross-cell-type H3K27me3 conservation at single-base resolution. Highly cell-constrained sequences are enriched for coding and noncoding elements governing cell fate, show evolutionary conservation across 48 mammalian species, harbor signatures of recent positive selection in humans, and overlap pathogenic variants linked to Mendelian and complex diseases. Cell-constrained long noncoding RNAs are strongly associated with developmental processes and disease. We experimentally validated one such lncRNA, RP11-175E9.1, as a key regulator of cardiac specification and differentiation. Further analyses revealed that cell-constrained variants improve fine-mapping of causal disease variants, demonstrate utility in predicting drug target success, and show superior performance to existing evolutionary constraint metrics in cancer prognosis models. This work establishes facultative heterochromatin as a powerful framework for annotating functional regulatory sequences and prioritizing disease variants across the human genome.

Key Findings

  • Cellular Constraint Score (CCS): A novel metric quantifying H3K27me3 conservation across 833 human biosamples at single-base resolution
  • Evolutionary Conservation: Cell-constrained regions are conserved across 48 mammalian species and harbor signatures of recent positive selection in humans
  • Disease Relevance: Strong enrichment for pathogenic ClinVar variants and improved fine-mapping of causal disease variants
  • lncRNA Discovery: Identification and validation of RP11-175E9.1 as a critical regulator of cardiac differentiation
  • Clinical Applications: Enhanced prediction of drug target success and cancer patient outcomes

Acknowledgments

Huge congratulations to Enakshi Sinniah (first author) for leading this comprehensive study, and to Nathan Palpant (senior author) for his guidance and leadership. This work exemplifies the power of integrative epigenomics in understanding genome function and disease.

Read the full open-access article: Nucleic Acids Research, Volume 53, Issue 20

Data and Code Availability: