
Genomic analyses implicate noncoding de novo variants in congenital heart disease
ABSTRACT A genetic etiology is identified for one-third of patients with congenital heart disease (CHD), with 8% of cases attributable to coding de novo variants (DNVs). To assess the
contribution of noncoding DNVs to CHD, we compared genome sequences from 749 CHD probands and their parents with those from 1,611 unaffected trios. Neural network prediction of noncoding DNV
transcriptional impact identified a burden of DNVs in individuals with CHD (_n_ = 2,238 DNVs) compared to controls (_n_ = 4,177; _P_ = 8.7 × 10⁻⁴). Independent analyses of enhancers showed
an excess of DNVs in associated genes (27 genes versus 3.7 expected, _P_ = 1 × 10⁻⁵). We observed significant overlap between these transcription-based approaches (odds ratio (OR) = 2.5, 95%
confidence interval (CI) 1.1–5.0, _P_ = 5.4 × 10⁻³). CHD DNVs altered transcription levels in 5 of 31 enhancers assayed. Finally, we observed a DNV burden in RNA-binding-protein regulatory
sites (OR = 1.13, 95% CI 1.1–1.2, _P_ = 8.8 × 10⁻⁵). Our findings demonstrate an enrichment of potentially disruptive regulatory noncoding DNVs in a fraction of CHD at least as high as that
observed for damaging coding DNVs.
DATA AVAILABILITY Whole-genome sequencing data are deposited in the database of Genotypes and Phenotypes (dbGaP) under accession numbers phs001194.v2.p2 and phs001138.v2.p2. CODE
AVAILABILITY Documentation, links, and availability of source code and select supplementary data are detailed at https://github.com/frichter/wgs_chd_analysis. The DNV identification pipeline
is available at https://github.com/ShenLab/igv-classifier and https://github.com/frichter/dnv_pipeline. The HeartENN algorithmic framework is available at
https://github.com/FunctionLab/selene/archive/0.4.8.tar.gz. HeartENN model weights and scripts for burden tests are available at https://github.com/frichter/wgs_chd_analysis. All source code
is distributed under the Massachusetts Institute of Technology license.
REFERENCES
* van der Linde, D. et al. Birth prevalence of congenital heart disease worldwide. _J. Am. Coll. Cardiol._ 58, 2241–2247 (2011).
* Pediatric Cardiac Genomics Consortium et al. The Congenital Heart Disease Genetic Network Study: rationale, design, and early results. _Circ. Res._ 112, 698–706 (2013).
* Zaidi, S. et al. De novo mutations in histone-modifying genes in congenital heart disease. _Nature_ 498, 220–223 (2013).
* Homsy, J. et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. _Science_ 350, 1262–1266 (2015).
* Jin, S. C. et al. Contribution of rare inherited and de novo variants in 2,871 congenital heart disease probands. _Nat. Genet._ 49, 1593–1601 (2017).
* Fischbach, G. D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. _Neuron_ 68, 192–195 (2010).
* Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. Preprint at https://arxiv.org/abs/1207.3907v2 (2012).
* Richter, F. et al. Whole genome de novo variant identification with FreeBayes and neural network approaches. Preprint at _bioRxiv_ https://doi.org/10.1101/2020.03.24.994160 (2020).
* Zhou, J. et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. _Nat. Genet._ 51, 973–980 (2019).
* An, J.-Y. et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. _Science_ 362, eaat6576 (2018).
* Jónsson, H. et al. Parental influence on human germline de novo mutations in 1,548 trios from Iceland. _Nature_ 549, 519–522 (2017).
* Goldmann, J. M. et al. Parent-of-origin-specific signatures of de novo mutations. _Nat. Genet._ 48, 935–939 (2016).
* Seiden, A. H. et al. Elucidation of de novo small insertion/deletion biology with parent-of-origin phasing. _Hum. Mutat._ 41, 800–806 (2020).
* Kent, W. J. et al. The human genome browser at UCSC. _Genome Res._ 12, 996–1006 (2002).
* Bernstein, B. E. et al. An integrated encyclopedia of DNA elements in the human genome. _Nature_ 489, 57–74 (2012).
* Mei, S. et al. Cistrome Data Browser: a data portal for ChIP–Seq and chromatin accessibility data in human and mouse. _Nucleic Acids Res._ 45, D658–D662 (2017).
* He, A. et al. Dynamic GATA4 enhancers shape the chromatin landscape central to heart development and disease. _Nat. Commun._ 5, 4907 (2014).
* Sayed, D., Yang, Z., He, M., Pfleger, J. M. & Abdellatif, M. Acute targeting of general transcription factor IIB restricts cardiac hypertrophy via selective inhibition of gene transcription. _Circ. Heart Fail._ 8, 138–148 (2015).
* Stefanovic, S. et al. GATA-dependent regulatory switches establish atrioventricular canal specificity during heart development. _Nat. Commun._ 5, 3680 (2014).
* Sayed, D., He, M., Yang, Z., Lin, L. & Abdellatif, M. Transcriptional regulation patterns revealed by high resolution chromatin immunoprecipitation during cardiac hypertrophy. _J. Biol. Chem._ 288, 2546–2558 (2013).
* Zhang, L. et al. KLF15 establishes the landscape of diurnal expression in the heart. _Cell Rep._ 13, 2368–2375 (2015).
* Anand, P. et al. BET bromodomains mediate transcriptional pause release in heart failure. _Cell_ 154, 569–582 (2013).
* Attanasio, C. et al. Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. _Genome Res._ 24, 920–929 (2014).
* Sakabe, N. J. et al. Dual transcriptional activator and repressor roles of TBX20 regulate adult cardiac structure and function. _Hum. Mol. Genet._ 21, 2194–2204 (2012).
* Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. _Nature_ 518, 317–330 (2015).
* May, D. et al. Large-scale discovery of enhancers from human heart tissue. _Nat. Genet._ 44, 89–93 (2012).
* Dickel, D. E. et al. Genome-wide compendium and functional assessment of in vivo heart enhancers. _Nat. Commun._ 7, 12923 (2016).
* Nord, A. S. et al. Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. _Cell_ 155, 1521–1531 (2013).
* Blow, M. J. et al. ChIP–Seq identification of weakly conserved heart enhancers. _Nat. Genet._ 42, 806–810 (2010).
* Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. _Nature_ 515, 355–364 (2014).
* Shen, Y. et al. A map of the _cis_-regulatory sequences in the mouse genome. _Nature_ 488, 116–120 (2012).
* van den Boogaard, M. et al. Genetic variation in T-box binding element functionally affects _SCN5A_/_SCN10A_ enhancer. _J. Clin. Invest._ 122, 2519–2530 (2012).
* Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. _Nat. Methods_ 12, 931–934 (2015).
* Huang, Y.-F., Gulko, B. & Siepel, A. Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. _Nat. Genet._ 49, 618–624 (2017).
* Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. _Nat. Genet._ 46, 310–315 (2014).
* Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. _Nucleic Acids Res._ 47, D886–D894 (2019).
* Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. _PLoS Comput. Biol._ 6, e1001025 (2010).
* Ritchie, G. R. S., Dunham, I., Zeggini, E. & Flicek, P. Functional annotation of noncoding sequence variants. _Nat. Methods_ 11, 294–296 (2014).
* Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. _Nature_ 536, 285–291 (2016).
* Melnikov, A., Zhang, X., Rogov, P., Wang, L. & Mikkelsen, T. S. Massively parallel reporter assays in cultured mammalian cells. _J. Vis. Exp._ https://doi.org/10.3791/51719 (2014).
* Werling, D. M. et al. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. _Nat. Genet._ 50, 727–736 (2018).
* Turner, T. N. et al. Genomic patterns of de novo mutation in simplex autism. _Cell_ 171, 710–722.e12 (2017).
* Yuen, R. K. C. et al. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder. _Nat. Neurosci._ 20, 602–611 (2017).
* Hamdan, F. F. et al. High rate of recurrent de novo mutations in developmental and epileptic encephalopathies. _Am. J. Hum. Genet._ 101, 664–685 (2017).
* Peacock, J. D., Lu, Y., Koch, M., Kadler, K. E. & Lincoln, J. Temporal and spatial expression of collagens during murine atrioventricular heart valve development and maintenance. _Dev. Dyn._ 237, 3051–3058 (2008).
* Kurosaka, S. et al. Arginylation regulates myofibrils to maintain heart function and prevent dilated cardiomyopathy. _J. Mol. Cell. Cardiol._ 53, 333–341 (2012).
* Kleffmann, W. et al. 5q31 microdeletions: definition of a critical region and analysis of _LRRTM2_, a candidate gene for intellectual disability. _Mol. Syndromol._ 3, 68–75 (2012).
* Mehta, G. et al. MITF interacts with the SWI/SNF subunit, BRG1, to promote GATA4 expression in cardiac hypertrophy. _J. Mol. Cell. Cardiol._ 88, 101–110 (2015).
* Tshori, S. et al. Transcription factor MITF regulates cardiac growth and hypertrophy. _J. Clin. Invest._ 116, 2673–2681 (2006).
* Nicholson, T. B. et al. A hypomorphic lsd1 allele results in heart development defects in mice. _PLoS One_ 8, e60913 (2013).
* Hamidi, T. et al. Identification of Rpl29 as a major substrate of the lysine methyltransferase Set7/9. _J. Biol. Chem._ 293, 12770–12780 (2018).
* Siggs, O. M. et al. Mutation of _Fnip1_ is associated with B-cell deficiency, cardiomyopathy, and elevated AMPK activity. _Proc. Natl Acad. Sci. USA_ 113, E3706–E3715 (2016).
* Chen, C.-Y. et al. Accumulation of the inner nuclear envelope protein Sun1 is pathogenic in progeric and dystrophic laminopathies. _Cell_ 149, 565–577 (2012).
* Meinke, P. et al. Muscular dystrophy-associated _SUN1_ and _SUN2_ variants disrupt nuclear-cytoskeletal connections and myonuclear organization. _PLoS Genet._ 10, e1004605 (2014).
* Röseler, S. et al. Lethal phenotype of mice carrying a _Sept11_ null mutation. _Biol. Chem._ 392, 779–781 (2011).
* Guo, A. et al. E–C coupling structural protein junctophilin-2 encodes a stress-adaptive transcription regulator. _Science_ 362, eaan3303 (2018).
* Yamagishi, H. et al. A history and interaction of outflow progenitor cells implicated in “Takao Syndrome.” In _Etiology and Morphogenesis of Congenital Heart Disease: From Gene Function and Cellular Interaction to Morphology_ (eds. Nakanishi, T. et al.) 201–209 (Springer, 2016).
* Masuda, T. & Taniguchi, M. Congenital diseases and semaphorin signaling: overview to date of the evidence linking them. _Congenit. Anom. (Kyoto)_ 55, 26–30 (2015).
* Pierpont, M. E. et al. Genetic basis for congenital heart disease: revisited: a scientific statement from the American Heart Association. _Circulation_ 138, e653–e711 (2018).
* Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. _Bioinformatics_ 25, 1754–1760 (2009).
* McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. _Genome Res._ 20, 1297–1303 (2010).
* DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. _Nat. Genet._ 43, 491–498 (2011).
* Van der Auwera, G. et al. From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline. _Curr. Protoc. Bioinformatics_ 43, 11.10.1–11.10.33 (2013).
* Kim, B.-Y., Park, J. H., Jo, H.-Y., Koo, S. K. & Park, M.-H. Optimized detection of insertions/deletions (INDELs) in whole-exome sequencing data. _PLoS One_ 12, e0182272 (2017).
* Bailey, J. A., Yavor, A. M., Massa, H. F., Trask, B. J. & Eichler, E. E. Segmental duplications: organization and impact within the current human genome project assembly. _Genome Res._ 11, 1005–1017 (2001).
* Derrien, T. et al. Fast computation and applications of genome mappability. _PLoS One_ 7, e30377 (2012).
* Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. _Bioinformatics_ 30, 2843–2851 (2014).
* Ostrander, B. E. P. et al. Whole-genome analysis for effective clinical diagnosis and gene discovery in early infantile epileptic encephalopathy. _NPJ Genom. Med._ 3, 22 (2018).
* Blake, J. A. et al. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. _Nucleic Acids Res._ 45, D723–D729 (2017).
* Chen, K. M., Cofer, E. M., Zhou, J. & Troyanskaya, O. G. Selene: a PyTorch-based deep learning library for sequence data. _Nat. Methods_ 16, 315–318 (2019).
* Price, A. L. et al. Pooled association tests for rare variants in exon-resequencing studies. _Am. J. Hum. Genet._ 86, 832–838 (2010).
* Lian, X. et al. Directed cardiomyocyte differentiation from human pluripotent stem cells by modulating Wnt/β-catenin signaling under fully defined conditions. _Nat. Protoc._ 8, 162–175 (2013).
* Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC–seq: a method for assaying chromatin accessibility genome-wide. _Curr. Protoc. Mol. Biol._ 109, 21.29.1–21.29.9 (2015).
* Corces, M. R. et al. An improved ATAC–seq protocol reduces background and enables interrogation of frozen tissues. _Nat. Methods_ 14, 959–962 (2017).
* Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime _cis_-regulatory elements required for macrophage and B cell identities. _Mol. Cell_ 38, 576–589 (2010).
* Yu, G., Wang, L.-G. & He, Q.-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. _Bioinformatics_ 31, 2382–2383 (2015).
* Spurrell, C. H. et al. Genome-wide fetalization of enhancer architecture in heart disease. Preprint at _bioRxiv_ https://doi.org/10.1101/591362 (2019).
* Sharma, A., Toepfer, C. N., Schmid, M., Garfinkel, A. C. & Seidman, C. E. Differentiation and contractile analysis of GFP-sarcomere reporter hiPSC-cardiomyocytes. _Curr. Protoc. Hum. Genet._ 96, 21.12.1–21.12.12 (2018).
* Shah, A., Qian, Y., Weyn-Vanhentenryck, S. M. & Zhang, C. CLIP Tool Kit (CTK): a flexible and robust pipeline to analyze CLIP sequencing data. _Bioinformatics_ 33, 566–567 (2017).
* Feng, H. et al. Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites. _Mol. Cell_ 74, 1189–1204.e6 (2019).
ACKNOWLEDGEMENTS We are enormously grateful to the patients and families who participated in this
research. We thank the following for patient recruitment: A. Julian, M. MacNeal, Y. Mendez, T. Mendiz-Ramdeen and C. Mintz (Icahn School of Medicine at Mount Sinai); N. Cross (Yale School of
Medicine); J. Ellashek and N. Tran (Children’s Hospital of Los Angeles); B. McDonough, J. Geva and M. Borensztein (Harvard Medical School); K. Flack, L. Panesar and N. Taylor (University
College London); E. Taillie (University of Rochester School of Medicine and Dentistry); S. Edman, J. Garbarini, J. Tusi and S. Woyciechowski (Children’s Hospital of Philadelphia); D. Awad,
C. Breton, K. Celia, C. Duarte, D. Etwaru, N. Fishman, E. Griffin, M. Kaspakoval, J. Kline, R. Korsin, A. Lanz, E. Marquez, D. Queen, A. Rodriguez, J. Rose, J. K. Sond, D. Warburton, A.
Wilpers and R. Yee (Columbia Medical School); D. Gruber (Cohen Children’s Medical Center, Northwell Health). These data were generated by the PCGC, under the auspices of the Bench to
Bassinet Program (https://benchtobassinet.com) of the NHLBI. The results analyzed and published here are based in part on data generated by Gabriella Miller Kids First Pediatric Research
Program projects phs001138.v1.p2/phs001194.v1.p2, and were accessed from the Kids First Data Resource Portal (https://kidsfirstdrc.org/) and/or dbGaP (www.ncbi.nlm.nih.gov/gap). This
manuscript was prepared in collaboration with investigators of the PCGC and has been reviewed and/or approved by the PCGC. PCGC investigators are listed at
https://benchtobassinet.com/?page_id=119. This work was supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of
Medicine at Mount Sinai. We are grateful to all of the families at the participating Simons Simplex Collection (SSC) sites, as well as the principal investigators (A. Beaudet, R. Bernier, J.
Constantino, E. Cook, E. Fombonne, D. Geschwind, R. Goin-Kochel, E. Hanson, D. Grice, A. Klin, D. Ledbetter, C. Lord, C. Martin, D. Martin, R. Maxim, J. Miles, O. Ousley, K. Pelphrey, B.
Peterson, J. Piggot, C. Saulnier, M. State, W. Stone, J. Sutcliffe, C. Walsh, Z. Warren and E. Wijsman). We appreciate the access obtained to phenotypic and/or genetic data on SFARI Base.
Approved researchers can obtain the SSC population dataset described in this study (https://www.sfari.org/resource/simons-simplex-collection) by applying at https://base.sfari.org. This work
was supported by the Mount Sinai Medical Scientist Training Program (5T32GM007280 to F.R.), National Institute of Dental and Craniofacial Research Interdisciplinary Training in Systems and
Developmental Biology and Birth Defects (T32HD075735 to F.R.), Harvard Medical School Epigenetic and Gene Dynamics Award (S.U.M. and C.E.S.), American Heart Association Post-Doctoral
Fellowship (S.U.M.), and Howard Hughes Medical Institute (C.E.S.). Research conducted at the E.O. Lawrence Berkeley National Laboratory was supported by National Institutes of Health (NIH)
grants (UM1HL098166 and R24HL123879) and performed under Department of Energy Contract DE-AC02-05CH11231, University of California. O.T. is a CIFAR fellow and this work was partially
supported by NIH grant R01GM071966. The PCGC program is funded by the NHLBI, NIH, US Department of Health and Human Services through grants UM1HL128711, UM1HL098162, UM1HL098147,
UM1HL098123, UM1HL128761 and U01HL131003. The PCGC Kids First study includes data sequenced by the Broad Institute (U24 HD090743-01). AUTHOR INFORMATION Author notes * These authors
contributed equally: Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen. * These authors jointly supervised this work: Deepak
Srivastava, Martin Tristani-Firouzi, Olga G. Troyanskaya, Diane E. Dickel, Yufeng Shen, Jonathan G. Seidman, Christine E. Seidman, Bruce D. Gelb.
AUTHORS AND AFFILIATIONS
* Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Felix Richter & Kathryn B. Manheimer
* Department of Pediatrics, Harvard Medical School, Boston, MA, USA: Sarah U. Morton
* Division of Newborn Medicine, Boston Children’s Hospital, Boston, MA, USA: Sarah U. Morton
* Department of Genetics, Harvard Medical School, Boston, MA, USA: Seong Won Kim, Lauren K. Wasson, Steven R. DePalma, Michael Parfenov, Jason Homsy, Joshua M. Gorham, Jonathan G. Seidman & Christine E. Seidman
* Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA: Alexander Kitaygorodsky, Hongjian Qi & Yufeng Shen
* Flatiron Institute, Simons Foundation, New York, NY, USA: Kathleen M. Chen, Jian Zhou & Olga G. Troyanskaya
* Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA: Jian Zhou & Olga G. Troyanskaya
* Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA: Jian Zhou
* Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Nihir Patel, Eric E. Schadt & Bruce D. Gelb
* Center for External Innovation, Takeda Pharmaceuticals USA, Cambridge, MA, USA: Jason Homsy
* Sema4, Stamford, CT, USA: Kathryn B. Manheimer & Eric E. Schadt
* Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah School of Medicine, Salt Lake City, UT, USA: Matthew Velinder, Andrew Farrell & Gabor Marth
* Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Eric E. Schadt
* Heart Development and Structural Diseases Branch, Division of Cardiovascular Sciences, NHLBI/NIH, Bethesda, MD, USA: Jonathan R. Kaltman
* Boston Children’s Hospital, Boston, MA, USA: Jane W. Newburger
* Cardiorespiratory Unit, Great Ormond Street Hospital, London, UK: Alessandro Giardini
* Division of Cardiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA: Elizabeth Goldmuntz
* Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA: Elizabeth Goldmuntz
* Departments of Pediatrics and Genetics, Yale University School of Medicine, New Haven, CT, USA: Martina Brueckner
* Children’s Hospital Los Angeles, Los Angeles, CA, USA: Richard Kim
* Department of Pediatrics, University of Rochester, Rochester, NY, USA: George A. Porter Jr.
* Department of Pediatrics, Stanford University, Palo Alto, CA, USA: Daniel Bernstein
* Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY, USA: Wendy K. Chung
* Gladstone Institute of Cardiovascular Disease and University of California San Francisco, San Francisco, CA, USA: Deepak Srivastava
* Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA: Martin Tristani-Firouzi
* Department of Computer Science, Princeton University, Princeton, NJ, USA: Olga G. Troyanskaya
* Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, USA: Diane E. Dickel
* Department of Cardiology, Brigham and Women’s Hospital, Boston, MA, USA: Christine E. Seidman
* Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Bruce D. Gelb
* Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Bruce D. Gelb
Authors: Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen, Jian Zhou, Hongjian Qi, Nihir Patel, Steven R. DePalma, Michael Parfenov, Jason Homsy, Joshua M. Gorham, Kathryn B. Manheimer, Matthew Velinder, Andrew Farrell, Gabor Marth, Eric E. Schadt, Jonathan R. Kaltman, Jane W. Newburger, Alessandro Giardini, Elizabeth Goldmuntz, Martina Brueckner, Richard Kim, George A. Porter Jr., Daniel Bernstein, Wendy K. Chung, Deepak Srivastava, Martin Tristani-Firouzi, Olga G. Troyanskaya, Diane E. Dickel, Yufeng Shen, Jonathan G. Seidman, Christine E. Seidman, Bruce D. Gelb.
CONTRIBUTIONS F.R., S.U.M., S.W.K.,
A.K., L.K.W., K.M.C., J.R.K., O.G.T., D.E.D., Y.S., J.G.S., C.E.S. and B.D.G. conceived and designed the experiments/analyses. J.R.K., J.W.N., A.G., E.G., M.B., R.K., G.A.P., D.B., W.K.C.,
D.S., M.T.-F., J.G.S., C.E.S. and B.D.G. contributed to cohort ascertainment, phenotypic characterization and recruitment. F.R., S.U.M., A.K., H.Q., N.P., S.R.D., M.P., J.H., J.M.G., K.B.M.,
M.V., A.F., G.M., W.K.C., Y.S., J.G.S., C.E.S. and B.D.G. contributed to whole-genome sequencing production, validation and analysis. F.R., S.U.M., A.K., K.M.C., H.Q., E.E.S., O.G.T., Y.S.,
J.G.S., C.E.S. and B.D.G. contributed to statistical analyses. F.R., K.M.C., J.Z., O.G.T. and B.D.G. developed the HeartENN model. S.U.M., S.W.K., L.K.W., D.E.D., J.G.S. and C.E.S.
generated and analyzed fetal heart and iPSC data. F.R., S.U.M., S.W.K., A.K., L.K.W., K.M.C., Y.S., J.G.S., C.E.S. and B.D.G. wrote and reviewed the manuscript. All authors read and approved
the manuscript. CORRESPONDING AUTHOR Correspondence to Bruce D. Gelb. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PUBLISHER’S
NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. EXTENDED DATA EXTENDED DATA FIG. 1 OTHER PIPELINES IDENTIFIED 94%
OF DNVS IN CONTROL TRIOS. Overlaps with DNVs identified in 1,470 control trios with two other pipelines (refs. 9, 10). Of note, a third analysis of these trios did not include _de novo_ calls (ref. 42). For
consistency with other pipelines, only SNVs were included and variants in LCRs, blacklists, segmental duplications, and repeats were excluded. Together, 94% of _de novo_ SNVs were called by
at least one other pipeline. EXTENDED DATA FIG. 2 CORRELATION BETWEEN PARENTAL AGE AT PROBAND BIRTH AND DNVS/TRIO. Multiple linear regression ($\beta_{\text{paternal\_age}}\,x + \beta_{\text{maternal\_age}}\,x + \beta_{\text{intercept}} + \varepsilon$)
was fitted on 763 CHD and 1,611 unaffected individuals to calculate the associations of paternal and maternal age with the number of SNVs, indels, and both combined. Regression
coefficients and _P_-values are shown, uncorrected for multiple hypotheses. Sequencing metric comparisons between the centers, colored by cases (_n_ = 763) and controls (_n_ = 1,611), found
moderate bias in DNV quantity, so the background statistical parameter throughout the manuscript is total number of DNVs. Box plots show medians and interquartile ranges. EXTENDED DATA FIG.
3 _DE NOVO_ VARIANT (DNV) CHD-UNAFFECTED BURDEN. The number of DNVs in 184 noncoding annotations (points) genome-wide and within 10 kb of TSSs for 6 gene sets (facets) was counted in CHD
(_n_ = 749) and Simons unaffected (_n_ = 1,611) individuals. The _P_ value threshold (1.5 × 10−4, horizontal blue line) is 0.05 divided by the product of the number of effective annotations
(_n_ = 47) and number of gene sets (_n_ = 7). The _P_ value (_y_-axis) was calculated with a two-sided Fisher’s exact test; the odds ratio (_x_-axis) was $\text{DNVs}_{\text{annotation,CHD}}/\text{DNVs}_{\text{total,CHD}}$ vs.
$\text{DNVs}_{\text{annotation,unaffected}}/\text{DNVs}_{\text{total,unaffected}}$. No annotations surpassed the _P_ value threshold. CHD, congenital heart disease; HHE, high heart expression. EXTENDED DATA FIG. 4 HEARTENN
PERFORMANCE WAS COMPARABLE TO DEEPSEA. HeartENN ROC AUC mean = 0.93 and AUPRC mean = 0.34. ROC AUC, receiver operating characteristic area under the curve; AUPRC, area under the
precision-recall curve. EXTENDED DATA FIG. 5 DETERMINING AN ABSOLUTE FUNCTIONAL DIFFERENCE SCORE RANGE. A, Comparison of HGMD disease mutations (blue, _n_ = 1,564) and polymorphism (gray, _n_ = 642)
DeepSEA absolute functional difference scores at varying functional cut-offs illustrates a similar distribution and functionally impactful range ≥0.1 (arrow) for disease mutations. No
statistical significance testing was performed. B, The similarity of null distributions for DeepSEA (gray, downsampled to 184 features) and HeartENN (red) HGMD polymorphism scores
suggested that the DeepSEA functional score range was also applicable to HeartENN (gray and red _n_ = 642). Scores of 0 are offset to the left (plotted as 10−4). EXTENDED DATA FIG. 6 SUPPORT FOR HEARTENN ≥
0.1 FUNCTIONAL RANKING. For all DNVs (_n_ = 170,171), overlap between HeartENN ≥0.1 (_n_ = 6,415) and other noncoding scores was assessed with a two-sided Fisher’s exact test (left panel).
Case–control burden for these other noncoding scores (right panel) was statistically significant for CADD ≥15 (Bonferroni-adjusted _P_ = 0.019) with a two-sided Fisher’s exact test (cases _n_ =
56,164 and controls _n_ = 114,065). For both panels, unadjusted _P_-values are tabulated, and red indicates a Benjamini-Hochberg false discovery rate (FDR) < 0.05.
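
The two statistical steps named in this caption, a two-sided Fisher's exact test followed by Benjamini-Hochberg adjustment, can be sketched in plain Python as below. This is an illustrative sketch only, not the authors' analysis code: the function names are hypothetical, and the 2×2 table in the usage lines is assembled from counts quoted in the abstract and in the Extended Data Fig. 6 and Fig. 10 captions (2,238 CHD and 4,177 control DNVs with HeartENN ≥ 0.1; 56,164 CHD DNVs in total; 109,888 control DNVs with HeartENN < 0.1), on the assumption that these counts belong to the same comparison.

```python
from math import exp, lgamma


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def log_hypergeom_pmf(a, row1, row2, col1):
    """Log-probability of a 2x2 table with top-left cell `a` and fixed margins."""
    return (log_choose(row1, a) + log_choose(row2, col1 - a)
            - log_choose(row1 + row2, col1))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, P value). The P value sums the probabilities
    of all tables that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    log_obs = log_hypergeom_pmf(a, row1, row2, col1)
    p_value = 0.0
    for k in range(lowest, highest + 1):
        log_p = log_hypergeom_pmf(k, row1, row2, col1)
        if log_p <= log_obs + 1e-7:  # small tolerance for floating-point ties
            p_value += exp(log_p)
    odds_ratio = (a * d) / (b * c) if b and c else float("inf")
    return odds_ratio, min(1.0, p_value)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Usage with counts quoted in this article (assumed to form one 2x2 table).
chd_high, chd_total = 2238, 56164      # CHD DNVs with HeartENN >= 0.1, all CHD DNVs
ctrl_high, ctrl_low = 4177, 109888     # control DNVs with HeartENN >= 0.1 and < 0.1
odds_ratio, p_value = fisher_exact_two_sided(
    chd_high, chd_total - chd_high, ctrl_high, ctrl_low)
print(f"odds ratio = {odds_ratio:.3f}, two-sided P = {p_value:.2e}")
```

Working in log space with `lgamma` avoids overflow in the binomial coefficients at these sample sizes. In practice a statistics library would normally be used for both steps; the hand-rolled versions are shown only to make the computation explicit.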
EXTENDED DATA FIG. 7 RELATIONSHIP BETWEEN SEQUENCE LENGTH INSERTED INTO THE PMPRA1 PLASMID AND THE TRANSCRIPT READS/PLASMID COPIES IN MPRAS. The length of the sequences inserted into the
pMPRA1 plasmid (_x_-axis) ranged from 300 to 1,600 bp. After transfection of four libraries (color coded as per key) into the iPSC–CMs, the resulting ratios of transcript reads (mRNA) per
plasmid copies (DNA) are graphed on the _y_-axis, showing no systematic relationship between insert length and transcriptional level. EXTENDED DATA FIG. 8 DNVS WITH A TREND TOWARDS DECREASED
EXPRESSION BY MPRA ASSAY. Box plots for two DNVs for which two MPRA replicates were significantly different but overall statistical significance across all replicates was not attained.
Box plots show the median fold change (FC), first and third quartiles (lower and upper hinges), and range of values (whiskers and outlying points). Statistical significance was assessed with
two-sided _t_-tests and Benjamini-Hochberg-adjusted _P_-values. Each box plot has at least 3 independent experiments with 4 technical replicates each. EXTENDED DATA FIG. 9 FRACTION OF DNVS IN EACH
OF THE CANONICAL VARIANT CLASSES. The fraction was calculated separately within CHD and unaffected subjects for each of the three methods (including overlaps); the total number of
variants in each group is also given (right table). EXTENDED DATA FIG. 10 DNV ENRICHMENT IN PHENOTYPE SUBGROUPS. A, Enrichment of DNVs with predicted functional impacts (score ≥0.1) for HeartENN (left)
and DeepSEA (right) within phenotype subgroups. B, Enrichment of _de novo_ SNVs with H3K36me3 marks implicated in RNA-binding protein disruption in different subgroups for the most
significant (left) and highest effect size (right) hits. Analyses in both A and B were performed with a two-sided Fisher’s exact test (unadjusted _P_-values and 95% CIs shown) comparing the fraction
of DNVs in each subgroup (HeartENN ≥ 0.1, DeepSEA ≥ 0.1, etc.) to the same control cohort. For HeartENN, there were _n_ = 4,177 control DNVs with HeartENN ≥ 0.1 and _n_ = 109,888 control
DNVs with HeartENN < 0.1. NDD, neurodevelopmental disorder; ECA, extracardiac anomaly. SUPPLEMENTARY INFORMATION Supplementary Note and Fig. 1 REPORTING SUMMARY
SUPPLEMENTARY TABLE Supplementary Tables 1–16 ABOUT THIS ARTICLE CITE THIS ARTICLE Richter, F., Morton, S.U., Kim, S.W. _et al._ Genomic
analyses implicate noncoding de novo variants in congenital heart disease. _Nat Genet_ 52, 769–777 (2020). https://doi.org/10.1038/s41588-020-0652-z * Received: 09 March
2019 * Accepted: 22 May 2020 * Published: 29 June 2020 * Issue Date: August 2020