Genomic analyses implicate noncoding de novo variants in congenital heart disease


ABSTRACT A genetic etiology is identified for one-third of patients with congenital heart disease (CHD), with 8% of cases attributable to coding de novo variants (DNVs). To assess the


contribution of noncoding DNVs to CHD, we compared genome sequences from 749 CHD probands and their parents with those from 1,611 unaffected trios. Neural network prediction of noncoding DNV


transcriptional impact identified a burden of DNVs in individuals with CHD (_n_ = 2,238 DNVs) compared to controls (_n_ = 4,177; _P_ = 8.7 × 10−4). Independent analyses of enhancers showed


an excess of DNVs in associated genes (27 genes versus 3.7 expected, _P_ = 1 × 10−5). We observed significant overlap between these transcription-based approaches (odds ratio (OR) = 2.5, 95%


confidence interval (CI) 1.1–5.0, _P_ = 5.4 × 10−3). CHD DNVs altered transcription levels in 5 of 31 enhancers assayed. Finally, we observed a DNV burden in RNA-binding-protein regulatory


sites (OR = 1.13, 95% CI 1.1–1.2, _P_ = 8.8 × 10−5). Our findings demonstrate an enrichment of potentially disruptive regulatory noncoding DNVs in a fraction of CHD at least as high as that


observed for damaging coding DNVs.

DATA AVAILABILITY Whole-genome sequencing data are deposited in the database of Genotypes and Phenotypes (dbGaP) under accession numbers phs001194.v2.p2 and phs001138.v2.p2. CODE


AVAILABILITY Documentation, links, and availability of source code and select supplementary data are detailed at https://github.com/frichter/wgs_chd_analysis. The DNV identification pipeline


is available at https://github.com/ShenLab/igv-classifier and https://github.com/frichter/dnv_pipeline. The HeartENN algorithmic framework is available at


https://github.com/FunctionLab/selene/archive/0.4.8.tar.gz. HeartENN model weights and scripts for burden tests are available at https://github.com/frichter/wgs_chd_analysis. All source code


is distributed under the Massachusetts Institute of Technology license.

REFERENCES

* van der Linde, D. et al. Birth prevalence of congenital heart disease worldwide. _J. Am. Coll. Cardiol._ 58, 2241–2247 (2011).
* Pediatric Cardiac Genomics Consortium et al. The Congenital Heart Disease Genetic Network Study: rationale, design, and early results. _Circ. Res._ 112, 698–706 (2013).
* Zaidi, S. et al. De novo mutations in histone-modifying genes in congenital heart disease. _Nature_ 498, 220–223 (2013).
* Homsy, J. et al. De novo mutations in congenital heart disease with neurodevelopmental and other congenital anomalies. _Science_ 350, 1262–1266 (2015).
* Jin, S. C. et al. Contribution of rare inherited and de novo variants in 2,871 congenital heart disease probands. _Nat. Genet._ 49, 1593–1601 (2017).
* Fischbach, G. D. & Lord, C. The Simons Simplex Collection: a resource for identification of autism genetic risk factors. _Neuron_ 68, 192–195 (2010).
* Garrison, E. & Marth, G. Haplotype-based variant detection from short-read sequencing. Preprint at https://arxiv.org/abs/1207.3907v2 (2012).
* Richter, F. et al. Whole genome de novo variant identification with FreeBayes and neural network approaches. Preprint at _bioRxiv_ https://doi.org/10.1101/2020.03.24.994160 (2020).
* Zhou, J. et al. Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. _Nat. Genet._ 51, 973–980 (2019).
* An, J.-Y. et al. Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. _Science_ 362, eaat6576 (2018).
* Jónsson, H. et al. Parental influence on human germline de novo mutations in 1,548 trios from Iceland. _Nature_ 549, 519–522 (2017).
* Goldmann, J. M. et al. Parent-of-origin-specific signatures of de novo mutations. _Nat. Genet._ 48, 935–939 (2016).
* Seiden, A. H. et al. Elucidation of de novo small insertion/deletion biology with parent-of-origin phasing. _Hum. Mutat._ 41, 800–806 (2020).
* Kent, W. J. et al. The human genome browser at UCSC. _Genome Res._ 12, 996–1006 (2002).
* Bernstein, B. E. et al. An integrated encyclopedia of DNA elements in the human genome. _Nature_ 489, 57–74 (2012).
* Mei, S. et al. Cistrome Data Browser: a data portal for ChIP–Seq and chromatin accessibility data in human and mouse. _Nucleic Acids Res._ 45, D658–D662 (2017).
* He, A. et al. Dynamic GATA4 enhancers shape the chromatin landscape central to heart development and disease. _Nat. Commun._ 5, 4907 (2014).
* Sayed, D., Yang, Z., He, M., Pfleger, J. M. & Abdellatif, M. Acute targeting of general transcription factor IIB restricts cardiac hypertrophy via selective inhibition of gene transcription. _Circ. Heart Fail._ 8, 138–148 (2015).
* Stefanovic, S. et al. GATA-dependent regulatory switches establish atrioventricular canal specificity during heart development. _Nat. Commun._ 5, 3680 (2014).
* Sayed, D., He, M., Yang, Z., Lin, L. & Abdellatif, M. Transcriptional regulation patterns revealed by high resolution chromatin immunoprecipitation during cardiac hypertrophy. _J. Biol. Chem._ 288, 2546–2558 (2013).
* Zhang, L. et al. KLF15 establishes the landscape of diurnal expression in the heart. _Cell Rep._ 13, 2368–2375 (2015).
* Anand, P. et al. BET bromodomains mediate transcriptional pause release in heart failure. _Cell_ 154, 569–582 (2013).
* Attanasio, C. et al. Tissue-specific SMARCA4 binding at active and repressed regulatory elements during embryogenesis. _Genome Res._ 24, 920–929 (2014).
* Sakabe, N. J. et al. Dual transcriptional activator and repressor roles of TBX20 regulate adult cardiac structure and function. _Hum. Mol. Genet._ 21, 2194–2204 (2012).
* Roadmap Epigenomics Consortium et al. Integrative analysis of 111 reference human epigenomes. _Nature_ 518, 317–330 (2015).
* May, D. et al. Large-scale discovery of enhancers from human heart tissue. _Nat. Genet._ 44, 89–93 (2012).
* Dickel, D. E. et al. Genome-wide compendium and functional assessment of in vivo heart enhancers. _Nat. Commun._ 7, 12923 (2016).
* Nord, A. S. et al. Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. _Cell_ 155, 1521–1531 (2013).
* Blow, M. J. et al. ChIP–Seq identification of weakly conserved heart enhancers. _Nat. Genet._ 42, 806–810 (2010).
* Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. _Nature_ 515, 355–364 (2014).
* Shen, Y. et al. A map of the _cis_-regulatory sequences in the mouse genome. _Nature_ 488, 116–120 (2012).
* van den Boogaard, M. et al. Genetic variation in T-box binding element functionally affects _SCN5A_/_SCN10A_ enhancer. _J. Clin. Invest._ 122, 2519–2530 (2012).
* Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning–based sequence model. _Nat. Methods_ 12, 931–934 (2015).
* Huang, Y.-F., Gulko, B. & Siepel, A. Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. _Nat. Genet._ 49, 618–624 (2017).
* Kircher, M. et al. A general framework for estimating the relative pathogenicity of human genetic variants. _Nat. Genet._ 46, 310–315 (2014).
* Rentzsch, P., Witten, D., Cooper, G. M., Shendure, J. & Kircher, M. CADD: predicting the deleteriousness of variants throughout the human genome. _Nucleic Acids Res._ 47, D886–D894 (2019).
* Davydov, E. V. et al. Identifying a high fraction of the human genome to be under selective constraint using GERP++. _PLoS Comput. Biol._ 6, e1001025 (2010).
* Ritchie, G. R. S., Dunham, I., Zeggini, E. & Flicek, P. Functional annotation of noncoding sequence variants. _Nat. Methods_ 11, 294–296 (2014).
* Lek, M. et al. Analysis of protein-coding genetic variation in 60,706 humans. _Nature_ 536, 285–291 (2016).
* Melnikov, A., Zhang, X., Rogov, P., Wang, L. & Mikkelsen, T. S. Massively parallel reporter assays in cultured mammalian cells. _J. Vis. Exp._ https://doi.org/10.3791/51719 (2014).
* Werling, D. M. et al. An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. _Nat. Genet._ 50, 727–736 (2018).
* Turner, T. N. et al. Genomic patterns of de novo mutation in simplex autism. _Cell_ 171, 710–722.e12 (2017).
* Yuen, R. K. C. et al. Whole genome sequencing resource identifies 18 new candidate genes for autism spectrum disorder. _Nat. Neurosci._ 20, 602–611 (2017).
* Hamdan, F. F. et al. High rate of recurrent de novo mutations in developmental and epileptic encephalopathies. _Am. J. Hum. Genet._ 101, 664–685 (2017).
* Peacock, J. D., Lu, Y., Koch, M., Kadler, K. E. & Lincoln, J. Temporal and spatial expression of collagens during murine atrioventricular heart valve development and maintenance. _Dev. Dyn._ 237, 3051–3058 (2008).
* Kurosaka, S. et al. Arginylation regulates myofibrils to maintain heart function and prevent dilated cardiomyopathy. _J. Mol. Cell. Cardiol._ 53, 333–341 (2012).
* Kleffmann, W. et al. 5q31 microdeletions: definition of a critical region and analysis of _LRRTM2_, a candidate gene for intellectual disability. _Mol. Syndromol._ 3, 68–75 (2012).
* Mehta, G. et al. MITF interacts with the SWI/SNF subunit, BRG1, to promote GATA4 expression in cardiac hypertrophy. _J. Mol. Cell. Cardiol._ 88, 101–110 (2015).
* Tshori, S. et al. Transcription factor MITF regulates cardiac growth and hypertrophy. _J. Clin. Invest._ 116, 2673–2681 (2006).
* Nicholson, T. B. et al. A hypomorphic lsd1 allele results in heart development defects in mice. _PLoS One_ 8, e60913 (2013).
* Hamidi, T. et al. Identification of Rpl29 as a major substrate of the lysine methyltransferase Set7/9. _J. Biol. Chem._ 293, 12770–12780 (2018).
* Siggs, O. M. et al. Mutation of _Fnip1_ is associated with B-cell deficiency, cardiomyopathy, and elevated AMPK activity. _Proc. Natl Acad. Sci. USA_ 113, E3706–E3715 (2016).
* Chen, C.-Y. et al. Accumulation of the inner nuclear envelope protein Sun1 is pathogenic in progeric and dystrophic laminopathies. _Cell_ 149, 565–577 (2012).
* Meinke, P. et al. Muscular dystrophy-associated _SUN1_ and _SUN2_ variants disrupt nuclear-cytoskeletal connections and myonuclear organization. _PLoS Genet._ 10, e1004605 (2014).
* Röseler, S. et al. Lethal phenotype of mice carrying a _Sept11_ null mutation. _Biol. Chem._ 392, 779–781 (2011).
* Guo, A. et al. E–C coupling structural protein junctophilin-2 encodes a stress-adaptive transcription regulator. _Science_ 362, eaan3303 (2018).
* Yamagishi, H. et al. A history and interaction of outflow progenitor cells implicated in “Takao Syndrome.” In _Etiology and Morphogenesis of Congenital Heart Disease: From Gene Function and Cellular Interaction to Morphology_ (eds. Nakanishi, T. et al.) 201–209 (Springer, 2016).
* Masuda, T. & Taniguchi, M. Congenital diseases and semaphorin signaling: overview to date of the evidence linking them. _Congenit. Anom. (Kyoto)_ 55, 26–30 (2015).
* Pierpont, M. E. et al. Genetic basis for congenital heart disease: revisited: a scientific statement from the American Heart Association. _Circulation_ 138, e653–e711 (2018).
* Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. _Bioinformatics_ 25, 1754–1760 (2009).
* McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. _Genome Res._ 20, 1297–1303 (2010).
* DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. _Nat. Genet._ 43, 491–498 (2011).
* Van der Auwera, G. et al. From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline. _Curr. Protoc. Bioinformatics_ 43, 11.10.1–11.10.33 (2013).
* Kim, B.-Y., Park, J. H., Jo, H.-Y., Koo, S. K. & Park, M.-H. Optimized detection of insertions/deletions (INDELs) in whole-exome sequencing data. _PLoS One_ 12, e0182272 (2017).
* Bailey, J. A., Yavor, A. M., Massa, H. F., Trask, B. J. & Eichler, E. E. Segmental duplications: organization and impact within the current human genome project assembly. _Genome Res._ 11, 1005–1017 (2001).
* Derrien, T. et al. Fast computation and applications of genome mappability. _PLoS One_ 7, e30377 (2012).
* Li, H. Toward better understanding of artifacts in variant calling from high-coverage samples. _Bioinformatics_ 30, 2843–2851 (2014).
* Ostrander, B. E. P. et al. Whole-genome analysis for effective clinical diagnosis and gene discovery in early infantile epileptic encephalopathy. _NPJ Genom. Med._ 3, 22 (2018).
* Blake, J. A. et al. Mouse Genome Database (MGD)-2017: community knowledge resource for the laboratory mouse. _Nucleic Acids Res._ 45, D723–D729 (2017).
* Chen, K. M., Cofer, E. M., Zhou, J. & Troyanskaya, O. G. Selene: a PyTorch-based deep learning library for sequence data. _Nat. Methods_ 16, 315–318 (2019).
* Price, A. L. et al. Pooled association tests for rare variants in exon-resequencing studies. _Am. J. Hum. Genet._ 86, 832–838 (2010).
* Lian, X. et al. Directed cardiomyocyte differentiation from human pluripotent stem cells by modulating Wnt/β-catenin signaling under fully defined conditions. _Nat. Protoc._ 8, 162–175 (2013).
* Buenrostro, J. D., Wu, B., Chang, H. Y. & Greenleaf, W. J. ATAC–seq: a method for assaying chromatin accessibility genome-wide. _Curr. Protoc. Mol. Biol._ 109, 21.29.1–21.29.9 (2015).
* Corces, M. R. et al. An improved ATAC–seq protocol reduces background and enables interrogation of frozen tissues. _Nat. Methods_ 14, 959–962 (2017).
* Heinz, S. et al. Simple combinations of lineage-determining transcription factors prime _cis_-regulatory elements required for macrophage and B cell identities. _Mol. Cell_ 38, 576–589 (2010).
* Yu, G., Wang, L.-G. & He, Q.-Y. ChIPseeker: an R/Bioconductor package for ChIP peak annotation, comparison and visualization. _Bioinformatics_ 31, 2382–2383 (2015).
* Spurrell, C. H. et al. Genome-wide fetalization of enhancer architecture in heart disease. Preprint at _bioRxiv_ https://doi.org/10.1101/591362 (2019).
* Sharma, A., Toepfer, C. N., Schmid, M., Garfinkel, A. C. & Seidman, C. E. Differentiation and contractile analysis of GFP-sarcomere reporter hiPSC-cardiomyocytes. _Curr. Protoc. Hum. Genet._ 96, 21.12.1–21.12.12 (2018).
* Shah, A., Qian, Y., Weyn-Vanhentenryck, S. M. & Zhang, C. CLIP Tool Kit (CTK): a flexible and robust pipeline to analyze CLIP sequencing data. _Bioinformatics_ 33, 566–567 (2017).
* Feng, H. et al. Modeling RNA-binding protein specificity in vivo by precisely registering protein-RNA crosslink sites. _Mol. Cell_ 74, 1189–1204.e6 (2019).

ACKNOWLEDGEMENTS We are enormously grateful to the patients and families who participated in this


research. We thank the following for patient recruitment: A. Julian, M. MacNeal, Y. Mendez, T. Mendiz-Ramdeen and C. Mintz (Icahn School of Medicine at Mount Sinai); N. Cross (Yale School of


Medicine); J. Ellashek and N. Tran (Children’s Hospital of Los Angeles); B. McDonough, J. Geva and M. Borensztein (Harvard Medical School); K. Flack, L. Panesar and N. Taylor (University


College London); E. Taillie (University of Rochester School of Medicine and Dentistry); S. Edman, J. Garbarini, J. Tusi and S. Woyciechowski (Children’s Hospital of Philadelphia); D. Awad,


C. Breton, K. Celia, C. Duarte, D. Etwaru, N. Fishman, E. Griffin, M. Kaspakoval, J. Kline, R. Korsin, A. Lanz, E. Marquez, D. Queen, A. Rodriguez, J. Rose, J. K. Sond, D. Warburton, A.


Wilpers and R. Yee (Columbia Medical School); D. Gruber (Cohen Children’s Medical Center, Northwell Health). These data were generated by the PCGC, under the auspices of the Bench to


Bassinet Program (https://benchtobassinet.com) of the NHLBI. The results analyzed and published here are based in part on data generated by Gabriella Miller Kids First Pediatric Research


Program projects phs001138.v1.p2/phs001194.v1.p2, and were accessed from the Kids First Data Resource Portal (https://kidsfirstdrc.org/) and/or dbGaP (www.ncbi.nlm.nih.gov/gap). This


manuscript was prepared in collaboration with investigators of the PCGC and has been reviewed and/or approved by the PCGC. PCGC investigators are listed at


https://benchtobassinet.com/?page_id=119. This work was supported in part through the computational resources and staff expertise provided by Scientific Computing at the Icahn School of


Medicine at Mount Sinai. We are grateful to all of the families at the participating Simons Simplex Collection (SSC) sites, as well as the principal investigators (A. Beaudet, R. Bernier, J.


Constantino, E. Cook, E. Fombonne, D. Geschwind, R. Goin-Kochel, E. Hanson, D. Grice, A. Klin, D. Ledbetter, C. Lord, C. Martin, D. Martin, R. Maxim, J. Miles, O. Ousley, K. Pelphrey, B.


Peterson, J. Piggot, C. Saulnier, M. State, W. Stone, J. Sutcliffe, C. Walsh, Z. Warren and E. Wijsman). We appreciate the access obtained to phenotypic and/or genetic data on SFARI Base.


Approved researchers can obtain the SSC population dataset described in this study (https://www.sfari.org/resource/simons-simplex-collection) by applying at https://base.sfari.org. This work


was supported by the Mount Sinai Medical Scientist Training Program (5T32GM007280 to F.R.), National Institute of Dental and Craniofacial Research Interdisciplinary Training in Systems and


Developmental Biology and Birth Defects (T32HD075735 to F.R.), Harvard Medical School Epigenetic and Gene Dynamics Award (S.U.M. and C.E.S.), American Heart Association Post-Doctoral


Fellowship (S.U.M.), and Howard Hughes Medical Institute (C.E.S.). Research conducted at the E.O. Lawrence Berkeley National Laboratory was supported by National Institutes of Health (NIH)


grants (UM1HL098166 and R24HL123879) and performed under Department of Energy Contract DE-AC02-05CH11231, University of California. O.T. is a CIFAR fellow and this work was partially


supported by NIH grant R01GM071966. The PCGC program is funded by the NHLBI, NIH, US Department of Health and Human Services through grants UM1HL128711, UM1HL098162, UM1HL098147,


UM1HL098123, UM1HL128761 and U01HL131003. The PCGC Kids First study includes data sequenced by the Broad Institute (U24 HD090743-01). AUTHOR INFORMATION Author notes * These authors


contributed equally: Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen. * These authors jointly supervised this work: Deepak


Srivastava, Martin Tristani-Firouzi, Olga G. Troyanskaya, Diane E. Dickel, Yufeng Shen, Jonathan G. Seidman, Christine E. Seidman, Bruce D. Gelb.

AUTHORS AND AFFILIATIONS

* Graduate School of Biomedical Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Felix Richter & Kathryn B. Manheimer
* Department of Pediatrics, Harvard Medical School, Boston, MA, USA: Sarah U. Morton
* Division of Newborn Medicine, Boston Children’s Hospital, Boston, MA, USA: Sarah U. Morton
* Department of Genetics, Harvard Medical School, Boston, MA, USA: Seong Won Kim, Lauren K. Wasson, Steven R. DePalma, Michael Parfenov, Jason Homsy, Joshua M. Gorham, Jonathan G. Seidman & Christine E. Seidman
* Departments of Systems Biology and Biomedical Informatics, Columbia University, New York, NY, USA: Alexander Kitaygorodsky, Hongjian Qi & Yufeng Shen
* Flatiron Institute, Simons Foundation, New York, NY, USA: Kathleen M. Chen, Jian Zhou & Olga G. Troyanskaya
* Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA: Jian Zhou & Olga G. Troyanskaya
* Lyda Hill Department of Bioinformatics, University of Texas Southwestern Medical Center, Dallas, TX, USA: Jian Zhou
* Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Nihir Patel, Eric E. Schadt & Bruce D. Gelb
* Center for External Innovation, Takeda Pharmaceuticals USA, Cambridge, MA, USA: Jason Homsy
* Sema4, Stamford, CT, USA: Kathryn B. Manheimer & Eric E. Schadt
* Department of Human Genetics, Utah Center for Genetic Discovery, University of Utah School of Medicine, Salt Lake City, UT, USA: Matthew Velinder, Andrew Farrell & Gabor Marth
* Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Eric E. Schadt
* Heart Development and Structural Diseases Branch, Division of Cardiovascular Sciences, NHLBI/NIH, Bethesda, MD, USA: Jonathan R. Kaltman
* Boston Children’s Hospital, Boston, MA, USA: Jane W. Newburger
* Cardiorespiratory Unit, Great Ormond Street Hospital, London, UK: Alessandro Giardini
* Division of Cardiology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA: Elizabeth Goldmuntz
* Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA: Elizabeth Goldmuntz
* Departments of Pediatrics and Genetics, Yale University School of Medicine, New Haven, CT, USA: Martina Brueckner
* Children’s Hospital Los Angeles, Los Angeles, CA, USA: Richard Kim
* Department of Pediatrics, University of Rochester, Rochester, NY, USA: George A. Porter Jr.
* Department of Pediatrics, Stanford University, Palo Alto, CA, USA: Daniel Bernstein
* Departments of Pediatrics and Medicine, Columbia University Medical Center, New York, NY, USA: Wendy K. Chung
* Gladstone Institute of Cardiovascular Disease and University of California San Francisco, San Francisco, CA, USA: Deepak Srivastava
* Division of Pediatric Cardiology, University of Utah School of Medicine, Salt Lake City, UT, USA: Martin Tristani-Firouzi
* Department of Computer Science, Princeton University, Princeton, NJ, USA: Olga G. Troyanskaya
* Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Lab, Berkeley, CA, USA: Diane E. Dickel
* Department of Cardiology, Brigham and Women’s Hospital, Boston, MA, USA: Christine E. Seidman
* Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Bruce D. Gelb
* Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, USA: Bruce D. Gelb

AUTHORS Felix Richter, Sarah U. Morton, Seong Won Kim, Alexander Kitaygorodsky, Lauren K. Wasson, Kathleen M. Chen, Jian Zhou, Hongjian Qi, Nihir Patel, Steven R. DePalma, Michael Parfenov, Jason Homsy, Joshua M. Gorham, Kathryn B. Manheimer, Matthew Velinder, Andrew Farrell, Gabor Marth, Eric E. Schadt, Jonathan R. Kaltman, Jane W. Newburger, Alessandro Giardini, Elizabeth Goldmuntz, Martina Brueckner, Richard Kim, George A. Porter Jr., Daniel Bernstein, Wendy K. Chung, Deepak Srivastava, Martin Tristani-Firouzi, Olga G. Troyanskaya, Diane E. Dickel, Yufeng Shen, Jonathan G. Seidman, Christine E. Seidman and Bruce D. Gelb.

CONTRIBUTIONS F.R., S.U.M., S.W.K.,


A.K., L.K.W., K.M.C., J.R.K., O.G.T., D.E.D., Y.S., J.G.S., C.E.S. and B.D.G. conceived and designed the experiments/analyses. J.R.K., J.W.N., A.G., E.G., M.B., R.K., G.A.P., D.B., W.K.C.,


D.S., M.T.-F., J.G.S., C.E.S. and B.D.G. contributed to cohort ascertainment, phenotypic characterization and recruitment. F.R., S.U.M., A.K., H.Q., N.P., S.R.D., M.P., J.H., J.M.G., K.B.M.,


M.V., A.F., G.M., W.K.C., Y.S., J.G.S., C.E.S. and B.D.G. contributed to whole-genome sequencing production, validation and analysis. F.R., S.U.M., A.K., K.M.C., H.Q., E.E.S., O.G.T., Y.S.,


J.G.S., C.E.S. and B.D.G. contributed to statistical analyses. F.R., K.M.C., J.Z., O.G.T. and B.D.G. developed the HeartENN model. S.U.M., S.W.K., L.K.W., D.E.D., J.G.S. and C.E.S.


generated and analyzed fetal heart and iPSC data. F.R., S.U.M., S.W.K., A.K., L.K.W., K.M.C., Y.S., J.G.S., C.E.S. and B.D.G. wrote and reviewed the manuscript. All authors read and approved


the manuscript. CORRESPONDING AUTHOR Correspondence to Bruce D. Gelb. ETHICS DECLARATIONS COMPETING INTERESTS The authors declare no competing interests. ADDITIONAL INFORMATION PUBLISHER’S


NOTE Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. EXTENDED DATA EXTENDED DATA FIG. 1 OTHER PIPELINES IDENTIFIED 94%


OF DNVS IN CONTROL TRIOS. Overlaps with DNVs identified in 1,470 control trios with two other pipelines (refs. 9,10). Of note, a third analysis of these trios did not include _de novo_ calls (ref. 42). For


consistency with other pipelines, only SNVs were included and variants in LCRs, blacklists, segmental duplications, and repeats were excluded. Together, 94% of _de novo_ SNVs were called by


at least one other pipeline. EXTENDED DATA FIG. 2 CORRELATION BETWEEN PARENTAL AGE AT PROBAND BIRTH AND DNVS/TRIO. Multiple linear regression ($y = \beta_{\text{paternal age}}\,x_{\text{paternal age}} + \beta_{\text{maternal age}}\,x_{\text{maternal age}} + \beta_{\text{intercept}} + \varepsilon$, where $y$ is the number of DNVs per trio) was fitted on 763 CHD and 1,611 unaffected individuals to calculate the associations of paternal and maternal age with the number of SNVs, indels, and both combined. Regression


coefficients and _P_-values are shown, uncorrected for multiple hypotheses. Sequencing metric comparisons between the centers, colored by cases (_n_ = 763) and controls (_n_ = 1,611), found


moderate bias in DNV quantity, so the background statistical parameter throughout the manuscript is the total number of DNVs. Box plots show medians and interquartile ranges. EXTENDED DATA FIG.


3 _DE NOVO_ VARIANT (DNV) CHD-UNAFFECTED BURDEN. The number of DNVs in 184 noncoding annotations (points) genome-wide and within 10 kb of TSSs for 6 gene sets (facets) was counted in CHD


(_n_ = 749) and Simons unaffected (_n_ = 1,611) individuals. The _P_ value threshold (1.5 × 10−4, horizontal blue line) is 0.05 divided by the product of the number of effective annotations


(_n_ = 47) and number of gene sets (_n_ = 7). The _P_ value (_y_-axis) was calculated with a two-sided Fisher’s exact test; the odds ratio (_x_-axis) was $\mathrm{DNVs}_{\text{annotation,CHD}}/\mathrm{DNVs}_{\text{total,CHD}}$ vs.


$\mathrm{DNVs}_{\text{annotation,unaffected}}/\mathrm{DNVs}_{\text{total,unaffected}}$. No annotations surpassed the _P_ value threshold. CHD, congenital heart disease; HHE, high heart expression. EXTENDED DATA FIG. 4 HEARTENN


PERFORMANCE WAS COMPARABLE TO DEEPSEA. HeartENN ROC AUC mean = 0.93 and AUPRC mean = 0.34. ROC AUC, receiver operating characteristic area under the curve; AUPRC, area under the precision


recall curve. EXTENDED DATA FIG. 5 DETERMINING AN ABSOLUTE FUNCTIONAL DIFFERENCE SCORE RANGE. A, Comparison of HGMD disease mutations (blue, _n_ = 1,564) and polymorphism (gray, _n_ = 642)


DeepSEA absolute functional difference scores at varying functional cut-offs illustrates a similar distribution and functionally impactful range ≥0.1 (arrow) for disease mutations. No


statistical significance testing was performed. B, The similarity of null distributions for DeepSEA (gray, downsampled to 184 features) and HeartENN (red) HGMD polymorphism scores


suggested that the DeepSEA functional score range was also applicable to HeartENN (gray and red _n_ = 642). Scores of 0 are set off to the left (as 10−4). EXTENDED DATA FIG. 6 SUPPORT FOR HEARTENN ≥


0.1 FUNCTIONAL RANKING. For all DNVs (_n_ = 170,171), overlap between HeartENN ≥0.1 (_n_ = 6,415) and other noncoding scores was assessed with a two-sided Fisher’s exact test (left panel).


Case–control burden for these other noncoding scores (right panel) was statistically significant for CADD ≥15 (_P__Bonferroni_ = 0.019) with a two-sided Fisher’s exact test (cases _n_ =


56,164 and controls _n_ = 114,065). For both panels, unadjusted _P_-values are tabulated, and red indicates a Benjamini-Hochberg-adjusted _P_ value false discovery rate (FDR) < 0.05.
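The case–control burden comparison described in this caption can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a 2 × 2 table built from the DNV counts reported in the article (2,238 of 56,164 case DNVs and 4,177 of 114,065 control DNVs with HeartENN ≥0.1). The function names `fisher_exact_two_sided` and `benjamini_hochberg` are hypothetical, and the two-sided _P_ value follows the common convention of summing every table that is no more probable than the observed one, which may differ from the software used in the study.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Odds ratio and two-sided Fisher's exact P value for [[a, b], [c, d]].

    Assumed layout (illustrative): rows are cases and controls, columns are
    DNVs at or above the score cutoff and DNVs below it.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = log_comb(row1 + row2, col1)

    def log_prob(x):
        # Hypergeometric log-probability of x "hits" in the first row.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_denom

    log_obs = log_prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = 0.0
    for x in range(lo, hi + 1):
        lp = log_prob(x)
        if lp <= log_obs + 1e-7:  # tables no more probable than observed
            total += exp(lp)
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Counts taken from the article: DNVs with HeartENN >= 0.1 versus < 0.1.
case_hi, case_total = 2238, 56164
ctrl_hi, ctrl_total = 4177, 114065
odds_ratio, p_value = fisher_exact_two_sided(
    case_hi, case_total - case_hi, ctrl_hi, ctrl_total - ctrl_hi
)
print(f"odds ratio = {odds_ratio:.3f}, two-sided P = {p_value:.2e}")
```

When several scores are tested, their unadjusted _P_ values can be passed together to `benjamini_hochberg` and the adjusted values compared against the 0.05 FDR cutoff named in the caption.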


EXTENDED DATA FIG. 7 RELATIONSHIP BETWEEN SEQUENCE LENGTH INSERTED INTO THE PMPRA1 PLASMID AND THE TRANSCRIPT READS/PLASMID COPIES IN MPRAS. The length of the sequences inserted into the


pMPRA1 plasmid (_x_-axis) ranged from 300 to 1,600 bp. After transfection of four libraries (color coded as per key) into the iPSC–CMs, the resulting ratios of transcript reads (mRNA) per


plasmid copies (DNA) are graphed on the _y_-axis, showing no systematic relationship between insert length and transcriptional level. EXTENDED DATA FIG. 8 DNVS WITH A TREND TOWARDS DECREASED


EXPRESSION BY MPRA ASSAY. Box plots for two DNVs for which two MPRA replicates were significantly different but overall statistical significance across all replicates was not attained.


Box plots show the median fold change (FC), first and third quartiles (lower and upper hinges), and range of values (whiskers and outlying points). Statistical significance was assessed with


two-sided _t_-test Benjamini-Hochberg-adjusted _P_-values. Each box plot represents at least 3 independent experiments with 4 technical replicates each. EXTENDED DATA FIG. 9 FRACTION OF DNVS IN EACH


OF THE CANONICAL VARIANT CLASSES. The fraction was calculated separately within CHD and unaffected subjects for each of the three methods (including overlaps) and the total number of


variants in each group (right table). EXTENDED DATA FIG. 10 DNV ENRICHMENT IN PHENOTYPE SUBGROUPS. A, Enrichment of DNVs with predicted functional impacts (score ≥0.1) for HeartENN (left)


and DeepSEA (right) within phenotype subgroups. B, Enrichment of _de novo_ SNVs with H3K36me3 marks implicated in RNA-binding protein disruption in different subgroups for the most


significant (left) and highest effect size (right) hits. Both A and B were performed with a two-sided Fisher’s exact test (unadjusted _P_-values and 95% C.I.s shown) comparing the fraction


of DNVs in each subgroup (HeartENN ≥ 0.1, DeepSEA ≥ 0.1, etc.) to the same control cohort. For HeartENN, there were _n_ = 4,177 control DNVs with HeartENN ≥ 0.1 and _n_ = 109,888 control


DNVs with HeartENN < 0.1. NDD, neurodevelopmental disorder; ECA, extracardiac anomaly. SUPPLEMENTARY INFORMATION SUPPLEMENTARY INFORMATION Supplementary Note and Fig. 1 REPORTING SUMMARY


SUPPLEMENTARY TABLE Supplementary Tables 1–16 ABOUT THIS ARTICLE CITE THIS ARTICLE Richter, F., Morton, S.U., Kim, S.W. _et al._ Genomic


analyses implicate noncoding de novo variants in congenital heart disease. _Nat Genet_ 52, 769–777 (2020). https://doi.org/10.1038/s41588-020-0652-z * Received: 09 March


2019 * Accepted: 22 May 2020 * Published: 29 June 2020 * Issue Date: August 2020 * DOI: https://doi.org/10.1038/s41588-020-0652-z