Preclinical informatics: HDinHD
In our role as a collaborative enabler, CHDI has made a concerted effort to disseminate both preclinical and clinical data to the wider HD scientific community. In support of preclinical research, CHDI has deposited primary data, much of it unpublished, in community databases and has developed a website, Huntington’s Disease in High-Definition (HDinHD), to:
- Provide HD-related primary scientific data;
- Share analyses and computational models derived from such data;
- Provide browsing and data interrogation tools that facilitate data exploration and hypothesis generation; and
- Establish a forum for HD researchers to highlight their data, tools, know-how and insight to the community.
CHDI continues to submit a substantial body of gene and protein expression data, spanning a number of tissues and ages, including data from the Mouse Htt CAG-allelic Series project. The data are deposited in databases maintained at the National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI). Gene expression data can be found at NCBI’s Gene Expression Omnibus (GEO). Protein expression data can be found at EBI’s PRoteomics IDEntifications (PRIDE) database.
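As a rough illustration of how deposited gene expression records might be located programmatically, the sketch below builds a search URL for NCBI's public E-utilities ESearch endpoint against the GEO DataSets database (`gds`). The search term and the helper name `build_geo_search_url` are assumptions made for this example; they are not taken from HDinHD or CHDI documentation, and the sketch only constructs the URL rather than sending a request.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (public API).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(term, max_results=20):
    """Return an ESearch URL that queries GEO DataSets (db=gds) for `term`.

    `term` is a free-text query; the value used below is a hypothetical
    example, not an official accession or keyword list.
    """
    params = {
        "db": "gds",            # Entrez database name for GEO DataSets
        "term": term,           # free-text search expression
        "retmax": max_results,  # maximum number of record IDs to return
        "retmode": "json",      # request a JSON response
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical search term; adjust to the dataset of interest.
url = build_geo_search_url("Htt CAG knock-in mouse")
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return a list of matching GEO record identifiers that can then be looked up on the GEO website.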
The Mouse Htt CAG-allelic Series Project is a collaboration between CHDI, Massachusetts General Hospital, and PsychoGenics. This cross-sectional study has generated a coherent dataset from a series of mHtt knock-in mice (Langfelder et al., 2016) with increasingly long CAG repeats (ranging from 18 to ~175 CAGs), which code for increasingly long polyglutamine stretches within the mHTT protein. As in other triplet repeat diseases, the age of onset of HD is inversely correlated with the length of the repeat expansion. By studying mice with varying CAG repeat lengths at different ages, we aim to identify disease-related changes that correlate with the length of the disease-causing CAG repeat expansion. Molecular profiling data (mRNAseq, miRNAseq, proteomics) from >4200 mouse central and peripheral tissue samples have been deposited in the community repositories GEO and PRIDE.
Currently, HDinHD highlights the availability of the Htt CAG-allelic series data and hosts Htt CAG-allelic series mouse behavioral data generated by PsychoGenics (since there is currently no best-practice community repository for such data). Registered users can also find a master sample annotation report that provides key meta-data for all tissue samples and maps all transcriptomics, proteomics, and behavioral results back to individual Htt CAG-allelic series mice. This report provides context, allowing researchers to perform their own integrative analyses over molecular and behavioral results from Htt CAG-allelic Series mice.
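To make the role of the master sample annotation report more concrete, here is a minimal sketch of how sample-level molecular results and mouse-level behavioral results could be joined through such a report. All field names (`sample_id`, `mouse_id`, `tissue`, `cag_length`) and all values are hypothetical placeholders; the actual report layout on HDinHD may differ.

```python
from collections import defaultdict

# Hypothetical annotation rows: one row per tissue sample, linking it to a mouse.
sample_annotation = [
    {"sample_id": "S1", "mouse_id": "M1", "tissue": "striatum", "cag_length": 111},
    {"sample_id": "S2", "mouse_id": "M1", "tissue": "liver", "cag_length": 111},
    {"sample_id": "S3", "mouse_id": "M2", "tissue": "striatum", "cag_length": 175},
]

# Hypothetical per-sample expression values for a single gene.
expression_by_sample = {"S1": 8.2, "S2": 7.9, "S3": 6.1}

# Hypothetical per-mouse behavioral scores.
behavior_by_mouse = {"M1": 0.42, "M2": 0.77}


def integrate(annotation, expression, behavior):
    """Group expression values by mouse and tissue, and attach behavior scores."""
    per_mouse = defaultdict(
        lambda: {"cag_length": None, "expression": {}, "behavior": None}
    )
    for row in annotation:
        record = per_mouse[row["mouse_id"]]
        record["cag_length"] = row["cag_length"]
        record["behavior"] = behavior.get(row["mouse_id"])
        if row["sample_id"] in expression:
            record["expression"][row["tissue"]] = expression[row["sample_id"]]
    return dict(per_mouse)


integrated = integrate(sample_annotation, expression_by_sample, behavior_by_mouse)
for mouse_id, record in sorted(integrated.items()):
    print(mouse_id, record)
```

The annotation rows act as the join key between the two result types: samples map to mice, so any sample-level measurement can be compared with a mouse-level behavioral score and the mouse's CAG repeat length.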
HDinHD describes and distributes causal models and model simulation results developed by GNS Healthcare on multi-modal data generated from Htt CAG-allelic Series mice. Datasets are distributed in several different formats to be compatible with best-practice open-source life science toolkits, enabling users to interrogate, visualize and explore networks, pathways and other biological data.
In light of growing evidence from human genome-wide association studies (GWAS) in HD gene-expansion carriers that DNA damage pathway genes contribute to modifying aspects of the disease (Lee et al., 2015), HDinHD provides a comprehensive literature review, gene lists, and visual and computable pathway maps corresponding to four pathways of the DNA damage response mechanism: base excision repair, nucleotide excision repair, mismatch repair, and inter-strand crosslink repair.
HDinHD also provides access to several web-based tools:
- HD Explorer, an integrated network of HD experimental data curated and analyzed from the literature, community ‘omics repositories and previously unreleased internal CHDI reports.
- HD Proteome Base, a query and visualization tool allowing interrogation of cross-sectional proteomics profiling results from the Mouse Htt CAG-allelic Series Project. The user-friendly web portal allows the researcher to query for proteins and to visualize their expression across the CAG repeat-length series and across different brain and peripheral tissues.
- ASViewer, a visualization tool highlighting cross-sectional transcriptomic and proteomic expression across brain and peripheral tissues.
HDinHD also links out to several other HD research websites, including:
- Genetic Modifiers of Huntington’s Disease (GeM-HD) Consortium website, which provides results from human GWAS looking to identify genetic modifiers of HD.
- BioGemix Suite, including (i) a browsable knowledge-base that integrates information obtained from HD model mice and other models using precision machine-learning, organized by gene products, with links to individual tools and databases, and (ii) Biogemix-3D, a 3D-visualisation of dimensional RNA-seq data in brain structures of HD model mice.
The HDinHD website originated as a collaboration between CHDI and Giovanni Coppola at UCLA.
Frequently Asked Questions
Do I need to register for an account to use HDinHD?
Yes, you must register to access the full site.
Does CHDI make all of its data public?
While we strive to share as much data as possible, some cannot be shared immediately; see CHDI’s Data, Reagents, and Biomaterials Sharing Policy.
Our lab has produced data and tools that would be useful to the HD research community. Can I contribute these data and tools to HDinHD?
I have some suggestions on how to improve HDinHD. How do I share these with you?
Some of the datasets available on HDinHD are complex. Is there someone we can speak to at CHDI to help provide further background and information?
Yes, email us at HDinHD@chdifoundation.org and we’ll be happy to help.
There is a lot of material on HDinHD. How can I tell what is new since I last looked?
Substantial new features, both data and tools, are highlighted on the New in HDinHD tab on the website.
Suggested further reading
- Aaronson J et al. HDinHD: A Rich Data Portal for Huntington’s Disease Research. J Huntingtons Dis. (2021) 10:405-412. [PMID:34397420]
- Langfelder P et al. MicroRNA signatures of endogenous Huntingtin CAG repeat expansion in mice. PLoS One (2018) 13:e0190550. [PMID:29324753]
- Langfelder P et al. Integrated genomics and proteomics define huntingtin CAG length-dependent networks in mice. Nat Neurosci. (2016) 19:623 [PMID:26900923]
- Alexandrov V et al. Large-scale phenome analysis defines a behavioral signature for Huntington’s disease genotype in mice. Nat Biotechnol. (2016) 34:838 [PMID:27376585]
- Lee JM et al. Identification of genetic factors that modify clinical onset of Huntington’s disease. Cell (2015) 162:516 [PMID:26232222]