How Harvard Accelerates Research & Student Learning with DISGENET

Introducing Alba: Harvard Bioinformatics Instructor & Researcher

Alba Gutiérrez-Sacristán, an instructor in Biomedical Informatics at Harvard Medical School, uses DISGENET in both her research and teaching.

When they told me, “I have been doing a search about this variant here and there and there,” it’s like, “Did you check DISGENET? You could have got this done just in one search.” We all like to save time. We all like to be efficient.

– Alba

Objectives: Save Time, Ensure Accuracy & Enhance Student Learning

Alba teaches courses in systems biology, omics data analysis, and the use of databases, helping students integrate genomic data into their research. Her own research combines clinical and genomic data, with a focus on comorbidity studies and phenome-wide association studies (PheWAS).

Challenges

  • Managing multiple databases: Searching ClinVar, UniProt, Orphanet and others is tedious and time-consuming.
  • Keeping up with new research: With 8,000+ disease genomics papers published monthly, staying current is impossible.
  • Ensuring data quality: Conflicting or low-quality data makes analysis difficult, leading to potential misinterpretation.
  • Teaching students with varying skill levels: Some struggle with tools like APIs or R, limiting engagement with complex data.
  • Prioritizing variants for analysis: High data volume and unclear allele frequencies make selection challenging.
  • Verifying data provenance: Cross-referencing sources to confirm reliability is time-consuming.
  • Finding population-specific data: Limited demographic information makes tailored analyses difficult.

The Solution: Introducing DISGENET

A Single Platform

DISGENET integrates a dozen curated databases, eliminating the laborious task of searching multiple sources.

Always Up-To-Date

DISGENET uses NLP to extract and standardize disease genomics findings from publications and clinical trials, and refreshes them every quarter.

Smart Data Filtering

Metrics like the DISGENET Score rank associations by evidence quality, while the Evidence Index (EI) highlights contradictions in research.
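The filtering idea above can be sketched in a few lines of Python. This is only an illustration, not DISGENET's own code or API: the field names (`score`, `ei`), the 0.3 cutoff, and the records are made-up placeholders, and the sketch assumes both metrics lie between 0 and 1, with an EI below 1 signalling that at least one publication reports no association.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """Hypothetical record shape; real exports may use different field names."""
    gene: str
    disease: str
    score: float  # assumed range 0-1, higher = stronger evidence
    ei: float     # assumed range 0-1, 1.0 = no contradicting reports


def rank_associations(records, min_score=0.3):
    """Keep associations at or above min_score, strongest first."""
    kept = [r for r in records if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)


def has_conflicting_evidence(record):
    """Flag associations where some publications disagree (EI below 1)."""
    return record.ei < 1.0


# Placeholder records for illustration only, not real DISGENET data.
records = [
    Association("GENE_A", "Disease X", score=0.85, ei=1.0),
    Association("GENE_B", "Disease X", score=0.40, ei=0.7),
    Association("GENE_C", "Disease X", score=0.10, ei=1.0),
]

ranked = rank_associations(records)
for r in ranked:
    note = "conflicting reports" if has_conflicting_evidence(r) else "consistent"
    print(f"{r.gene}\t{r.score:.2f}\t{note}")
```

Keeping the score cutoff and the EI check as separate steps lets a student first shortlist by evidence strength and then decide how to treat the contested associations, rather than silently dropping them.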

Accessible for All Skill Levels

Beginners can use the web interface, while advanced users can automate their work through Cytoscape, the REST API, or the R package.

Better Variant Prioritization

Alba uses DISGENET’s allele frequency data (e.g., from gnomAD) together with its disease association data to focus on high-impact variants.
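One way to picture this kind of prioritization is the short Python sketch below. It is an assumption-laden example rather than Alba's actual workflow: the field names, the thresholds (allele frequency at most 0.01, score at least 0.5), and the variants are hypothetical, and treating rare, well-supported variants as "high-impact" is just one common heuristic.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    """Hypothetical record shape for illustration."""
    variant_id: str
    allele_freq: Optional[float]  # None when no frequency is available
    score: float                  # assumed association score, 0-1


def prioritize(variants, max_af=0.01, min_score=0.5):
    """Keep rare, well-supported variants; skip those with unknown frequency."""
    selected = [
        v for v in variants
        if v.allele_freq is not None
        and v.allele_freq <= max_af
        and v.score >= min_score
    ]
    # Strongest association first; ties broken by lower allele frequency.
    return sorted(selected, key=lambda v: (-v.score, v.allele_freq))


# Placeholder variants for illustration only, not real data.
variants = [
    Variant("var_1", 0.0004, 0.8),
    Variant("var_2", 0.2500, 0.9),  # too common
    Variant("var_3", None, 0.7),    # frequency unknown
    Variant("var_4", 0.0050, 0.6),
    Variant("var_5", 0.0010, 0.2),  # weak evidence
]

shortlist = prioritize(variants)
print([v.variant_id for v in shortlist])
```

Variants with no recorded allele frequency are excluded here for simplicity; in practice they might instead be set aside for manual review.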

Simplified Data Provenance

DISGENET tracks data sources, with 1.5M+ supporting publications, reducing the need for manual cross-referencing.

Population-Specific Filtering

The Ancestry filter (introduced in v25.1) enables tailored studies based on demographic data.

In our lab, it’s all about integrating clinical and genomic data. We have information from many databases. So when we need to develop case studies that make clinical sense, DISGENET is a resource I use a lot. All day. It’s on my tabs… Common tabs that you go back to!

– Alba

Results

Teaching Impact

Hands-On Learning: Students apply theoretical concepts through real-world data analysis using SNP-phenotype associations and differential expression, preparing them for future bioinformatics roles.

Efficiency in Teaching: DISGENET streamlines data access, allowing Alba to focus on teaching instead of data aggregation, enhancing the quality of instruction and student experience.

Exposure to Industry-Standard Tools: Students gain experience with tools used in academic and industry bioinformatics research, equipping them for careers in research, clinical settings, or biotech.

Research Impact

Supporting Comorbidity Studies: DISGENET enables Alba to investigate genetic links between diseases that co-occur. By comparing genetic data with clinical phenotypes, Alba identifies shared genetic factors, even when genomic data is missing from electronic health records (EHRs) or claims data.

Facilitating Phenome-Wide Association Studies (PheWAS): DISGENET allows Alba’s team to prioritize high-confidence SNP-phenotype associations based on strong evidence and statistical significance, leading to novel insights into complex traits and genetic underpinnings of diseases.

Integration of Clinical and Genomic Data: DISGENET links clinical phenotypes with genomic variants, helping ground research in the most up-to-date literature and ensuring findings are informed by reliable data.

Refining Research Hypotheses: The platform cross-references gene and variant data, helping Alba check that her hypotheses rest on well-supported genomic associations.

Institutional Benefits

Enhanced Research Throughput and Cost Savings: Universities can reduce costs by consolidating multiple licenses with DISGENET’s platform, speeding up research and lowering time-to-discovery.

Future-Proofing Research: DISGENET’s evolving roadmap, including AI and big data integration, ensures it remains at the cutting edge of genomics and personalized medicine.