Introducing Alba: Harvard Bioinformatics Instructor & Researcher
Alba Gutiérrez-Sacristán, an instructor in Biomedical Informatics at Harvard Medical School, uses DISGENET in both her research and teaching.

“When they told me, I have been doing a search about this variant here and there and there, it’s like, did you check DISGENET? You could have got this done just in one search. We all like to save time. We all like to be efficient.”
– Alba
Objectives: Save Time, Ensure Accuracy & Enhance Student Learning
She teaches courses in systems biology, omics data analysis, and the use of databases, helping students navigate the integration of genomic data into their research. Her research integrates clinical and genomic data, with a focus on comorbidity studies and phenome-wide association studies (PheWAS).
Challenges
- Managing multiple databases: Searching ClinVar, UniProt, Orphanet, and others one by one is tedious and time-consuming.
- Keeping up with new research: With 8,000+ disease genomics papers published monthly, staying current by reading alone is not feasible.
- Ensuring data quality: Conflicting or low-quality data makes analysis difficult, leading to potential misinterpretation.
- Teaching students with varying skill levels: Some struggle with tools like APIs or R, limiting engagement with complex data.
- Prioritizing variants for analysis: High data volume and unclear allele frequencies make selection challenging.
- Verifying data provenance: Cross-referencing sources to confirm reliability is time-consuming.
- Finding population-specific data: Limited demographic information makes tailored analyses difficult.
The Solution: Introducing DISGENET
A Single Platform
DISGENET integrates a dozen curated databases, eliminating the laborious task of searching multiple sources.
Always Up-To-Date
DISGENET uses state-of-the-art NLP to extract and standardize disease genomics findings from publications and clinical trials, with updates released every quarter.
Smart Data Filtering
Metrics like the DISGENET Score rank associations by evidence quality, while the Evidence Index (EI) highlights contradictions in research.
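As a rough illustration of how these two metrics might be combined, the sketch below ranks a few placeholder associations by score and flags those whose Evidence Index falls below 1. The record fields (`gene`, `disease`, `score`, `ei`), the sample values, and the reading of an EI below 1 as "some publications report contradictory evidence" are assumptions made for the example; they are not taken from the DISGENET API or from Alba's own workflow.

```python
from dataclasses import dataclass


@dataclass
class Association:
    gene: str
    disease: str
    score: float  # hypothetical field: higher means stronger supporting evidence
    ei: float     # hypothetical field: assumed share of publications supporting the association


def rank_associations(records, min_score=0.0):
    """Return records with score >= min_score, highest score first."""
    kept = [r for r in records if r.score >= min_score]
    return sorted(kept, key=lambda r: r.score, reverse=True)


def has_conflicting_evidence(record):
    """Assumption: an EI below 1 means some publications contradict the association."""
    return record.ei < 1.0


# Made-up placeholder records, not real DISGENET data.
records = [
    Association("GENE_A", "Disease X", score=0.9, ei=1.0),
    Association("GENE_B", "Disease X", score=0.4, ei=0.8),
    Association("GENE_C", "Disease X", score=0.2, ei=1.0),
]

ranked = rank_associations(records, min_score=0.3)
for r in ranked:
    note = "conflicting reports" if has_conflicting_evidence(r) else "consistent"
    print(f"{r.gene}\t{r.score:.2f}\t{note}")
```

A single threshold such as `min_score` gives students one setting to adjust while they watch how the shortlist changes, and the conflict flag points them to associations whose literature is worth reading more closely.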
Accessible for All Skill Levels
Beginners use the web interface, while advanced users automate their work through the Cytoscape app, the REST API, or the R package.
Better Variant Prioritization
Alba uses DISGENET’s allele frequency data (e.g., from gnomAD) together with its disease association data to focus on high-impact variants.
Simplified Data Provenance
DISGENET tracks the source of each association, drawing on 1.5M+ supporting publications, which reduces the need for manual cross-referencing.
Population-Specific Filtering
The Ancestry filter (introduced in v25.1) enables tailored studies based on demographic data.
“In our lab, it’s all about integrating clinical and genomic data. We have information from many databases. So when we need to develop case studies that make clinical sense, DISGENET is a resource I use a lot. All day. It’s on my tabs… Common tabs that you go back to!”
– Alba
Results
Teaching Impact
Hands-On Learning: Students apply theoretical concepts through real-world data analysis using SNP-phenotype associations and differential expression, preparing them for future bioinformatics roles.
Efficiency in Teaching: DISGENET streamlines data access, allowing Alba to focus on teaching instead of data aggregation, enhancing the quality of instruction and student experience.
Exposure to Industry-Standard Tools: Students gain experience with tools used in academic and industry bioinformatics research, equipping them for careers in research, clinical settings, or biotech.
Research Impact
Supporting Comorbidity Studies: DISGENET enables Alba to investigate genetic links between diseases that co-occur. By comparing genetic data with clinical phenotypes, Alba identifies shared genetic factors, even when genomic data is missing from electronic health records (EHRs) or claims data.
Facilitating Phenome-Wide Association Studies (PheWAS): DISGENET allows Alba’s team to prioritize high-confidence SNP-phenotype associations based on strong evidence and statistical significance, leading to novel insights into complex traits and genetic underpinnings of diseases.
Integration of Clinical and Genomic Data: DISGENET links clinical phenotypes with genomic variants, helping ground research in the most up-to-date literature and ensuring findings are informed by reliable data.
Refining Research Hypotheses: The platform cross-references gene and variant data, helping Alba check her hypotheses against well-supported genomic associations before pursuing them.
Institutional Benefits
Enhanced Research Throughput and Cost Savings: Universities can reduce costs by consolidating multiple licenses with DISGENET’s platform, speeding up research and lowering time-to-discovery.
Future-Proofing Research: DISGENET’s evolving roadmap, including AI and big data integration, is designed to keep it at the cutting edge of genomics and personalized medicine.
