Downloads


Citation Policy

PolySearch2 is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes requires explicit permission of the authors and explicit acknowledgment of the source material (PolySearch2) and the original publication (see below). We ask that users who download significant portions of the database cite the PolySearch2 paper in any resulting publications. Thank you very much.

Downloads for the Legacy PolySearch can be found here.

PolySearch2 Thesaurus (in zipped TSV Format)

Data Set    Download
  Drugs TSV
  Toxins TSV
  Human Metabolites TSV
  Food Metabolites TSV
  Genes and Proteins TSV
  Gene Family TSV
  Metabolic Pathways TSV
  Subcellular Locations TSV
  Tissues TSV
  Organs TSV
  Biological Taxonomy TSV
  Diseases TSV
  Drug Effects TSV
  Adverse Health Effects TSV
  Health Effects TSV
  Wishart Lab's Chemical Ontology TSV
  Gene Ontology Terms TSV
  MeSH Terms TSV
  MeSH Compounds TSV
  ICD-10 Codes TSV
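
Since each thesaurus is distributed as a zipped TSV file, a download can be read without unpacking it to disk. The following is a minimal sketch only: the function name `read_zipped_tsv` is hypothetical, and it assumes the archive contains at least one UTF-8 encoded `.tsv` member. This page does not document the column layout, so rows are returned as plain lists of strings.

```python
import csv
import io
import zipfile


def read_zipped_tsv(zip_source):
    """Return the rows of the first .tsv member of a zip archive.

    zip_source may be a file path or a binary file-like object.
    Assumption: the member is UTF-8 encoded and tab-delimited.
    """
    with zipfile.ZipFile(zip_source) as archive:
        tsv_names = [n for n in archive.namelist() if n.lower().endswith(".tsv")]
        if not tsv_names:
            raise ValueError("no .tsv member found in archive")
        with archive.open(tsv_names[0]) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # QUOTE_NONE treats quote characters as ordinary text,
            # which suits free-text term lists.
            return list(csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE))
```

For example, `rows = read_zipped_tsv("drugs.zip")` would load a downloaded archive, where `drugs.zip` is a placeholder file name rather than the actual download name.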

PolySearch2 System Filter Words (in zipped JSON Format)

Data Set    Download
  PolySearch2 Filter Words JSON
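
The filter words are distributed as zipped JSON, so they can be parsed directly from the archive. This is a sketch under assumptions: `read_zipped_json` is a hypothetical helper name, the archive is assumed to hold at least one `.json` member, and the structure of the parsed object is not documented on this page, so the function returns whatever the JSON decodes to.

```python
import json
import zipfile


def read_zipped_json(zip_source):
    """Parse the first .json member of a zip archive.

    zip_source may be a file path or a binary file-like object.
    The shape of the returned object depends on the file itself.
    """
    with zipfile.ZipFile(zip_source) as archive:
        json_names = [n for n in archive.namelist() if n.lower().endswith(".json")]
        if not json_names:
            raise ValueError("no .json member found in archive")
        with archive.open(json_names[0]) as raw:
            # json.load accepts a binary file object and detects the encoding.
            return json.load(raw)
```

Inspecting `type(...)` of the result is a reasonable first step before relying on any particular keys.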

PolySearch2 Evaluation Datasets (in zipped TSV Format)

The evaluation page summarizes the performance evaluation and feature comparison of PolySearch 2.0 versus the original PolySearch. Evaluations #1-#4 were conducted using the legacy PolySearch evaluation datasets. Evaluation #1 assesses PolySearch2’s ability to identify disease-gene associations. Evaluation #2 assesses PolySearch2’s ability to identify drug-gene/protein associations. Evaluation #3 assesses PolySearch2’s ability to identify protein-protein interactions. Evaluation #4 assesses PolySearch2’s ability to identify metabolite-gene associations. Evaluation #5 assesses PolySearch2’s ability to identify drugs with significant adverse effects, or ‘dangerous drugs’. Evaluation #6 assesses PolySearch2’s ability to identify toxin-disease associations. Evaluation #7 assesses PolySearch2’s ability to identify toxin-adverse effect associations. Finally, Evaluation #8 assesses PolySearch2’s ability to find associated disease concepts when presented with biomedical question sentences.

All evaluation datasets are available below in zipped TSV format. The Complete version contains the full PolySearch2 validation results, including Z-scores, result assessments, and marked-up reference and text snippets. The Reduced version contains ground-truth-only datasets, with only the associated entity pairs and plain-text reference snippets.

Data Set    Download
  Evaluation 1: Disease / Gene Associations Complete Reduced
  Evaluation 2: Drug / Gene Associations Complete Reduced
  Evaluation 3: Protein / Protein Interactions Complete Reduced
  Evaluation 4: Metabolite / Enzyme Interactions Complete Reduced
  Evaluation 5: Drug with Negative Health Effects Complete Reduced
  Evaluation 6: Toxin with Negative Health Effects Complete Reduced
  Evaluation 7: Toxin / Disease Associations Complete Reduced
  Evaluation 8: BioASQ Question / Disease Associations Complete Reduced


This project is supported by the Canadian Institutes of Health Research (award #111062), Alberta Innovates - Health Solutions, and by The Metabolomics Innovation Centre (TMIC), a nationally-funded research and core facility that supports a wide range of cutting-edge metabolomic studies. TMIC is funded by Genome Alberta, Genome British Columbia, and Genome Canada, a not-for-profit organization that is leading Canada's national genomics strategy with $900 million in funding from the federal government.