Statistics


Thesaurus Statistics

PolySearch 2.0 expanded its custom thesauri from 9 to 20 categories, and from just 3,000 to over 1.13 million term entries. In particular, the thesauri now include toxins, food metabolites, biological taxonomies, and pathways, as well as Gene Ontology, MeSH terms, and ICD-10 codes. The thesauri also feature many manually curated terms and synonyms for health effects, drug effects, adverse effects, and chemical taxonomies. The table below summarizes the number of term entries and synonyms for each thesaurus.

| Thesaurus Name | Number of Terms | Number of Synonyms |
|---|---|---|
| Gene Families | 404 | 948 |
| Adverse Health Effects | 135 | 711 |
| Health Effects | 161 | 507 |
| Gene Ontology | 40535 | 110477 |
| Toxins | 3713 | 39095 |
| Biological Taxonomy | 607031 | 775728 |
| Drugs | 7670 | 37331 |
| ICD-10 Codes | 91737 | 155331 |
| Chemical Ontology | 4017 | 10098 |
| Tissues | 954 | 984 |
| MeSH Terms | 26956 | 215327 |
| Food Metabolites | 27509 | 39278 |
| Genes and Proteins | 27994 | 287827 |
| Drug Effects | 424 | 590 |
| Metabolic Pathways | 456 | 456 |
| MeSH Compounds | 221986 | 716676 |
| Human Metabolites | 41793 | 381195 |
| Organs | 104 | 201 |
| Subcellular Locations | 74 | 175 |
| Diseases | 27658 | 76001 |
| Total | 1131328 | 2848936 |

Database and Corpora Statistics

PolySearch 2.0 expanded the number of text corpora and databases by more than 80%, to a total of 6 free-text corpora and 14 bioinformatics databases. The latest server searches over 43 million articles, covering Medline abstracts, PubMed Central full-text articles, Wikipedia articles, US Patent abstracts, and open access textbooks.

| Source | Indexed Unit | Total Number Indexed |
|---|---|---|
| OMIM | Records | 23219 |
| T3DB | Records | 3713 |
| HMDB | Records | 41513 |
| MEDLINE | Documents | 27208664 |
| Wikipedia | Documents | 7619689 |
| USPTO | Documents | 7996999 |
| FooDB | Records | 27509 |
| KEGG Reactions | Records | 9538 |
| Gene Ontology | Records | 40535 |
| DailyMed | Records | 2745 |
| KEGG Pathways | Records | 456 |
| NCBI Books | Documents | 19066 |
| MedlinePlus | Documents | 1901 |
| PubMed Central | Documents | 704539 |
| SwissProt | Records | 541561 |
| MetaCyc | Records | 3810 |
| GAD | Records | 167298 |
| HPRD | Records | 18863 |
| DrugBank | Records | 6825 |
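
The figure of over 43 million articles can be checked against the counts above by adding up the six sources indexed as documents. The following is a minimal sketch of that arithmetic; the dictionary name `corpora` and the variable `total_documents` are illustrative choices and are not part of PolySearch itself.

```python
# Indexed document counts for the six free-text corpora,
# copied from the statistics listed above.
corpora = {
    "MEDLINE": 27208664,
    "Wikipedia": 7619689,
    "USPTO": 7996999,
    "NCBI Books": 19066,
    "MedlinePlus": 1901,
    "PubMed Central": 704539,
}

# Sum the per-corpus counts to get the size of the searchable text collection.
total_documents = sum(corpora.values())

print(f"{len(corpora)} corpora, {total_documents:,} indexed documents")
```

The sum comes to 43,550,858 documents, which agrees with the "over 43 million articles" quoted for the latest server.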


This project is supported by the Canadian Institutes of Health Research (award #111062), Alberta Innovates - Health Solutions, and by The Metabolomics Innovation Centre (TMIC), a nationally-funded research and core facility that supports a wide range of cutting-edge metabolomic studies. TMIC is funded by Genome Alberta, Genome British Columbia, and Genome Canada, a not-for-profit organization that is leading Canada's national genomics strategy with $900 million in funding from the federal government.