Data Mining in Drug Development and Translational Medicine
Author: Hermann Mucke, PhD
The biopharmaceutical industry is grappling not only with sheer data volume but also with researchers' ability to extract information by identifying and contextually analyzing the data relevant to a particular set of investigations. This report examines:
- Techniques, technology, and software used in life science data mining
- Data mining for early preclinical safety assessments
- Data mining in clinical trials
- Data mining in pharmacovigilance
- Business models and solutions in drug development bioinformatics
The mountain of data generated and stored is growing ever higher. The information content of life science data is multidimensional and not readily accessible by merely looking at the output. Unless such data can be put into proper context and interpreted (that is, mined), their value remains only potential. Data Mining in Drug Development and Translational Medicine examines data mining challenges and approaches in pharmaceutical R&D.
The pharmaceutical industry has made decisive moves to improve the predictiveness of early-stage drug safety testing. These efforts generate large amounts of data in which clues to potential safety-related “red flags” can be buried. In this context we examine options for mining different types of text data, “pathway mining” for pathway-related effects of a compound, and the multidimensional output of high-content screening methods. Also examined are approaches to mining data generated in preclinical trials to identify toxicity signatures.
Much more clinical trial data are captured than are actually analyzed to build the regulatory data file. Clinical databases can thus be mined for information that the respective study was not explicitly designed to provide. Data Mining in Drug Development and Translational Medicine describes how mining data from investigational human trials can reveal hidden information with the potential to greatly improve the understanding of drug mechanisms, of the efficacy and side-effect behavior of drug candidates in various patient subpopulations, and even of the integrity of clinical investigators. We also look at text mining of literature and patent databases, which allows knowledge discovery about activity in a particular field of therapeutic development from many different angles.
Pharmacovigilance is a field where large volumes of interconnected data have to be analyzed in many dimensions. We describe various databases used in support of post-market drug safety evaluation, including those maintained by the FDA, WHO, and EMEA. Data mining algorithms applied to pharmacovigilance databases and efforts to bring separate databases into full compatibility with one another are described. Case studies illustrating the use of data mining and analysis to investigate relationships between marketed drugs and adverse events are presented.
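As an illustration of the kind of algorithm applied to spontaneous-report databases, the sketch below computes a proportional reporting ratio (PRR) with an approximate 95% confidence interval from a 2x2 table of report counts. The PRR is one widely used disproportionality measure; it is chosen here as an example and is not necessarily the method covered in the report. The function name, its parameters, and the counts in the usage lines are hypothetical.

```python
import math


def proportional_reporting_ratio(a, b, c, d):
    """Return (prr, lower, upper) for a 2x2 table of report counts.

    a: reports naming the drug of interest AND the event of interest
    b: reports naming the drug of interest with any other event
    c: reports naming any other drug AND the event of interest
    d: reports naming any other drug with any other event

    The interval is the usual log-normal approximation at the 95% level.
    """
    if a <= 0 or c <= 0 or b < 0 or d < 0:
        raise ValueError("a and c must be positive; b and d must be non-negative")

    # Share of the drug's reports that mention the event, divided by
    # the same share among reports for all other drugs.
    prr = (a / (a + b)) / (c / (c + d))

    # Standard error of ln(PRR).
    se = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(prr) - 1.96 * se)
    upper = math.exp(math.log(prr) + 1.96 * se)
    return prr, lower, upper


# Hypothetical counts, for illustration only (not real safety data).
prr, lower, upper = proportional_reporting_ratio(a=20, b=80, c=100, d=9800)
print(f"PRR = {prr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

A lower confidence bound above 1 is commonly read as a signal worth clinical review. It does not establish that the drug caused the event, because spontaneous reports are subject to reporting bias and lack exposure denominators.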
Data Mining in Drug Development and Translational Medicine concludes by profiling the most significant vendors that either offer dedicated solutions for data mining in drug development and pharmacovigilance, or provide more general commercial data mining solutions that have been successfully adapted and applied to these endeavors.
About the Author
Hermann A.M. Mucke, PhD, spent 17 years in academia and industry before founding H.M. Pharma Consultancy (www.hmpharmacon.com) in 2000 to become an independent pharmaceutical consultant, analyst, and science author. His last industry position was Vice President R&D at a European pharmaceutical company, which he helped to take public on the Frankfurt Stock Exchange in 1999. Since then, Dr. Mucke, who holds a PhD in biochemistry from the University of Vienna (Austria), has served as a consultant and advisory board member for several European and American pharmaceutical companies and as a regular reviewer of drugs and patents for Thomson Current Drugs and Ashley Publications. Dr. Mucke is based in Vienna.