Cambridge Healthtech Institute’s Seventh Annual

Integrated Pharma Informatics & Data Science

From R&D to Real World Data

February 16-18, 2015 | Moscone North Convention Center | San Francisco, CA
Part of the 22nd Annual Molecular Medicine Tri-Conference


About this Conference:

Although pharma R&D and technology spending continues to increase, overall productivity is decreasing. To address this productivity gap, pharma and biotech must effectively manage and integrate data from all stages of the pharmaceutical value chain to enable more informed decisions. This is seen in the emerging trend of dedicated data science groups in pharma and the growing numbers of data scientists.

CHI's Seventh Annual Integrated Pharma Informatics & Data Science conference explores the transformation of current IT and informatics teams into data science groups and current progress made by such groups in the analysis, integration and visualization of complex data sets, including genomic, imaging, clinical, external/collaboration, and real world data. Dedicated sessions at the conference will focus on visualization and analysis of big data, and informatics in support of collaboration and externalization.

Monday, February 16

10:30 am Conference Program Registration


11:50 Chairperson’s Opening Remarks

Michael H. Elliott, CEO, Atrium Research & Consulting LLC

12:00 pm Using Informatics to Enable Precision Medicine in Oncology

Susie Stephens, Senior Director, Oncology & West Coast IT, Pfizer

Successfully enabling precision medicine for oncology requires a robust strategy for working with data, implementing analysis pipelines, and sharing results of analyses with scientists. This presentation highlights capabilities that have been enabled in these areas through a close collaboration between Oncology Research and Research IT.

12:30 Pharmacophore Informatics - Integrate, Analyze, and Visualize Pharma Research Data

Andreas Friese, Head, Research-IT, Bayer HealthCare R&D-IT, Bayer HealthCare AG

Pharmacophore Informatics (PIx) is an IT solution for Bayer HealthCare’s Research Organization. The system uses “the power of data” to help the scientists to find the best drug candidates. Pharma research data is integrated, analyzed, and visualized - all in one system for easy usage by the scientist.

1:00 Session Break

1:15 Luncheon Presentation I: Bioinformatic and Information Solutions to Unlock the Power of Omics Data in Precision Medicine

Colin Williams, Director, Product Strategy, Intellectual Property & Science, Thomson Reuters

As volumes of omics data and the number of applications in research and clinical care grow rapidly, organisations are faced with the problem of extracting relevant insights from that data to understand disease, identify drug targets and interpret patient genomic profiles. This presentation will talk about Thomson Reuters solutions to manage omics data and extract the relevant insights from this complex data through manual curation of information and sophisticated bioinformatics.


1:45 Luncheon Presentation II: Integrating Data is the Key to Translational Research and the Future of Personalized Medicine

Daniel Weaver, Ph.D., Senior Product Manager, Translational Medicine Informatics, PerkinElmer, Inc.

Emerging technologies are driving Translational Medicine research and PerkinElmer is developing tools, platforms, and algorithms to generate, analyze, visualize and store those data. This talk will describe how we integrate high-content data with clinical observations to enable our customers to derive and test unique hypotheses.

2:15 Session Break


2:30 Chairperson’s Remarks

Arturo J. Morales, Ph.D., Vice President, Informatics, Beryllium Corp.

2:40 Creating Public & Private Collaborations & Partnerships with Academic, Governmental and Industrial Partners around the Globe to Enable New Innovations to Take Patient Care “Beyond the Pill”

Robert J. Boland, Senior Manager, External Innovation R&D IT, Janssen, Pharmaceutical Companies of Johnson & Johnson

This talk gives an overview of how translational research can improve overall results within R&D and how the technology architecture provides a new method for working collaboratively across partners. I will discuss how collaborative research can be achieved while remaining secure and meeting all regulations, show a visual of what a translational research environment would look like, and give examples of the types of research that can be developed in this type of environment.

3:10 Enabling Discovery Research through Partnerships, Collaboration Tools and Shared Transparency

Arturo J. Morales, Ph.D., Vice President, Informatics, Beryllium Corp.

In the age of externalization and research collaborations, informatics systems play a crucial role and must evolve. Although the exchange of files through portals and email keeps the process going, we must improve the transparency and data flow between systems to lower the physical barriers that we put in place.

3:40 Examining the Landscape of Solutions for Virtualized Research Ecosystems

Michael H. Elliott, CEO, Atrium Research & Consulting LLC

Spending on Virtualized Research Ecosystems is growing over 20%, creating new data integration and information management challenges. Until recently, point solutions such as provisioning an internal system were the norm. However, this is inadequate for a multi-partner ecosystem. This talk examines trends, tools, and the impact of the cloud.

4:10 Helping Our Clients Succeed in Their Distributed R&D Environments by Delivering Excellence in Scientific and Laboratory Informatics

John F. Conway, Global Director, R&D Strategy and Solutions, LabAnswer, LLC

Many organizations have chosen to distribute or externalize large portions of their R&D. Consequently, these same organizations are struggling to collaborate with their external partners. Sharing and capturing data and information in these environments requires extra (inefficient) effort. Through discussion and case studies, attendees will see firsthand how LabAnswer is helping our clients develop strategies, technologies and best practices that help solve some of the headaches associated with the distributed R&D business model.

4:40 Break and Transition to Plenary Session

5:00 Plenary Session Panel 

6:00 Grand Opening Reception in the Exhibit Hall with Poster Viewing

7:30 Close of Day

Tuesday, February 17

7:00 am Registration and Morning Coffee

8:00 Plenary Session Panel 

9:00 Refreshment Break in the Exhibit Hall with Poster Viewing


10:05 Chairperson’s Remarks

Martin Leach, Ph.D., Vice President, R&D IT, Biogen Idec

10:15 Experience & Challenges of Creating and Implementing a Data Science Function

Juergen Hammer, Ph.D., MBA, Roche Pharmaceutical Research and Early Development; Center Head, Informatics/IT; Global Head, Data Science, Roche Innovation Center New York

The Data Scientist profession has been named “The Sexiest Job of the 21st Century” and is often considered an essential component of Big Data Analytics. However, the Data Scientist Function in Pharma and Biotech is far from being established. I will discuss the organizational positioning of our multi-capability Data Science teams, how we measure performance, and which cultural shifts are required to maximize the impact of Data Science on the drug pipeline. I will provide a number of Data Science examples highlighting the importance of this Function in bridging the gap between massive content and end-user decision making in a world of deficient Application Landscapes.

10:45 Data Sciences at Biogen Idec

Martin Leach, Ph.D., Vice President, R&D IT, Biogen Idec

Biogen Idec established an enterprise-wide Data Sciences capability in July of 2013 to enable innovative data-driven approaches at the intersection of science, medicine and economics. We will discuss lessons learned in organizational structure and governance towards this goal. We will also describe our efforts to build out an inventory of inter-related cross-domain data assets across the Company to permit data exploration and analysis.

11:15 Hiring and Growing a Data Science Team – A Broader Industry Perspective

Jake Klamka, Founder, Insight Data Science Fellows Program

As the amount of data filling their servers has grown exponentially, Silicon Valley technology companies have aggressively scaled up their data science teams to extract value from this data. Data Scientists have a distinctive skill set that is often hard to identify and hire for. The Insight Data Science Fellows Program has helped over 150 quantitative Ph.D.s transition to careers in data science at top companies like Facebook, LinkedIn, Twitter, Square, and many others. This talk will share lessons learned in selecting, hiring and growing a data science team, based on experience developing an elite data science fellowship program.

11:45 PANEL DISCUSSION: Assembly, Creation and Implementation of Data Science Groups for Pharma

Moderator: Martin Leach, Ph.D., Vice President, R&D IT, Biogen Idec

Panelists: Susie Stephens, Senior Director, Oncology & West Coast IT, Pfizer

Juergen Hammer, Ph.D., MBA, Roche Pharmaceutical Research and Early Development; Center Head, Informatics/IT; Global Head, Data Science, Roche Innovation Center New York

Jake Klamka, Founder, Insight Data Science Fellows Program

12:15 pm Session Break

12:25 Luncheon Presentation: Text Mining to Support Cancer Immunology Research

Maria Shkrob, Ph.D., Senior Bioinformatics Scientist, Biology Products Research, ELSEVIER

Recent progress in understanding the cellular and molecular mechanisms of cancer immunity has opened a new era in treatment strategies, including approaches that unleash the patient’s own immune system to attack the tumor. To help scientists benefit from previously published research and keep up with the pace of this fast-growing field, we developed a text mining pipeline to aggregate information about immune cell biology. Using this pipeline, we created a knowledgebase that enables researchers to explore the complexity of molecular interactions underlying cancer immunity.

12:55 Session Break 

1:25 Refreshment Break in the Exhibit Hall with Poster Viewing


2:00 Chairperson’s Remarks

Barry Bunin, Ph.D., CEO, Collaborative Drug Discovery (CDD), Inc.

2:10 Enabling Secure Real World Data Exchange and Collaborative Analytics across Healthcare Organizations

Patrick Loerch, Director, Health IT, Information Technology, Merck & Co.

Healthcare reform, the decline in the price of genome sequencing and growing pressures from government and payers to demonstrate the effectiveness of novel therapies are creating a new market centered on access to real world data. The increasing economic pressures of this rapidly growing market are colliding with the need to ensure the security of patient data. We have developed, proven out and executed on a novel approach to enabling the secure sharing and collaborative analysis of real world data across healthcare organizations. As a representative of one of the core consumer bases of the secondary use of real world data, we are introducing a new modality for secure information exchange and collaborative analytics that benefits and meets the security needs of diverse healthcare organizations.

2:40 Modern Drug Research Informatics Applications to CNS, Infectious, Neglected, Rare, and Commercial Disease

Barry Bunin, Ph.D., CEO, Collaborative Drug Discovery (CDD), Inc.

Layering unique collaborative capabilities upon requisite drug discovery database functionality unlocks and amplifies synergy between biologists and chemists. The application of collaborative technologies to interrogate potency, selectivity, and therapeutic windows of small molecule structure-activity relationship data will be presented in half a dozen case studies. Novel collaborative technologies in the CDD Vault platform provide an ever-increasing competitive advantage for forward-leaning, open-minded collaborators.

3:10 The New FDA Janus Clinical Trial Repository: Data Harmonization Architecture for Accelerated Regulatory Review

Kelly McVearry, Ph.D., Chief Scientist, Ekagra Technologies in collaboration with FDA

The FDA Janus Clinical Trials Repository (CTR) system, a standards-based clinical data repository for all future regulatory submissions and FDA medical reviewer analysis, is based on the Biomedical Research Integrated Domain Group (BRIDG) model and is designed to support industry standards and international initiatives for data interoperability, including inter alia the Clinical Data Interchange Standards Consortium (CDISC); Health Level 7 (HL7); the International Conference on Harmonisation adverse event classification system, the Medical Dictionary for Regulatory Activities (MedDRA); the International Health Terminology Standards Development Organization (IHTSDO) Clinical Terms (SNOMED CT); and the International Classification of Diseases (ICD). The CTR system promotes interoperability through shared semantics and includes components for frictionless receipt, validation, loading, and management of submitted Study Data Tabulation Model (CDISC SDTM) data, the CT standard selected by the FDA. Findings: FDA medical reviewers on the Janus CTR platform evaluation team were able to access harmonized datasets from multiple sources in a common platform with shared semantics, reporting decreased time-to-analysis (from nine months to four hours for one therapeutic area) and improved data quality. Additionally, medical reviewers report the platform's ability to enable rapid meta-analysis, with seamless access to across-study datasets for the first time in FDA history.

3:40 ScienceCloud: Collaborative Workflows in Biologics Research and Development

Ton van Daelen, Ph.D., ScienceCloud Product Director, BIOVIA

Matt Hahn, Ph.D., CTO, BIOVIA

The life sciences industry has undergone dramatic changes and effective global collaboration has become a key success factor in this new age. BIOVIA is providing a hosted and comprehensive solution stack for externalized, collaborative research for pharma/biotech and CROs to address these new challenges. Recently we added support for biologics data management and IP capture. In this talk we will present collaborative and comprehensive capabilities in antibody characterization and development.

3:55 Increasing the Speed and Efficiency of Biomarker Information Analysis and Planning

Adam Carroll, Ph.D., CSO, Amplion Inc.

Information overload is a persistent problem for professionals interested in molecular biomarkers and is exacerbated by "single-use information culture" within organizations. Please join us to hear how Amplion Inc. is solving these problems with BiomarkerBase and biomarker planning products in development.

4:10 Mardi Gras Celebration in the Exhibit Hall with Poster Viewing

5:00 Breakout Discussions in the Exhibit Hall

This interactive session provides attendees an opportunity to choose a specific discussion group to join. Each group has a moderator to ensure focused discussions around key issues within the topic. This format allows participants to meet potential collaborators, share examples from their work, vet ideas with peers, and be part of a group problem-solving endeavor. The discussions provide an informal exchange of ideas and are not meant to be a corporate or specific product discussion.

Informatics in Support of Collaboration and Externalization

Arturo J. Morales, Ph.D., Vice President, Informatics, Beryllium Corp.

  • Advances in technology providing novel methods to work collaboratively across partners
  • Creating public & private collaborations & partnerships
  • Virtualized Research Ecosystems

Genomic Data-Sharing in a Proprietary World

Robert M. Kuhn, Ph.D., Associate Director, Genome Browser, Center for Biomolecular Science & Engineering, University of California Santa Cruz

  • What are benefits/hazards of sharing data?
  • What barriers/incentives keep us from sharing data?
  • How might we break down barriers to data-sharing?

Real World Data

Sebastien Lefebvre, Director, R&D IT Platform, Biogen Idec

  • Access to real world data
  • Secure real world data exchange & collaborative analytics

6:00 Close of Day

Wednesday, February 18

7:00 am Breakfast Presentation (Sponsorship Opportunity Available) or Morning Coffee

8:00 Plenary Session Panel 

9:45 Refreshment Break and Poster Competition Winner Announced in the Exhibit Hall


10:35 Chairperson’s Remarks

Robert M. Kuhn, Ph.D., Associate Director, Genome Browser, Center for Biomolecular Science & Engineering, University of California Santa Cruz

10:45 Creating a Truly Innovative Holistic System that Captures and Channels Insights out to the Right People

Sebastien Lefebvre, Director, R&D IT Platform, Biogen Idec

This talk describes a combination of technologies to harness data with 21st-century social media concepts to channel the information to the right people. Gain insight into the recent launches of master data management, an information sharing portal and next generation search techniques.

11:15 Combining Machine & Human Intelligence to Successfully Integrate Biomedical Data

Timothy Danford, CDISC Solution Lead, Tamr

One of the biggest data challenges in the biomedical sciences is data variety. Diverse data sources, representing information from different assays and against different experimental targets, are often seen as highly valuable. However, integrating these sources in order to support unified analysis and insight is made difficult by the huge volume of data to be analyzed, the velocity with which the data is produced, and the wide array of formats and data types in which the data is recorded and stored. Modern biomedical research organizations are also realizing that higher experimental costs and more competitive research landscapes require better data integration and informatics practices. It is no longer seen as acceptable for experimental data to be used by only one research project or one group -- integrating and sharing data is a mandate across many research organizations. Integration methods that involve manual curation can produce high quality results, but can be difficult to scale and tend to be extremely expensive. To address these challenges, Tamr has been working with researchers from MIT and many leading healthcare companies to develop a hybrid solution which combines both machine learning and human guidance to significantly reduce the time and effort required for data curation, while achieving the level of precision that data scientists expect. The results of these approaches, both social and technical, will be discussed.

11:45 A Global Approach to Genomic Data Using the UCSC Genome Browser

Robert M. Kuhn, Ph.D., Associate Director, Genome Browser, Center for Biomolecular Science & Engineering, University of California Santa Cruz

A large number of datasets exist to assist in the interpretation of genomic data. Variants are being accumulated in many different databases, and many, often competing, efforts have been made to aggregate the data in super databases, but sharing between them is limited. The UCSC Genome Browser is a public resource that operates under a clearinghouse paradigm, pulling in as many useful databases as possible and allowing a researcher to visualize data from any or all of them at once, along with private data, in a consistent display framework.

12:15 pm Session Break

12:25 Luncheon Presentation: LiveDesign - Schrödinger's Next Generation Platform for Collaborative Drug Design

Mark Brewer, Ph.D., Research Leader, Drug Discovery Applications Group, Schrödinger

Medicinal chemists can benefit enormously when collaborating with their computational team members but, in practice, the collaborative process can be challenging and time-consuming. LiveDesign addresses these challenges head-on by providing a platform for enabling real-time collaboration and design by all members of a drug discovery team. In this presentation, we discuss the LiveDesign platform and how it is integrated into existing pharma workflows and infrastructure in order to accelerate small-molecule design processes.

1:00 Refreshment Break in the Exhibit Hall and Last Chance for Poster Viewing


1:40 Chairperson’s Remarks

Ajay Shah, Ph.D., MBA, PMP, Director, Research Informatics & Systems, City of Hope National Medical Center

1:50 Finding Cohorts for Clinical Trials – An Integrated Informatics Approach

Ajay Shah, Ph.D., MBA, PMP, Director, Research Informatics & Systems, City of Hope National Medical Center

Integrating discovery, clinical and translational research informatics systems and data can help solve one of the key challenges in clinical trials: finding cohorts. SPIRIT (Software Platform for Integrated Research Information and Transformation) is used to encode computable eligibility criteria, identify cohorts from the EMR system via i2b2, and perform cohort analytics and visualization.

2:20 Building the High Performance Genomics Big Data Platform to Support Drug Discovery & Translational Medicine

Monica Wang, Ph.D., Lead Software Engineer, Project and Program Manager, Research Systems, Takeda

With the great advance of sequencing technology, vast amounts of genomics data are being generated every day. The ultimate challenge will be to best utilize the data and extract knowledge out of it to advance science and medicine. We will share our experience building the enterprise Genomics Big Data Platform with a focus on high performance to support our internal research effort for drug discovery and translational medicine.

2:50 Hackathons: Feed Innovation, Creativity, and Promote Thinking Outside of the Box

Kristen Cleveland, PMP, Senior IT Project Manager, R&D IT, Biogen Idec

Explore what a Hackathon is, how to plan it, and how to get the best out of the event for your organization.

3:20 Drug Repositioning in the Era of Precision Medicine

Chris Willis, Ph.D., Manager, Discovery Solutions Scientists, IP & Science, Thomson Reuters

Craig Webb, Ph.D., CSO, NuMedii, Inc.

3:50 Refreshment Break

4:00 Chairperson’s Remarks

Chris Willis, Ph.D., Manager, Discovery Solution Scientists, IP & Science, Thomson Reuters


4:10 KEYNOTE PRESENTATION: Global Exchange of Human Genetic Data for Medicine and Research

David Haussler, Ph.D., Distinguished Professor and Scientific Director, UC Santa Cruz Genomics Institute, University of California Santa Cruz

Every human disease is a rare disease at the molecular level. No single institute has enough patients to understand any particular molecular subtype. For genomics to benefit medicine and science, we must share data. This presentation outlines the data standards and Application Programming Interfaces developed by the Global Alliance for Genomics and Health that are intended to address this issue, and highlights a few global genomics projects that use them.

4:40 Data Linking and Warehousing to Support Evaluation of Pathogenicity of Genes and Genetic Variants by the Clinical Genome Resource Project

Xin Feng, Ph.D., Assistant Professor, Bioinformatics Research Lab and Department of Molecular and Human Genetics, Baylor College of Medicine

The Clinical Genome Resource (ClinGen) is an NIH-funded program dedicated to creating a database of clinically relevant genomic variants to inform genome interpretation in a variety of clinical contexts. A core component of ClinGen is ClinGenDB, an integration point for data about variants that supports their computational and manual evaluation by experts. The variant data is integrated from clinical and research databases, including several genomics initiatives. Data Warehousing is the traditional approach to data integration that brings all the relevant data physically together. Data Linking is a new approach to data integration that uses new web standards such as JSON-LD, RDF, and Linked Data Platform 1.0 to integrate data across distinct physical locations. In this presentation, we compare the two approaches by going through a number of use cases of data integration in ClinGenDB for the purpose of evaluating pathogenicity of genetic variants. 

5:10 XPRIZE: Transforming Science Fiction into Science Reality through Incentivized Competition

Grant Campany, Senior Director, XPRIZE

Imagine a portable, wireless device in the palm of your hand that monitors and diagnoses your health conditions. That’s the technology envisioned by the $10 million Qualcomm Tricorder XPRIZE competition, and it will allow unprecedented access to personal health metrics. The end result: Radical innovation in healthcare that will give individuals far greater choices in when, where, and how they receive care.

5:40 Close of Conference Program

Premier Sponsors: Jackson Laboratory, Precision for Medicine, Silicon Biosystems, Thomson Reuters