Cambridge Healthtech Institute’s Fifth Annual

Bioinformatics for Big Data

Converting Data into Actionable Knowledge

March 7 – 9, 2016 | Moscone North Convention Center | San Francisco, CA
Part of the 23rd International Molecular Medicine Tri-Conference

 

Bioinformatics continues to face the challenges of integrating molecular biological information with processes of quality patient care and accelerating the discovery of useful new therapies. These challenges are driven by the growing volume of biomedical research that must be performed with large-scale data. Scientists are able to study entire systems of data in parallel using a variety of tools and methods, but tremendous computational resources are required to store, compute, analyze, and share that data. Through lectures and panel discussions, the Fifth Annual Bioinformatics for Big Data program will assemble thought leaders who will present solutions to some of these challenges. You’ll hear the latest developments in bioinformatics database platforms and tools that provide proper scalable archive strategies, optimize computing capacities, simplify data and NGS workflows, enable flexible and open analysis and interpretation, support better collaboration models, and shorten time to science. You’ll also hear best-practice case studies of taking data from multiple -omics sources and aligning it with clinical action. Turning big data into smart data can lead to real-time assistance in disease prevention, prognosis, diagnostics, and therapeutics.

Final Agenda

Monday, March 7

10:30 am Conference Program Registration


GENOMIC AND PRECISION MEDICINE:
EVOLVING SCIENCE, MODELS, AND TOOLS

11:50 Chairperson’s Opening Remarks

Andreas Kogelnik, M.D., Ph.D., Founder and Director, Open Medicine Institute

12:00 pm Genomics Based Medicines for Masses - Problems and Promises

Andreas Kogelnik, M.D., Ph.D., Founder and Director, Open Medicine Institute

Smart data is big data made actionable in real time. It’s about the actions you take in response to data, not merely collecting the data. Until now, the trend has been to integrate data from multiple sources (instruments, clinical, biochemical, epidemiological, molecular, etc.) and then use data mining tools to analyze trends. Turning this data from big to smart can lead to real-time assistance in disease prevention, prognosis, diagnostics, and therapeutics.

12:30 Big Data in Cancer Research and Precision Oncology

Anthony R. Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics & Information Technology, National Cancer Institute

The advancement of cancer research and the promise of precision medicine in oncology can be accelerated by broad access to the growing body of cancer data, sustainable tools, and high-performance computing resources. This presentation will focus on the NCI Genomic Data Commons and Cancer Genomics Cloud Pilots as models for democratizing access to data from The Cancer Genome Atlas (TCGA), other cancer research data, and precision medicine clinical trials. The rationale for the pilots will be presented along with an overview of the three different approaches being taken and the context for a future cancer knowledge commons.

1:00 Session Break

1:15 Luncheon Presentation I: Biomarkers, Brain Regions, and Data Reproducibility

Chris Cheadle, Ph.D., Director, Research, Biology Products, Elsevier R&D Solutions

Comprehensive data mining of the scientific literature has become an increasing challenge, in particular with regard to disease biology and progression. Elsevier uses natural language processing (NLP) to create very large, structured, and constantly expanding literature knowledgebases. With the addition of highly sophisticated visualization tools, users can interactively explore the vast number of connections created to help unravel disease biology. The utility of this approach will be demonstrated in research on neuropsychiatric diseases by: 1. finding common and unique biomarker elements, 2. identifying specific enrichment patterns, and 3. detecting the most reproducible biomarker findings to support biomarker discovery. In addition, an innovative new taxonomy based on brain region identifications will be presented. Together, these innovations can be applied to rapidly increase the knowledge of diseases based on published findings.


1:45 Luncheon Presentation II: Assessing and Improving Accuracy of Next-Generation Sequencing Informatics

Hugo Lam, Senior Director, Bioinformatics, Research & Development, Bina Technologies

Advancements in NGS technologies have produced a massive number of short-read sequences, making secondary analysis a challenging big data problem. In this seminar, we will talk about the current approaches at Bina to assessing and improving the accuracy of NGS algorithms, with research ranging from genomics to cancer genomics and transcriptomics.

2:15 Session Break

2:30 Chairperson’s Remarks

Andreas Kogelnik, M.D., Ph.D., Founder and Director, Open Medicine Institute

2:40 Issues Surrounding Genomically-Guided Individualized Cancer Clinical Trials

Nicholas J. Schork, Ph.D., Professor and Director, Human Biology, J. Craig Venter Institute

Recent successes in designing therapies or interventions tailored to a patient’s possibly unique genetic and biochemical profile have raised questions about the broader applicability and adoption of personalized medicine. It is arguable that until one can show unequivocally that the use of personalized medicine protocols will provide better outcomes than standard “one size fits all” approaches to medicine, personalized medicine will be little more than of academic interest. In this talk strategies for, and issues surrounding, vetting personalized medicines are discussed. Many example studies are provided. These strategies include: vetting algorithms for matching approved drugs to patient profiles; N-of-1 and aggregated N-of-1 studies; leveraging personalized thresholds for patient monitoring; and post-marketing, real-time clinical surveillance studies. Ultimately, study designs that focus on the well-being of the patients participating in research protocols are not only consistent with the motivations for personalized medicine but could also motivate patient participation and acceptance, as well as acceptance in the research and health care provider communities.

3:10 Big -Omics Data Coupled with Health Coaching to Optimize Wellness and Minimize Disease

Nathan D. Price, Ph.D., Professor & Associate Director, Institute for Systems Biology

Future medicine will be more proactive and data-rich than anything previously possible, and will focus on maintaining and enhancing wellness more than just reacting to disease. We have launched a large-scale 100K person wellness project that integrates genomics, proteomics, transcriptomics, microbiomes, clinical chemistries and wearable devices of the quantified self to monitor wellness and disease. I present results from our proof-of-concept pilot study in a set of 107 individuals (the Pioneer 100 study) over the past year, showing how the interpretation of this data led to actionable findings for individuals to improve health and reduce risk drivers of disease.

3:40 Managing and Analyzing Big Biomedical Data with Globus

Kyle Chard, Ph.D., Senior Researcher and Fellow, Computation Institute, University of Chicago and Argonne National Laboratory

Globus provides software-as-a-service (SaaS) for research data management, including data transfer, synchronization, sharing and publication. Unlike other SaaS providers, Globus provides these capabilities directly from users’ computers, without the need to replicate data in the cloud. Here I describe Globus and discuss how it can be used to manage and analyze big biomedical data.

4:10 Solving the File Exchange Problem for Bioinformatics

Jay Migliaccio, Director, Cloud Platforms & Services, Cloud-On-Demand, Aspera, an IBM Company

As new research techniques create gigabytes of data, the need to ingest and exchange digital files quickly, easily, securely, and with the cloud’s scale-up capacity is critical. A new SaaS platform allows any organization to establish a branded web-based presence for fast, easy and secure exchange and delivery of any size data between separate organizations.

4:40 Refreshment Break and Transition to Plenary Session

5:00 Plenary Session

6:00 Grand Opening Reception in the Exhibit Hall with Poster Viewing

7:30 Close of Day

Tuesday, March 8

7:00 am Registration and Morning Coffee

8:00 Plenary Session

9:00 Refreshment Break in the Exhibit Hall with Poster Viewing


OPTIMIZING HADOOP TO PROCESS BIG DATA

10:05 Chairperson’s Remarks

Martin Gollery, CEO, Tahoe Informatics

10:15 Optimizing AWS Hadoop for Bioinformatics: A Case Study

Zhong Wang, Ph.D., Computational Biologist & Genome Analysis Group Lead, Lawrence Berkeley National Lab & DOE Joint Genome Institute; Adjunct Associate Professor, University of California at Merced

Apache Hadoop-based bioinformatics solutions have recently been developed to tackle the challenge of analyzing rapidly growing next-generation sequencing (NGS) data. Among them, BioPIG is a toolkit based on Hadoop and PIG that enables easy parallel programming and scaling to datasets of terabyte sizes. However, BioPIG has not been optimized for performance. When running on Amazon Web Services (AWS), the baseline performance may lead to high computational costs. In this study we aim to optimize Hadoop parameters to improve the performance of BioPIG on AWS. We chose k-mer analysis as an example because it is an essential part of a large number of NGS data analysis tools. We tuned five Hadoop parameters on a customized Hadoop cluster. We found that each parameter tuning experiment led to a different degree of performance improvement, and the overall job execution time was reduced by 50% with an optimized parameter setting. We believe this tuning practice provides a valuable reference for other similar applications that generate large volumes of intermediate data.
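
For readers unfamiliar with the k-mer analysis named in this abstract, the single-machine sketch below counts k-mers in a few short reads. It is illustrative only and is not the BioPIG implementation; the function name count_kmers and the sample reads are assumptions made for this example. In a Hadoop setting the same work is usually split into a map step that emits k-mers and a reduce step that sums them, which is one place where large volumes of intermediate data can arise.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a collection of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k along the read; reads shorter than k add nothing.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" in kmer:
                # Skip windows that contain an ambiguous base call.
                continue
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy reads, not real sequencing data.
    reads = ["ACGTAC", "CGTACG"]
    for kmer, n in sorted(count_kmers(reads, 3).items()):
        print(kmer, n)
```

The number of distinct k-mers grows quickly with k and with dataset size, which is why this kind of job is a natural candidate for distributed execution and parameter tuning.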


EXTRACTING KNOWLEDGE FROM GENE EXPRESSION PROFILES

10:45 Prediction of Protein Structure, Dynamics and Function on the Genomic Scale

Andrzej Kloczkowski, Ph.D., Professor, Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children’s Hospital and Department of Pediatrics, The Ohio State University College of Medicine

Recent progress in mass-scale sequencing projects has produced enormous numbers of protein sequences for which crystallographic structures have not yet been determined. Additionally, despite the huge investments in high-throughput protein crystallography and the important efforts of Protein Structure Initiative (PSI) Centers, the gap between the number of experimentally solved protein structures and the number of known sequences continues to widen. Knowledge of protein structure is critical for comprehending protein function, for understanding the molecular mechanisms of disease, and for developing new generations of medicines based on computer-aided drug design. Because of this, there is an urgent need to improve the existing computational methods of structure prediction so that they ultimately reach an accuracy comparable to the resolution of crystallographic or NMR structure determination. Another extremely important aspect of improved structure prediction is the computational design of completely new proteins with desired properties that have not yet been created naturally by evolution. Computational protein structure prediction and design usually lead not to a single model but to many alternative models corresponding to local, nonnative energy minima, so it becomes critical to develop potentials, scoring functions, and model quality assessment and refinement programs that can identify the structural model closest to the native state and successfully refine it. Knowledge of a protein’s structure is the first step in determining its biological function. In the last 15 years it has been shown that protein structure determines protein dynamics, and that knowledge of protein flexibility and fluctuational dynamics is critical for determining protein function. The theoretical methods of normal mode analysis and elastic network models of biomolecules will be presented. We discuss all these important problems and propose new methods for genome-wide protein structure and function prediction that have recently been blind-tested with great success in Critical Assessment of protein Structure Prediction (CASP) experiments.

11:15 L1000CDS2: LINCS L1000 Characteristic Direction Signature Search Engine Predicts Kenpaullone as a Potential Therapeutic for Ebola

Avi Ma’ayan, Ph.D., Professor, Department of Pharmacology and Systems Therapeutics, Icahn School of Medicine at Mount Sinai

The Library of Integrated Network-based Cellular Signatures (LINCS) program aims to systematically profile the molecular and phenotypical outcomes of agent-perturbed human cells. The variety of agents includes chemical compounds, different micro-environments, endogenous ligands, single gene knockdowns and overexpressions. The LINCS L1000 dataset comprises over a million gene expression profiles of chemically or genetically perturbed human cell lines. However, maximally extracting knowledge from such a large dataset for further analysis is challenging. We show that processing the L1000 data with the Characteristic Direction method significantly improves signature mappings through several benchmarking pipelines. This processed dataset is served through a state-of-the-art signature search engine called L1000CDS². To demonstrate the utility of L1000CDS² we collected expression signatures from human cells infected with Ebola virus at 30, 60 and 120 minutes after infection. Querying these signatures with L1000CDS² we identified kenpaullone, a GSK3B/CDK2 inhibitor that we show, in subsequent experiments, has a dose-dependent efficacy in inhibiting Ebola infection in vitro without causing toxicity. L1000CDS² was also applied to prioritize small molecules that are predicted to reverse expression in 670 disease signatures extracted from the Gene Expression Omnibus (GEO).
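
The idea of searching for perturbations that reverse a disease signature can be sketched in a few lines. The example below assumes signatures are plain numeric vectors over the same ordered gene list and compares them with cosine similarity; the abstract does not state the scoring used by L1000CDS², so this is an assumption, and the function names, compound names, and values are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length expression vectors."""
    if len(a) != len(b):
        raise ValueError("signatures must cover the same genes in the same order")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero signature carries no direction
    return dot / (norm_a * norm_b)


def rank_reversers(disease_signature, perturbation_signatures):
    """Sort perturbations so the most strongly opposing signature comes first."""
    scored = [
        (name, cosine_similarity(disease_signature, signature))
        for name, signature in perturbation_signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical toy data: three genes and three made-up perturbations.
    disease = [1.0, -1.0, 0.5]
    library = {
        "compound_A": [-1.0, 1.0, -0.5],  # points in the opposite direction
        "compound_B": [1.0, -1.0, 0.5],   # points in the same direction
        "compound_C": [0.0, 0.0, 1.0],
    }
    for name, score in rank_reversers(disease, library):
        print(f"{name}\t{score:.3f}")
```

A score near -1 marks a perturbation whose expression changes oppose the query, which is the sense of "reverse" used here; a score near +1 marks one that mimics it.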

11:45 Selected Poster Presentation: Gene Expression Models of BRAF Inhibitor Resistance in Melanoma Predict Drug Repurposing Candidates for Novel Combination Therapies

Kelly Regan, Graduate Student, Department of Biomedical Informatics, Ohio State University

Melanoma is the most deadly form of skin cancer, accounting for nearly 10,000 deaths in the United States in 2014. Approximately 50% of melanomas possess the activating BRAF V600E mutation that can be targeted with BRAF inhibitors; however, drug resistance to these therapies remains a significant challenge. Improved drug combinations are needed for melanoma patients to curb drug resistance and improve response rates to existing therapies. Traditional approaches for drug discovery are costly and inefficient, and thus even more prohibitive to the approval of effective combination therapy. Estimated success rates for large-scale experimental drug screens are low at 4-10%, and an average of 1 billion U.S. dollars and 15-20 years is required to bring a new drug from the bench to the bedside. Thus, there is an urgent need to discover drug combinations with improved durability in a cost-effective manner. The promise of drug repurposing is that existing FDA-approved therapies may be used for diseases other than the original indication, thus circumventing the lengthy approval process for such drugs with safe toxicity profiles. Bioinformatics-based drug repurposing involves systematically generating hypotheses for novel indications for existing drugs, and includes connectivity mapping methods. Connectivity mapping takes advantage of the observation that when human cells are exposed to drugs or are undergoing a disease process, different parts of the genome change their activity in how certain genes are expressed. One resource for connectivity mapping is the Library of Integrated Network-based Cellular Signatures (LINCS) database. LINCS is comprised of 476,251 gene expression profiles characterizing genetic and drug perturbation experiments across 77 cellular contexts. These gene activity patterns are then analyzed using computational algorithms to match drugs and genetic perturbations that can "reverse" the genomic signals produced by the disease.

The goals of this research are to i) integrate associative and causal genomic features of BRAF inhibitor response in melanoma in a network model for connectivity mapping analyses and ii) prioritize and validate drug repurposing hypotheses predicted to reverse BRAF inhibitor drug resistance using in vitro and in vivo models of melanoma. We analyzed publicly available gene expression profiles from tumors of melanoma patients before BRAF inhibitor treatment (n=28) and after exhibiting resistance (n=31) via RECIST standards in order to define a gene signature characteristic of BRAF inhibitor resistance. We used this gene signature for a connectivity mapping analysis using gene expression profiles associated with genetic perturbations in the A375 BRAF-mutant cell line contained within the LINCS transcriptomics database. We employed the Kolmogorov-Smirnov (KS) statistic to calculate connectivity scores for individual genetic perturbation hypotheses, and selected the set of knocked-down or over-expressed genes whose resultant gene expression profiles were positively correlated with the patient-derived signature of BRAF inhibitor resistance (KS > 0.50). We then prioritized those genes that were shown to confer resistance to BRAF inhibitors in A375 melanoma cells in several large-scale functional screens, including CRISPR-Cas9 and pooled RNAi gene knock-down libraries and a lentiviral gene over-expression library. We found a total of 13 knocked-down and 22 over-expressed genes that were positively correlated with genomic mechanisms of BRAF inhibitor resistance. We conducted network analysis to identify highly connected modules among the 35 gene candidates via the connectivity scores of their respective genetic perturbation LINCS gene expression profiles in the A375 melanoma cell line.

Among the over-expressed genes was BTK, which we observed to be significantly over-expressed in patients' resistant tumors as compared to pre-treated tumors and highly connected within the gene network. We hypothesized that ibrutinib, a BTK inhibitor, could reverse BRAF inhibitor resistance in melanoma cells. In an initial validation study, we observed that the combination of ibrutinib and the BRAF inhibitor, vemurafenib, decreased cell viability of vemurafenib-resistant A375 melanoma cells more than vemurafenib alone as determined by an MDS assay. In conclusion, we provide proof-of-concept evidence for integration of associative and causal genomic features of drug response using a model of BRAF inhibitor resistance in melanoma. Furthermore, we provide preliminary in vitro validation for a combination of ibrutinib and vemurafenib in melanoma in order to sensitize resistant melanoma cells to vemurafenib.

Co-authors: Andrew R. Stiff[2]; William E. Carson[1,2,3]; Philip R. O. Payne

[1] Ohio State University, Department of Biomedical Informatics, 1800 Cannon Drive, 250 Lincoln Tower, Columbus, OH 43210 USA; [2] Ohio State University, Arthur G. James Comprehensive Cancer Center and Solove Research Institute, 300 West 10th Avenue, Columbus, OH 43210 USA; [3] Ohio State University, Department of Surgery, N924 Doan Hall, 410 West 10th Avenue, Columbus, OH 43210 USA
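
The Kolmogorov-Smirnov connectivity score mentioned in the abstract above can be illustrated with a small sketch. The code below follows the signed KS-style enrichment score commonly used in connectivity mapping; the abstract does not give the exact formula the authors used, so treat this as an assumption. The function name ks_score and the gene labels are hypothetical.

```python
def ks_score(ranked_genes, gene_set):
    """Signed KS-style enrichment of gene_set within ranked_genes.

    Positive values mean the set is concentrated near the top of the ranking;
    negative values mean it is concentrated near the bottom.
    """
    n = len(ranked_genes)
    # 1-based positions of the set members found in the ranking, ascending.
    positions = [i + 1 for i, gene in enumerate(ranked_genes) if gene in gene_set]
    t = len(positions)
    if t == 0:
        return 0.0
    # a: largest amount by which the set runs ahead of a uniform spread.
    a = max((j + 1) / t - pos / n for j, pos in enumerate(positions))
    # b: largest amount by which the set lags behind a uniform spread.
    b = max(pos / n - j / t for j, pos in enumerate(positions))
    return a if a > b else -b


if __name__ == "__main__":
    # Hypothetical ranking of ten genes, most up-regulated first.
    ranking = [f"g{i}" for i in range(1, 11)]
    print(ks_score(ranking, {"g1", "g2"}))   # set sits at the top of the ranking
    print(ks_score(ranking, {"g9", "g10"}))  # set sits at the bottom of the ranking
```

With a score of this kind, a cutoff such as the KS > 0.50 quoted in the abstract keeps only perturbations whose profiles place the resistance signature genes strongly toward one end of the ranking.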

12:15 pm Session Break

12:25 Luncheon Presentation I: Systems Biology Approach to OMICs Data Analysis in Application to Patient Stratification

Alexander Ivliev, Ph.D., Senior Research Scientist, Thomson Reuters

12:55 Luncheon Presentation II: A High-Performance Analytics Ecosystem for Translational Research

Jane Yu, M.D., Ph.D., Worldwide Industry Architect for Healthcare & Life Sciences, IBM

Research scientists are required to access, analyze, share, and store massive volumes of complex, often unstructured, biomedical data. Learn how IBM’s high-performance analytics ecosystem for translational research, including IBM Watson and IBM Power Systems with OpenPOWER, can accelerate discovery of treatments tailored to unique patient molecular profiles.

1:25 Refreshment Break in the Exhibit Hall with Poster Viewing


VISUALIZATION AND DATA MAPPING TOOLS REDEFINED

2:00 Chairperson’s Remarks

Martin Gollery, CEO, Tahoe Informatics

2:10 NASFinder: Defining a Network Activity Score

Corrado Priami, Ph.D., Computer Science, The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI)

Network analysis is a well-recognized tool of modern biology and has proven to be a powerful aid in representing and mastering complexity. Sub-networks that are enriched with experimentally produced omics data can help explain properties of the underlying biological processes. We propose a novel method (NASFinder) to identify tissue-specific sub-networks connecting an omics-determined module and the main regulator of this module selected among the molecules with a specific role (e.g., receptors, transcription factors, etc.). Quantification of information flow on the network topology is used to associate nodes with an activity level in information transmission and ultimately to determine the sub-network activity score. Finally, the sub-network is functionally annotated to discover its main biological function. The new NASFinder method has been implemented in a web-based, freely available resource associated with novel, easy-to-read visualization of omics data sets and network modules. We illustrate an application of the method to transcriptional data sets comprising six time points (0, 6, 48, 96, 192, 384 hours) during the differentiation of SGBS pre-adipocyte cells in vitro. We present two different analysis strategies: time-point analysis by comparing each time point against the control (0h) and time-lapse analysis by comparing each time point with the previous one. NASFinder identified the coordinate production of seemingly unrelated processes between each comparison, providing the first systems view of adipogenesis in culture.
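
The two comparison strategies described in this abstract differ only in which pairs of time points are contrasted. The short sketch below enumerates those pairs for the six time points listed; the function names are assumptions made for this example and are not part of NASFinder.

```python
def time_point_pairs(time_points):
    """Pair every later time point with the first one, treated as the control."""
    control = time_points[0]
    return [(control, t) for t in time_points[1:]]


def time_lapse_pairs(time_points):
    """Pair every time point with the one immediately before it."""
    return list(zip(time_points, time_points[1:]))


# Sampling times in hours, as listed in the abstract.
hours = [0, 6, 48, 96, 192, 384]

if __name__ == "__main__":
    print("time-point:", time_point_pairs(hours))
    print("time-lapse:", time_lapse_pairs(hours))
```

Both strategies yield five comparisons for six time points; the time-point design measures cumulative change from baseline, while the time-lapse design isolates what changed within each interval.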

2:40 Genomics for Every Biologist NOW: Introducing the Pantheon of Next-Generation NCBI BLAST Resources

Ben Busby, Ph.D., Genomics Outreach Coordinator, NCBI, NIH

Users of NCBI resources come from varied backgrounds, both biologically and computationally. Therefore, NCBI has developed a range of BLAST tools to address these expanding use cases. SmartBLAST allows users to taxonomically define similar proteins with the click of a button. For those interested in genomics, SRA BLAST allows access to SRA with no knowledge of genomic mapping or command line interfaces -- although there is a command line interface for larger jobs, and we are developing a simple user interface for our BLAST based RNAseq mapper. Finally, for those interested in metagenomics, we have developed moleBLAST, a pushbutton tool that defines operational taxonomic units (OTUs).

3:10 Custom Visualizations to Support Scientific Decision Making

Christian Blumenroehr, Ph.D., Senior Scientist, Roche Innovation Center Basel, F. Hoffmann-La Roche

Data analysis is often a visual process. Especially in times of Big Data, how you visualize your data is very important to be able to draw the right conclusions. Learn new ideas on how to leverage modern HTML5-, JS-, and CSS-based visualizations in combination with a data analysis tool.

3:40 Data Science Driven Pharma R&D Decisions

Timothy Hoctor, Vice President Professional Services, Life Sciences, R&D Solutions, Elsevier

The focus on analysis of large databases, ‘big data’, continues to increase as the collections of scientific observations accumulate. Elsevier has collected tens of millions of facts from scientific literature in the form of semantic triples. In biology the facts are triples like “A upregulates/downregulates/causes B” where A and B can be compounds, diseases, drugs, or other entity types. The relationship is also qualified by species, tissues, and other variables. In chemistry, relations such as “compound C inhibits target A” are also qualified by variables such as potency, assay type, species, and variant. The number of possible combinations of these facts increases exponentially with the number of facts combined. By combining these observations in biology and chemistry we can explore questions such as “based on the known targets drug A inhibits, what other diseases might it treat, based on disease pathways reported for all other diseases” and “given the pathways reported to cause a disease, and compounds known to inhibit those pathways, what known compounds/structure scaffolds could be tested to treat the disease”. We will present examples of using data frameworks that combine Elsevier and open source pathway and biological activity databases to explore these questions with the broadest possible databases.
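
To make the idea of chaining semantic triples concrete, the sketch below stores a handful of made-up facts as (subject, relation, object) tuples and follows them from a drug to the diseases it might be relevant to. Every entity name, the "participates_in" relation, and the function names are hypothetical; this is not Elsevier's data model or API.

```python
from collections import defaultdict

# Hypothetical facts, written as (subject, relation, object) triples.
TRIPLES = [
    ("drug_A", "inhibits", "target_1"),
    ("drug_B", "inhibits", "target_2"),
    ("target_1", "participates_in", "pathway_X"),
    ("target_2", "participates_in", "pathway_Y"),
    ("pathway_X", "causes", "disease_P"),
    ("pathway_X", "causes", "disease_Q"),
    ("pathway_Y", "causes", "disease_R"),
]


def build_index(triples):
    """Index objects by (subject, relation) so each hop is a dictionary lookup."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
    return index


def candidate_diseases(drug, index):
    """Follow drug -> target -> pathway -> disease and collect the diseases reached."""
    diseases = set()
    for target in index.get((drug, "inhibits"), ()):
        for pathway in index.get((target, "participates_in"), ()):
            diseases |= index.get((pathway, "causes"), set())
    return diseases


if __name__ == "__main__":
    index = build_index(TRIPLES)
    print(sorted(candidate_diseases("drug_A", index)))
```

Each additional hop multiplies the number of paths to consider, which is one way to see why the number of combinations grows so quickly as more facts are chained together. A real system would also carry the qualifiers mentioned above (species, tissue, potency, assay type) on each triple and filter on them.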

4:10 St. Patrick’s Day Celebration in the Exhibit Hall with Poster Viewing

5:00 Breakout Discussions in the Exhibit Hall

These interactive discussion groups are open to all attendees, speakers, sponsors, & exhibitors. Participants choose a specific breakout discussion group to join. Each group has a moderator to ensure focused discussions around key issues within the topic. This format allows participants to meet potential collaborators, share examples from their work, vet ideas with peers, and be part of a group problem-solving endeavor. The discussions provide an informal exchange of ideas and are not meant to be a corporate or specific product discussion.

Educating End Users about Cloud Computing on Data in Existing Databases

Ben Busby, Ph.D., Genomics Outreach Coordinator, NCBI

  • Defining what cloud computing means to/for those users
  • Data Streaming
  • Data Literacy
  • Metadata considerations
  • Tool availability and compatibility with datasets

Wednesday, March 9

7:00 am Breakfast Presentation (Sponsorship Opportunity Available) or Morning Coffee

8:00 Plenary Session Panel

10:00 Refreshment Break and Poster Competition Winner Announced in the Exhibit Hall


CONVERGENCE OF LARGE POPULATION & PERSONAL DATA
FOR PATIENT CARE, CLINICAL TRIALS & R&D

10:50 Chairperson’s Remarks

Bonnie Feldman, D.D.S., MBA, Digital Health Analyst and Chief Growth Officer, DrBonnie360

11:00 PANEL DISCUSSION: The Collaboratory at Work in Multiple Sclerosis and Beyond

Marcia Kean, Chairman, Strategic Initiatives, Feinstein Kean Healthcare

Kenneth Buetow, Ph.D., Director, Computational Sciences and Informatics, Complex Adaptive Systems Initiative (CASI), Arizona State University

Robert McBurney, Ph.D., CEO, Accelerated Cure Project for MS

PCORnet, the national research network, is catalyzing collaborations across academe, government, industry and advocacy organizations to change the research enterprise. iConquerMS™, a Patient-Powered Research Network that was recently awarded Phase II funding, has collected patient-generated health data and a portfolio of emerging collaborations, allowing Big Data analysis by ASU’s Next Generation Cyber Capability of high performance hardware, software, and people. Resulting insights will change clinical practice and accelerate research. The audience will hear lessons learned by the iConquerMS™ team, including technical and cultural challenges, patient data collection methods, research collaboration strategy, tools for Big Data integration/analysis and transformation into knowledge, and potential use of this initiative as a model.

12:00 pm Innovation from the Clinical Laboratory – The New Role of -Omics-Based Testing & Decision Support

Andreas Matern, Vice President, Commercial Partnerships and Innovation, BioReference Laboratories, Inc.

In this talk we’ll discuss how clinical laboratories are changing beyond the simple “send sample, give us results” paradigm to an information-driven ecosystem, working in close partnership with providers, pharmaceutical companies, and hospitals to leverage the knowledge they have accumulated to drive new medical discoveries and improve patient care and outcomes. The emphasis will be on bioinformatics, the combination of large data sets, and building systems that work across multiple end users and groups.

12:30 Session Break

12:40 Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:10 Refreshment Break in the Exhibit Hall and Last Chance for Poster Viewing

1:50 Chairperson’s Remarks

Bonnie Feldman, D.D.S., MBA, Digital Health Analyst and Chief Growth Officer, DrBonnie360

2:00 PANEL DISCUSSION: Big Data and Unmet Clinical Needs: Two Problems Separated by a Common Language

Michael Liebman, Ph.D., Managing Director, IPQ Analytics, LLC

Charles Barr, MD, MPH, Group Medical Director and Head, Evidence Science and Innovation, Genentech

Hal Wolf, Director, National Leader of Information and Digital Health Strategy, The Chartis Group

This panel session explores themes of bio/informatics, business, operational, clinical and real world perspectives and how each area works collaboratively to meet unstated medical needs (not just unmet needs). We will explore different models (or business models that have been inverted) to not only show how technology can work or data collection can work but how to work with data to improve a diagnosis and stratify a disease.

3:00 Integrated Analytics of GBM Tumors from FMI and TCGA Patient Data

Eric Neumann, Ph.D., Vice President, Knowledge Informatics, Foundation Medicine, Inc.

The development of diagnostic and predictive analytics is key for effectively leveraging the potential of complete genomic profiles (CGP) to transform the healthcare model. We show that the classifications of genomic alterations can be applied to multiple tumor types as well as different data sets, such as ours and TCGA. This system can then be used to discover clinically relevant relations across sample sets and even predict outcomes.

3:30 Illuminating Druggable Genome: Knowledge Management Center

Oleg Ursu, Research Assistant Professor, Department of Internal Medicine, University of New Mexico

A large part of the human genome's role in biology and human disease remains unknown. The IDG project aims to shed light on four important classes of proteins: GPCRs, kinases, ion channels, and nuclear receptors. The Knowledge Management Center's main focus is on the integration of knowledge on protein structure, function, tissue expression, and role in human diseases. Data from multiple databases and text mining are standardized and integrated into the Target Central Resource Database (TCRD). TCRD is accessible through a multifaceted web interface and a REST API and aims to provide a versatile tool to navigate and assess the druggability of understudied proteins.

4:00 Session Break

4:10 Chairperson’s Remarks

Bonnie Feldman, D.D.S., MBA, Digital Health Analyst and Chief Growth Officer, DrBonnie360

4:15 Engaging Both Commercial and Altruistic Collaborative Drug Discovery

Barry Bunin, Ph.D., CEO, Collaborative Drug Discovery (CDD)

CDD Vault is a collaborative platform that is intuitive enough for non-specialists (biologists, chemists, informaticians, project managers) from drug discovery project teams to all embrace. By tapping into business, scientific, and aesthetic drivers across the drug discovery marketplace, the “rising tide lifts all ships”. New informatics technologies will be shared.

4:45 Digital Tools for the Microbiome – An Emerging Field in Genomics and Medicine

Bonnie Feldman, D.D.S., MBA, Digital Health Analyst and Chief Growth Officer, DrBonnie360

New research shows an association between changes in the microbiome and both lupus and rheumatoid arthritis. With the convergence of large population data sets and personal data we are beginning to make progress in research, development and clinical trials in autoimmune disease. This talk will highlight new companies using data and digital tools to improve our understanding and treatment of autoimmunity.

5:15 A Full Stack Solution to Pharmacogenomics

Greyson Twist, Software Engineer, Center for Pediatric Genomic Medicine, Children’s Mercy Genome Center

Realizing the world of personalized medicine requires integrating data from many disparate sources. To accomplish this we are developing a three-tiered software solution: Astraea to handle locus-specific knowledge management through expert curation, Constellation to handle locus allele identification from next-gen data, and Astronomer to handle drug phenotype prediction. Each of these tools has unique problems: data source integration, standardization, and biologically driven heuristic choice.

5:45 Close of Conference Program