Converged IT & the Cloud

2018 Archived Content

Today, IT professionals are challenged to integrate the ever-expanding volumes of data generated at research labs, pharmaceutical companies, and medical centers. Doing so requires the compute power, storage solutions, and analytic capability to make data from disparate sources clinically actionable, including omics (genomics, proteomics, metabolomics, etc.), imaging, sensors, and more. The Converged IT & the Cloud conference will bring together key leaders in cloud computing and data management to share case studies and discuss the challenges and solutions they face in their centers. Overall, this event will offer practical solutions for network engineers, data architects, software engineers, and others working toward the goal of personalized medicine.

Monday, February 12

10:30 am Conference Program Registration Open

DESIGNING THE CLOUD FOR PERSONALIZED MEDICINE

11:50 Chairperson’s Opening Remarks

Kevin Davies, Ph.D., Executive Vice President, Strategic Development, Mary Ann Liebert, Inc.

12:00 pm KEYNOTE PRESENTATION: Everything about 1 Million People: The NIH’s All of Us Research Program

Chris Lunt, CTO, All of Us Research Program, National Institutes of Health

The All of Us Research Program represents a historic effort to gather data over the course of many years from one million or more people living in the United States, with the ultimate goal of accelerating research and improving health. Chris Lunt, CTO of All of Us, will describe the program’s mission and objectives, current status, and plans for national launch and beyond. He will also share his vision of the All of Us ecosystem and the policies and business models necessary to ensure the long term success of the program and of the field of precision medicine more generally.

12:30 Maximize the Power of Cloud Computing for Precision Oncology

Han Liang, Ph.D., Associate Professor, Deputy Chair, Bioinformatics and Computational Biology; Associate Professor, Systems Biology, The University of Texas MD Anderson Cancer Center

Sequencing data from cancer patients have played a key role in precision oncology, and cloud computing is key to analyzing such big data effectively. I will first discuss the key features of effective infrastructure for data storage, search, and management. Next, I will discuss best practices for data analysis and results visualization using private and public bioinformatics tools. Finally, I will discuss the utility of data from cancer consortium projects such as TCGA and ICGC in cloud computing.

1:00 Session Break

1:10 Luncheon Presentation: Analyzing Genomic Data at Scale with Google Cloud

Jonathan Sheffi, Product Manager, Genomics & Life Sciences, Google Cloud

Google Cloud enables scientists to change the way they perform research and collaborate with one another. This presentation will highlight how Google Cloud is accelerating life sciences research and finding new ways to innovate.

1:40 Session Break

DESIGNING THE CLOUD FOR PERSONALIZED MEDICINE (CONT.)

2:30 Chairperson’s Remarks

Kevin Davies, Ph.D., Executive Vice President, Strategic Development, Mary Ann Liebert, Inc.

2:40 How the “Serverless” Cloud Model Enables API Ecosystems as Health IT Infrastructure

Jonas Almeida, Ph.D., CTO, Biomedical Informatics, Stony Brook University

Review of the Stony Brook Medical Center experience moving research and operations to Function-as-a-Service (FaaS) Cloud in the era of stateless consumer-facing HL7 FHIR and Machine Learning APIs.

3:10 Delivering Data Analysis Environments on Multiple Clouds

Enis Afgan, Ph.D., Associate Research Scientist, Taylor Laboratory, Computational Biology and Genomics, Johns Hopkins University

Geographic proliferation of cloud computing resources presents an opportunity for processing large biomedical datasets more efficiently, but the challenges of accessing multiple providers and assembling the necessary data analysis environments are orthogonal to the concerns of a researcher. In this talk, real-world examples of two models for harnessing cloud resources will be presented: one where a centralized workspace (Galaxy application) utilizes cloud resources behind the scenes and another where powerful analysis environments are rapidly composed on a range of cloud providers. In the process, we will demonstrate cross-cloud resource integration enabled by CloudBridge, a multi-cloud library, and will show how these resources are deployed and managed using CloudLaunch and CloudMan, respectively. Together, these models showcase the potential of leveraging cloud resources for biomedical data analysis.

3:40 High Performance Computing for the Life Sciences

David Hiatt, Director, Product Marketing, WekaIO

Robert Sinkovits, Ph.D., Director, Scientific Computing Applications, San Diego Supercomputer Center (SDSC), University of California, San Diego

Personalized medicine research has become increasingly complex and compute intensive. While new tools and analytical processes hold great promise, they severely stress the supporting IT infrastructure. Modern high performance computing (HPC) systems are designed to provide the performance and scale to accelerate the pace of discovery.

3:55 The Infrastructure Requirements for Geno-Pheno Analysis on a Massive Scale

Omar Serang, Chief Cloud Officer, DNAnexus

Learn how biopharma companies are delivering on the promise of precision medicine by leveraging genomics in R&D. A comprehensive security and privacy framework is required to provide the auditability, data immutability, and versioning needed to reproduce research. Learn about the infrastructure required to scale geno-pheno analysis while maintaining stringent security and compliance controls.

4:10 Overcoming the Challenges of Sharing and Migrating Large Datasets in Healthcare and Life Sciences

David Mostardi, Senior Sales Engineer, Engineering, Aspera, an IBM Company

Healthcare & research are experiencing unprecedented growth in data. Legacy file sharing & Cloud migration tools rely on technology that can’t handle such volume. Learn how IBM Aspera accelerates R&D cycles, speeds data workflows, and impacts clinical and research outcomes.

4:40 Refreshment Break and Transition to Plenary Session

5:00 Plenary Keynote Session

6:00 Grand Opening Reception in the Exhibit Hall with Poster Viewing

7:30 Close of Day

Tuesday, February 13

7:30 am Registration Open and Morning Coffee

8:00 Plenary Keynote Session

9:00 Refreshment Break in the Exhibit Hall with Poster Viewing

DATA INTEGRATION, ANALYSIS AND STORAGE

10:05 Chairperson’s Remarks

Randy Qin, Associate Director, Senior Architect, Data Platform, R&D IT Engineering, Merck & Co.

10:15 Data Integration and Analysis Solutions for Lead Discovery and High Throughput Screening

Frank Kloeck, IT Business Partner, BS-ITBPPH-RD-R, Bayer Business Services

This talk will provide an overview of SMOL Bayer Lead Discovery and the Screening Units. The presentation will also describe solutions and methods for data storage and analysis of Lead Discovery data. The challenges of handling historical data of more than 15 years will be discussed.

10:45 Cloud and PaaS for Research Data Integration

Randy Qin, Associate Director, Senior Architect, Data Platform, R&D IT Engineering, Merck & Co.

The integration of internal and external research data is a constant challenge in the life sciences. Although many approaches have been adopted from other industries with varying success, research data integration is confounded by the extreme variety and rate of change of the data. While not a panacea, moving data integration efforts to the cloud does offer a possible path to an information platform that is more accommodating of data variety and more resilient to rapid change. In this talk I will discuss three generations of approaches to high-variety data integration at scale: from on-premises infrastructure, through cloud IaaS, to managed PaaS. I will also discuss efforts around the development of “12-factor” cloud-native micro-service applications and the benefits of a “micro-service harness”.

11:15 Extended Q&A with Session Speakers

11:45 Drug Target Identification As Seen in the Data

Richard K. Harrison, Ph.D., CSO, Clarivate Analytics

12:15 pm Session Break

12:25 Luncheon Presentation: Leveraging Cloud-Based Platforms to Drive Data Strategy for the Life Sciences

Alok Tayi, CEO, TetraScience

Life science companies want to accelerate drug discovery using data analytics and machine learning. Scientific data, however, is neither centralized nor standardized: it is scattered from instrumentation to CRO/CMOs to legacy software. Here, we will discuss how biopharma companies are developing their data strategies and deploying new data platforms.

1:25 Refreshment Break in the Exhibit Hall with Poster Viewing

DATA COMMONS

2:00 Chairperson’s Remarks

Matthew Trunnell, Vice President, CIO, Fred Hutchinson Cancer Center

2:10 The NCI Cancer Research Data Commons: Integrating Heterogeneous Data for Knowledge Discovery

Anthony R. Kerlavage, Ph.D., Chief, Cancer Informatics Branch, National Cancer Institute, Center for Biomedical Informatics & Information Technology

Precision medicine requires identifying the molecular basis for disease and matching targeted therapies to each patient’s unique biology. Cancer researchers need to access, integrate, and analyze data from genomics, metabolomics, proteomics, microbiomics, imaging, clinical research and outcomes, population-based data, and data collected by health care providers and patients themselves. Building upon current systems, we are defining an integrated, cloud-based Cancer Research Data Commons necessary to fully leverage these data.

2:40 Converged IT and Data Commons

Simon Twigger, Ph.D., Senior Scientific Consultant, BioTeam Inc.

Data management is an ongoing and growing challenge in the life sciences. The Data Commons approach aims to streamline access to the right data and the right types of analytics tools and resources by creating a converged platform from the foundational infrastructure to the user interface. This talk will cover industry trends in developing a strategy for, and implementing, Data Commons solutions, and the role converged IT plays in the process.

3:10 PANEL DISCUSSION: Data Commons

Moderator: Matthew Trunnell, Vice President, CIO, Fred Hutchinson Cancer Center

Panelists:

Lucila Ohno-Machado, M.D., Ph.D., Associate Dean, Informatics and Technology, University of California, San Diego Health

Lara Mangravite, Ph.D., President, Sage Bionetworks

Simon Twigger, Ph.D., Senior Scientific Consultant, BioTeam Inc.

Robert Grossman, Ph.D., Frederick H. Rawson Professor, Professor of Medicine and Computer Science, Jim and Karen Frank Director, Center for Data Intensive Science (CDIS), Co-Chief, Section of Computational Biomedicine and Biomedical Data Science, Dept. of Medicine, University of Chicago

What is a data commons?
Challenges in data commons
Data commons and open science
Technology innovations

4:10 Valentine’s Day Celebration in the Exhibit Hall with Poster Viewing

5:00 Breakout Discussions in the Exhibit Hall

These interactive discussion groups are open to all attendees, speakers, sponsors, & exhibitors. Participants choose a specific breakout discussion group to join. Each group has a moderator to ensure focused discussions around key issues within the topic. This format allows participants to meet potential collaborators, share examples from their work, vet ideas with peers, and be part of a group problem-solving endeavor. The discussions provide an informal exchange of ideas and are not meant to be a corporate or specific product discussion.

Creating and Maintaining Hybrid Cloud Environments

Moderator: Lucila Ohno-Machado, M.D., Ph.D., Associate Dean, Informatics and Technology, University of California, San Diego Health

Clinical and human subjects research data on the cloud
Security and safety: Dealing with sensitive data
Cloud computing and data analysis

Compute and Storage Hype vs. Reality

Moderator: Aaron Gardner, Senior Scientific Consultant, BioTeam, Inc.

Death of the hard disk drive?
AMD back in the fight?
NVMe-oF and 3D XPoint
Serverless computing
Converged infrastructure

6:00 Close of Day

Wednesday, February 14

7:30 am Registration Open and Morning Coffee

8:00 Plenary Keynote Session

10:00 Refreshment Break and Poster Competition Winner Announced in the Exhibit Hall

DATA STORAGE AND CLOUD COMPUTING

10:50 Chairperson’s Remarks

Brian D. O’Connor, Ph.D., Technical Director, Genomics Institute, Computational Genomics Platform, University of California, Santa Cruz

11:00 Large-Scale, Cloud-Based Analysis of Genomic Data: Emerging Standards from the GA4GH

Brian D. O’Connor, Ph.D., Technical Director, Genomics Institute, Computational Genomics Platform, University of California, Santa Cruz

The Global Alliance for Genomics and Health (GA4GH) is an international coalition, formed to enable the sharing of genomic and clinical data. Within the GA4GH, the Cloud Workstream is focused on API standards and implementations that make it easier to “send the algorithms to the data”. Specifically, we have developed four API standards that allow the community to share tools/workflows (TRS), execute individual jobs on clouds using a standard API (TES), run full CWL/WDL workflows on execution platforms (WES), and read/write data objects across environments in a cloud-agnostic way (DOS).

11:30 A Data Commons for Protected Health Information: Experience with a HIPAA-Compliant Hybrid Cloud Environment

Lucila Ohno-Machado, M.D., Ph.D., Associate Dean, Informatics and Technology, University of California, San Diego Health

Computing with Protected Health Information (PHI) from Electronic Health Records (EHRs), clinical trial data, and data collected from personal monitoring devices is ideally done in HIPAA-compliant environments. While commercial clouds offer elasticity and economies of scale, hybrid models allow health systems to maintain records locally, while utilizing commercial clouds to expand compute capacity and disaster recovery. I will present our experience at the University of California, San Diego in hosting and computing with PHI in a hybrid cloud environment.

12:00 pm Characterizing Next-Generation Life Sciences Storage Solutions

Aaron Gardner, Senior Scientific Consultant, BioTeam, Inc.

The life sciences' evolving workload, capacity, collaboration, and cost requirements for storage have made architecting next-generation solutions an imperative. This presentation will focus on outlining the needs that new solutions must satisfy as well as discussing how current and future storage technologies may be leveraged to address them.

12:30 Session Break

12:40 Enjoy Lunch on Your Own

1:10 Dessert Break in the Exhibit Hall and Last Chance for Poster Viewing

BUILDING A BETTER CLOUD

1:50 Chairperson’s Remarks

Peyton McNully, CIO, Technology Director, HudsonAlpha Institute for Biotechnology

2:00 Cloud 2.0: We Broke the First One. So, We Replaced It with This…

Peyton McNully, CIO, Technology Director, HudsonAlpha Institute for Biotechnology

Cloud 2.0 is about data. Cloud 1.0 brought tremendous convenience and scale, but broke at the edges. In this candid cloud conversation, we will discuss the origins of cloud, the constraints of clouds, and the benefits of Cloud 2.0 for high throughput population scale problems in Genomics and Research.

2:30 Gatekeeping the Cloud for Your Risk Appetite

Saira Kazmi, Ph.D., Scientific Data Architect, Research Information Technology, The Jackson Laboratory

Moving an IT organization, a pipeline or even a single application to the cloud should always include a scalable approach that reduces the overall risk and cost for the organization. The agility, flexibility, and self-sufficiency in the cloud must be matched by adequate safeguards that satisfy the risk appetite for the organization. This presentation will outline the necessary steps for achieving this balance between freedom in the cloud and risk management for the organization. The goal is to reduce IP, compliance, and financial risk for the cloud implementation while reducing the administrative burden by establishing agreements, documentation, education, training, and technology barriers.

3:00 Cloud-Enabled High-Throughput Bioinformatics Processing at an Academic Medical Center

Annerose Berndt, Ph.D., D.V.M., Vice President, Clinical Genomics, University of Pittsburgh Medical Center

Falling costs for high-throughput genome sequencing have shifted our attention from data generation towards data management and processing. With ever-growing storage and compute demands, the cloud provides the platform and infrastructure for efficient processing of raw genomic data. We will show how the University of Pittsburgh Medical Center (UPMC) has integrated its high-throughput Genome Center with SaaS-supported standard processing pipelines. We will discuss the benefits and caveats of a cloud infrastructure and how processing standardization can help variant identification.

3:30 Session Break

TRENDS AND FUTURE DIRECTIONS

3:40 Chairperson’s Remarks

Chris Dwan, Senior Technologist and Independent Consultant

3:45 CLOSING PANEL DISCUSSION: Trends and Future Directions

Moderator: Chris Dwan, Senior Technologist and Independent Consultant

Panelists:

Saira Kazmi, Ph.D., Scientific Data Architect, Research Information Technology, The Jackson Laboratory

Annerose Berndt, Ph.D., D.V.M., Vice President, Clinical Genomics, University of Pittsburgh Medical Center

Aaron Gardner, Senior Scientific Consultant, BioTeam, Inc.

Jonathan Sheffi, Product Manager, Genomics & Life Sciences, Google Cloud

This end-of-conference panel builds on the successful new format of the popular “Trends from the Trenches” session at Bio-IT World. The panelists will host an interactive, opinionated, and dynamic exploration of important themes and questions from the conference. We will use an online system to solicit questions and input throughout the conference, and even during the panel itself. This augments the usual practice of audience members coming to microphones, and opens a new and parallel thread in the conversation. Each panelist will provide a short presentation on a particular theme, including Information Security/Privacy, Multi-Cloud Strategy, Data Architecture, and DevOps. The discussion will proceed interactively from those short talks. We will close with a “lightning round,” in which panelists will summarize the major take-home messages from the conference in a succinct and tweetable format.

5:15 Close of Conference Program
