St. Jude Cloud, Largest Public Repository of Pediatric Cancer Genomics Data, Launches for Researchers Worldwide


On April 12, St. Jude Children’s Research Hospital launched the St. Jude Cloud, an online data-sharing and collaboration platform that provides researchers access to the world's largest public repository of pediatric cancer genomics data. Developed as a partnership among St. Jude, DNAnexus, and Microsoft, the St. Jude Cloud provides accelerated data mining, analysis, and visualization capabilities in a secure cloud-based environment.

“Sharing research and scientific discoveries is vital to advancing cures and saving lives, especially in rare diseases like pediatric cancer,” said James R. Downing, MD, St. Jude President and Chief Executive Officer. “St. Jude has shared data and resources since its founding, and collaboration with researchers across the world is at the core of our mission. St. Jude Cloud offers researchers access to genomics data and analysis tools that will drive faster progress toward cures for catastrophic diseases of childhood.”

St. Jude Cloud is a unique resource in the fight to advance cures for pediatric diseases, offering researchers the following capabilities:

  • Largest Public Repository of Pediatric Cancer Genomics Data

The interactive data-sharing platform allows scientists to explore more than 5,000 whole-genome, 5,000 whole-exome, and 1,200 RNA-sequencing datasets from more than 5,000 pediatric cancer patients and survivors. By 2019, St. Jude expects to make 10,000 whole-genome sequences available on St. Jude Cloud.

These data have been generated from three large St. Jude–supported genomics initiatives: the St. Jude–Washington University Pediatric Cancer Genome Project, designed to understand the genetic origins of childhood cancers; the Genomes for Kids clinical trial, focused on moving whole-genome sequencing into the clinic; and the St. Jude Lifetime Cohort study (St. Jude LIFE), which conducts comprehensive clinical evaluations on thousands of pediatric cancer survivors throughout their lives.

Access to data is simple and fast, and does not require downloading prior to exploration. Researchers may also upload their own data to a private, password-protected environment and explore it using tools available on the St. Jude Cloud platform.

  • Interactive Data Tools

In addition to high-quality next-generation sequencing data, St. Jude Cloud features a collection of bioinformatics tools to help both experts and nonspecialists gain novel insights from genomics data. These tools include validated data-analysis pipelines and interactive visualization tools that make it easier to draw discoveries from large data sets. Data and results can be securely shared with collaborators within the platform.

The platform enables researchers to explore St. Jude data or their own results using innovative, interactive visualizations powered by ProteinPaint, the genomic visualization engine developed at St. Jude. The ProteinPaint visualizations allow users to rapidly navigate through the genome and identify genetic changes linked to cancer development. St. Jude Cloud tools also produce custom visualizations of the user’s own research data for exploration or comparison with St. Jude–generated data.

  • Verifying Research Discoveries Faster

A St. Jude scientist was able to use the St. Jude Cloud to replicate, in just a few days, experimental findings that originally took the research team more than 2 years to make. The original team discovered mutations connected to ultraviolet (UV) damage in a B-cell leukemia in work that was recently published in Nature. The intriguing finding led the team to ask whether other leukemia samples not included in the original study might have similar patterns of mutations. They turned to the high-quality data sets available in the St. Jude Cloud, where the rapid computing capabilities of the platform enabled them to rediscover the same UV-linked mutational signature in pediatric B-cell leukemia patients. Identification of these additional samples will help researchers understand how UV damage could be linked to a blood cancer and potentially point to new avenues for therapy.

  • Collaboration to Advance Cures

The data available on the St. Jude Cloud represent a key resource for understanding the genetic roots of childhood cancer. By opening access to these data, St. Jude's partnership with DNAnexus and Microsoft harnesses the collective power of the global research community to advance precision medicine for rare pediatric diseases like cancer.


The data available through St. Jude Cloud are stored on Microsoft Azure, which can handle datasets on the massive scale required for large genomics studies such as those developed by St. Jude. Microsoft understands the complexities of large-scale genomics data and has processed half a petabyte of data for St. Jude Cloud to date.

“Health and technology partnerships are central to the advancement of scientific breakthroughs, allowing great minds and passionate hearts to work together with the common goal of ensuring one day, life-threatening diseases in children are no longer a reality,” said Peter Lee, PhD, Corporate Vice President of AI and Research at Microsoft. “We are extremely proud to collaborate with our research partners at St. Jude and DNAnexus and address the challenges of technological limitations, such as storage and the speed of accessing vast amounts of pediatric cancer data, and look forward to the progress that St. Jude Cloud will bring.”

DNAnexus, the biomedical informatics and data management company for St. Jude Cloud, leverages Azure to provide an open, flexible, and secure cloud platform that supports the Microsoft Genomics service as well as other genomics analysis tools. Researchers around the world are able to access tools and diverse data sets in a secure and collaborative ecosystem.

“Collaboration fuels scientific advancements,” said Richard Daly, Chief Executive Officer at DNAnexus. “Whether you are working together across hallways or international borders, researchers need a secure space to foster collaboration and share data and tools.”

“St. Jude Cloud is a powerful resource to drive global research and discovery forward,” said Jinghui Zhang, PhD, Chair of the St. Jude Department of Computational Biology and Co-Leader of the St. Jude Cloud project. “Providing genomic sequencing data to the global research community and making complex computational analysis pipelines easily accessible will lead to progress in eradicating childhood cancer.”

The content in this post has not been reviewed by the American Society of Clinical Oncology, Inc. (ASCO®) and does not necessarily reflect the ideas and opinions of ASCO®.