Every rare disease is, by nature, unique. Whether through a confounding combination of symptoms, an unusual progression pattern, or a highly specific underlying genetic mutation, each rare disease usually stands alone as a medical conundrum that requires equally unique care.
But no matter how frustrating and perplexing each rare disease’s singularity, there is something that they all have in common: rarity itself.
That universal concept of rarity is ultimately what makes the diagnosis and treatment of any rare disease so difficult: There aren’t many diagnosed individuals at any given point in time, so there isn’t enough accessible data for the medical community to learn from. Worse, the scarce information that is available is fragmented across hospitals and clinics around the world, so there has traditionally been little to no hope of a consolidated data pool that could help doctors and researchers assess key patterns or biomarkers.
The medical community has largely focused its attention on rare diseases individually, trying to piece together enough information from sporadic patients to better understand each disease’s unique nature. But one dynamic team of thinkers, the Open Source Imaging Consortium (OSIC), is working with PwC and Microsoft to pioneer a new approach: solving the problem of rarity itself. And the technology they built together might change the entire medical community along with it.
An inspired idea for revolutionizing healthcare
OSIC didn’t start out looking to revolutionize healthcare’s approach to rare disease. Rather, OSIC is first focusing on one ailment: Idiopathic Pulmonary Fibrosis (IPF), a disease in which progressive scarring of lung tissue increasingly limits lung function. It is difficult to diagnose, incurable, and, eventually, lethal. And as with most rare diseases, the medical community has lacked the consolidated data and collaborative approach necessary to address those challenges adequately.
That’s what got OSIC’s leaders thinking: If there were a way to seamlessly aggregate and share relevant information—like anonymized high-resolution computed tomography (HRCT) scans and appended clinical data—around the world, researchers could finally start compiling a patient pool large enough for real study.
“The available qualitative information that one can tease out [of a patient’s information] limits our ability to predict the future for any individual patient,” said Dr. Kevin Brown, Chair of the Department of Medicine at National Jewish Health and OSIC pulmonology lead. “But if we take 1,000 patients with lung fibrosis, we can do a pretty good job of predicting what the overall outcome might be for that population.”
This is when OSIC leadership first approached key thinkers on PwC’s team for help. OSIC’s unusual blend of radiology, pulmonology, and data science specialists had the right idea but lacked the technical experience to bring it to life. As OSIC’s overall implementation lead, PwC brought in Microsoft, tapping into its advanced engineering team and the Microsoft Azure capabilities that could turn OSIC’s vision into reality. The result: the OSIC Data Repository, a continually growing pool of anonymized HRCT scans and accompanying clinical data, all housed in Microsoft Azure’s cloud.
This global repository is the largest and most diverse of its kind, with a plethora of real-world data that is both multi-ethnic and multi-center. Participating professionals from all over the world provide the data, and any OSIC-created machine-learning algorithms will eventually be made open source for the benefit of patients everywhere. PwC was instrumental in creating the interface for contributors to provide data to that repository.
The solution was so novel that some of the tools Microsoft employed at the time were still emerging, like its FHIR (Fast Healthcare Interoperability Resources) API, which facilitates the rapid exchange of data. Others, like the Medical Imaging Server for DICOM (Digital Imaging and Communications in Medicine), which ingests all the medical images, were already open-source Microsoft projects. Regardless, Microsoft was an ideal fit; it could impact the healthcare industry in a profoundly positive way while also garnering critical feedback from project collaborators on some of its most advanced industry tools.
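To make the idea of exchanging anonymized records a little more concrete, here is a minimal Python sketch of stripping direct identifiers from a FHIR-style Patient record before it is shared. It is an illustration under stated assumptions, not OSIC’s, PwC’s, or Microsoft’s actual pipeline: the function name `deidentify_patient`, the `IDENTIFYING_FIELDS` list, and the sample record are all hypothetical, and a real de-identification policy would be far more thorough.

```python
import json

# Hypothetical list of directly identifying Patient elements to drop.
# This is an assumption for illustration, not a complete or approved policy.
IDENTIFYING_FIELDS = {"name", "telecom", "address", "identifier", "photo", "contact"}


def deidentify_patient(patient: dict, study_id: str) -> dict:
    """Return a copy of a FHIR-style Patient dict without direct identifiers."""
    if patient.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    # Keep only the elements that are not on the drop list.
    cleaned = {k: v for k, v in patient.items() if k not in IDENTIFYING_FIELDS}
    # Replace the source system's id with a study pseudonym.
    cleaned["id"] = study_id
    # Reduce the birth date to the year only (FHIR dates allow a bare year).
    if "birthDate" in cleaned:
        cleaned["birthDate"] = cleaned["birthDate"][:4]
    return cleaned


# Made-up sample record, used only to exercise the function.
example = {
    "resourceType": "Patient",
    "id": "hospital-12345",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
    "birthDate": "1954-03-17",
    "address": [{"city": "Denver"}],
}

print(json.dumps(deidentify_patient(example, "study-0001"), indent=2))
```

The sketch returns a new dictionary rather than modifying the original, so the contributing site’s source record is left untouched.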
“Microsoft has always been interested in working with customers to manage health data at scale and make it easier to improve the patient experience and coordinate care,” said Patricia Obermaier, Vice President, US Health and Life Sciences at Microsoft. “This work was the perfect opportunity to help build something that could truly make a difference.”
But here’s the magic: this approach has ramifications far beyond IPF. In theory, an open-source cloud repository like this one could change the game for any rare disease, because it addresses the problem that most rare diseases share: data scarcity.
“Other disease spaces…can look at what we’re doing, and they can replicate this open-source model, collating data from all around the world,” said Dr. Simon Walsh, NIHR Clinician Scientist at the National Heart and Lung Institute.
The healthcare industry, especially in the United States, has eschewed data sharing for complicated reasons—like politics or even market advantage—for decades, and that’s made solutions to rare diseases frustratingly elusive. Elizabeth Estes, OSIC’s Executive Director, sees an opportunity to finally challenge a broken status quo.
“We have to solve the [IPF] problem first, and I don’t want to lose sight of that,” Estes said. “But I think we do have a societal obligation to fix this ecosystem that has allowed this to go on for so long.”
A global effort to inspire change
OSIC’s groundbreaking solution is becoming a proving ground for how well the approach can scale, which means innovative ideas are welcome from almost anywhere. Radiologists, clinicians, computational scientists, and industry competitors from around the world collaborated for almost three years on the development of the database itself, and continue to work together to advance digital imaging biomarkers for accurate imaging-based diagnosis, prognosis, and prediction of response to therapy. Any OSIC-created algorithms will be made open source for the benefit of patients everywhere.
Through Kaggle, the world’s largest data science community platform, OSIC provided anonymized HRCT and clinical data pools to a vast number of data scientists for assessment, any of whom could reasonably piece together an innovative image-recognition algorithm capable of identifying patterns or biomarkers that are crucial to understanding IPF. Just like that, OSIC deputized countless curious minds to help crack IPF’s code. That process hasn’t been perfect, but it does have promise.
“Kaggle is a very useful platform if you have a well-defined problem. It certainly drives enthusiasm, and it’s helped us hugely in our prediction performance journey,” said Dr. David Barber, University College London and OSIC computational science lead. “But can [Kaggle members] give us something tangible that we can take back to the scientists and say, ‘Does this make sense?’”
Like the answer to that question, the path of OSIC’s future is still unknown—both because of the pioneering nature of what OSIC is trying to achieve, and because of the limitless potential of what it can achieve. What started as a journey to help hundreds of thousands of IPF patients across the world could, in time, help millions.
With the help of PwC and Microsoft, Brown, Walsh, Estes, Barber, and the rest of the OSIC team have blazed a trail others can now follow while researching countless other rare diseases. Thanks to OSIC, the field of rare diseases is already very different today than it was just a few short years ago. And the future is full of potential.
“[OSIC] had a lot of great ideas when we first met the team, but they hadn’t taken full advantage of the technology available,” said Will Perry, US Cloud Innovation and Engineering Leader at PwC. “[Now] they proved to the community that this is an incredibly viable path for modernizing clinical diagnostics in a way that can really advance the cause.”
This article was produced by WIRED Brand Lab on behalf of PwC.

