Federated Learning Balances Machine Learning with Patient Privacy
Federated learning, in which data sources for machine learning are distributed across multiple locations, is gaining traction in the healthcare industry.
Instead of pooling data on a central server, federated learning allows patient data to remain on-premises at the hospital. This enables hospitals and other healthcare organizations to take advantage of machine learning while protecting patient privacy.
Federated learning can “train a model using data stored at multiple different hospitals without the data ever leaving a hospital’s premises or touching a tech company’s servers,” explained Karen Hao in a recent MIT Technology Review article.
“It does this by first training separate models at each hospital with the local data available and then sending those models to a central server to be combined into a master model,” Hao related.
When a hospital acquires more data, it can download the latest master model from the server, update it with the new data, and send it back to the server.
“Throughout the process, raw data is never exchanged—only the models, which cannot be reverse-engineered to reveal that data,” she noted.
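The train-locally, combine-centrally loop Hao describes can be illustrated with a minimal sketch. It assumes a toy one-feature linear model, synthetic data standing in for each hospital's records, and a sample-size-weighted average as the combining step; the article does not say how the models are actually combined, and every function name here is hypothetical rather than taken from any system mentioned in this story.

```python
import random


def local_update(model, data, lr=0.1, epochs=20):
    """Hospital step: refine a copy of the master model on local data only."""
    w, b = model
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def combine(models, sizes):
    """Server step: average the hospital models, weighted by sample count."""
    total = sum(sizes)
    w = sum(m[0] * n for m, n in zip(models, sizes)) / total
    b = sum(m[1] * n for m, n in zip(models, sizes)) / total
    return (w, b)


def federated_training(hospitals, rounds=50):
    """Repeat: hospitals download the master model, update it, send it back."""
    master = (0.0, 0.0)
    for _ in range(rounds):
        # Only model parameters travel to the server; raw records stay put.
        local_models = [local_update(master, data) for data in hospitals]
        master = combine(local_models, [len(data) for data in hospitals])
    return master


def make_hospital_data(n, seed):
    """Synthetic stand-in for one hospital's data: y = 2x + 1 plus noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        data.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return data


hospitals = [make_hospital_data(n, seed) for n, seed in [(30, 1), (50, 2), (20, 3)]]
print(federated_training(hospitals))  # synthetic data was generated from w=2.0, b=1.0
```

Weighting by sample count is one common choice so that a hospital with more records has proportionally more influence on the master model; a plain unweighted average would also fit the description above.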
“There is a false dichotomy between the privacy of patient data and the utility of the data to society,” Ramesh Raskar, an MIT associate professor of computer science, told MIT Technology Review. “People don’t realize the sand is shifting under their feet and that we can now in fact achieve privacy and utility at the same time.”
Still, federated learning has several problems that need to be worked out. For example, every hospital has to have the infrastructure and personnel for training machine-learning models, and data collection needs to be standardized across hospitals for federated learning to work.
Raskar is working on solving the problems of federated learning. One solution, called split learning, has each hospital train a separate model but only go halfway. The partial models are sent to a central server, where they are combined and training is completed. This approach helps lessen the computational burden on hospitals.
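A rough sketch can show what "going halfway" might look like in practice. It follows a common formulation of split learning, which may differ in detail from the summary above: the model is cut in two, each hospital runs only the first part and sends the intermediate output to the server, and the server finishes the computation and returns a gradient. The two-weight model, the synthetic data, and all class and function names are assumptions made for illustration. In this simple variant the server also sees the labels.

```python
import math
import random


class HospitalHalf:
    """First part of the model; stays at the hospital alongside the raw data."""

    def __init__(self, weight=0.5):
        self.weight = weight

    def forward(self, x):
        self.x = x
        self.h = math.tanh(self.weight * x)
        return self.h  # only this intermediate value is sent to the server

    def backward(self, grad_h, lr):
        # Chain rule through tanh: dh/dweight = (1 - h^2) * x
        self.weight -= lr * grad_h * (1.0 - self.h ** 2) * self.x


class ServerHalf:
    """Second part of the model; shared by every hospital."""

    def __init__(self, weight=0.5):
        self.weight = weight

    def train_step(self, h, y, lr):
        error = self.weight * h - y
        grad_h = error * self.weight  # gradient returned to the hospital
        self.weight -= lr * error * h
        return grad_h, 0.5 * error ** 2


def train_split(hospital_data, epochs=30, lr=0.1):
    """Return the mean loss per epoch, with one client half per hospital."""
    server = ServerHalf()
    clients = [HospitalHalf() for _ in hospital_data]
    history = []
    for _ in range(epochs):
        total, count = 0.0, 0
        for client, data in zip(clients, hospital_data):
            for x, y in data:
                h = client.forward(x)                       # hospital does its half
                grad_h, loss = server.train_step(h, y, lr)  # server completes training
                client.backward(grad_h, lr)                 # hospital updates its half
                total += loss
                count += 1
        history.append(total / count)
    return history


def make_data(n, seed):
    """Synthetic stand-in for one hospital's data: y = 1.5 * tanh(0.8 * x)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-2.0, 2.0) for _ in range(n)]
    return [(x, 1.5 * math.tanh(0.8 * x)) for x in xs]


losses = train_split([make_data(40, 1), make_data(40, 2)])
print(f"first epoch loss {losses[0]:.4f}, last epoch loss {losses[-1]:.4f}")
```

The hospital-side class holds a single weight, which is the point of the design: the heavier share of the computation can sit on the server, in line with the reduced burden on hospitals described above.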
Some companies, such as IBM Research and Paris-based startup Owkin, are working on applying federated learning to tackle healthcare challenges.
Owkin is using federated learning to predict the resistance of cancer patients to certain treatments and drugs. The startup is collaborating with U.S. and European cancer centers to use their data for its models. The company is developing a new model that predicts survival odds for a rare type of cancer based on a patient’s pathology images.
“The biggest barrier in oncology today is knowledge. It’s really amazing that we now have the power to extract that knowledge and make medical breakthrough discoveries,” Owkin Founder Thomas Clozel told MIT Technology Review.
Machine learning is one part of the broader artificial intelligence concept. A number of recent market reports predict robust growth for artificial intelligence in healthcare, with compound annual growth rates ranging from 47 percent to 50 percent.
Factors fueling the use of artificial intelligence include large and complex data sets, soaring healthcare costs, improving computing power, and declining cost of hardware.
At the same time, growth could be held back by practitioners’ reluctance to adopt AI technology, lack of skilled workers, ambiguous regulatory guidelines for medical software, and fear of AI's impact on healthcare employment and care.
A recent report by Boston Consulting Group advised healthcare organizations to embrace artificial intelligence tools that provide clinical decision support, diagnostic imaging analysis, patient monitoring, and process automation.
“The journey to integrate AI into strategies and operations must be a sustained one. But even companies that have yet to invest in AI decisively can make some smart, low-risk moves to either enhance the positive value shifts or minimize the negative impacts,” BCG observed.