Privacy-safe AI and Analytics on healthcare data: A MongoDB and Sarus Collaboration

Unlocking the power of healthcare data with MongoDB and Sarus. Enabling privacy-safe AI and analytics for improved patient outcomes.

Josselin Pomat

In today's data-driven world, healthcare organizations have a lot of important information that MongoDB can efficiently store and access. Known for handling large and complex data, MongoDB is perfect for organizing diverse healthcare details like patient records, clinical trials, and genetic data, thanks to its support for hierarchical data structures. However, leveraging this data for AI and analytics applications comes with one significant challenge: healthcare data is highly sensitive and must be protected from unauthorized access.

To tackle this issue, Sarus has developed a solution in collaboration with MongoDB that allows users to build and run analytics pipelines, carry out medical research, or train AI models on MongoDB datasets without compromising security or accessing sensitive information directly.

With the Sarus integration to MongoDB, researchers can conduct studies while ensuring data remains completely confidential, thus safeguarding sensitive information and facilitating cutting-edge insights and AI development in healthcare. Here’s how!

How it works

A data owner, such as a pharmaceutical company or a hospital, stores medical data in a MongoDB database. To enable researchers to work on this data for medical purposes without direct access to sensitive information, the data owner can utilize the Sarus application. Researchers submit queries through Sarus, which processes the queries on-the-fly and returns fully privacy-safe results. Sarus also provides synthetic data with only non-sensitive information, aiding researchers during the necessary exploration phase of the data analysis process. Sarus ensures the privacy of the synthetic data by employing Differential Privacy techniques.
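To give a feel for what Differential Privacy means for a single query, here is a minimal sketch of the textbook Laplace mechanism applied to a count. It is not Sarus's implementation, which the article does not detail; the function names, the sample records, and the epsilon value are all assumptions made for illustration.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def dp_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Noisy count of records matching predicate.

    Adding or removing one patient changes a count by at most 1
    (sensitivity 1), so Laplace noise of scale 1/epsilon is used.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)

# Hypothetical records, invented for this sketch.
records = [{"age": a} for a in (30, 45, 61, 70, 82)]
rng = random.Random()
print(dp_count(records, lambda r: r["age"] >= 60, epsilon=1.0, rng=rng))
```

A smaller epsilon means more noise and stronger protection, and a larger one means results closer to the true count. The researcher only ever sees the noisy value.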

To enable researchers to conduct data analysis or machine learning on the medical data, the data owner follows these simple steps:

  1. Connect the Sarus application to the MongoDB collection containing the medical information.
  2. Onboard the dataset using the Sarus administration UI and grant query access to researchers.
  3. Researchers can then query the data using the Python SDK or their preferred BI tools.

Let’s dig deeper into these steps.

Connect the Sarus application to the MongoDB collection

Make sure you have enabled the Atlas SQL Interface; it provides the JDBC connection string that you will enter in the Sarus admin UI.
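Before pasting the connection string into the admin UI, a quick sanity check can catch copy-paste mistakes. The sketch below assumes the string has the general shape `jdbc:mongodb://<host>/?<options>`; the host name is a placeholder and the helper `check_jdbc_url` is hypothetical, so compare against the string Atlas actually shows you.

```python
from urllib.parse import urlsplit, parse_qs

JDBC_PREFIX = "jdbc:"

def check_jdbc_url(url: str) -> dict:
    """Split a JDBC-style MongoDB URL into host and options for a visual check."""
    if not url.startswith(JDBC_PREFIX + "mongodb://"):
        raise ValueError("expected a string starting with 'jdbc:mongodb://'")
    # Drop the 'jdbc:' prefix so urlsplit sees an ordinary mongodb:// URL.
    parts = urlsplit(url[len(JDBC_PREFIX):])
    if not parts.hostname:
        raise ValueError("no host found in connection string")
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.hostname, "options": options}

# Placeholder value, not a real cluster address.
example = "jdbc:mongodb://your-atlas-sql-host.example.net/?ssl=true&authSource=admin"
print(check_jdbc_url(example))
```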

 

Once connected, the data owner can view the available MongoDB collections within Sarus. By selecting the desired collections, the data owner grants researchers access to the data without exposing patient-level information. Researchers can then utilize their familiar data analysis tools, such as Python or Power BI, to work with the data.

Example: Exploring Patient Data

Let's consider a scenario where researchers want to conduct a general analysis of hospital data to develop accurate treatment programs based on trends. Using the Sarus SDK, researchers can:

  • Explore the information using synthetic data.
    [Figure: short extract of the synthetic data]
  • Compute aggregated statistics, such as the top 5 represented conditions in the data or correlations between types of sickness and age.
    [Figure: aggregated statistics on the patient data]
  • Train a machine learning model on the real data to divide patients into cohorts and generate personalized treatment programs.
    [Figure: simple machine learning code]
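The kind of analysis listed above can be sketched in plain Python. This example runs locally on a few hand-written rows standing in for synthetic data and does not call the Sarus SDK, whose API is not shown in this article. The field names, the rows, and the use of a one-feature k-means as the cohorting model are all assumptions.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows invented for this sketch; not actual Sarus output.
patients = [
    {"age": 34, "condition": "asthma"},
    {"age": 71, "condition": "hypertension"},
    {"age": 65, "condition": "diabetes"},
    {"age": 29, "condition": "asthma"},
    {"age": 80, "condition": "hypertension"},
    {"age": 58, "condition": "diabetes"},
    {"age": 45, "condition": "migraine"},
    {"age": 77, "condition": "hypertension"},
]

def top_conditions(rows, n=5):
    """Return up to n most frequent conditions as (condition, count) pairs."""
    return Counter(r["condition"] for r in rows).most_common(n)

def mean_age_by_condition(rows):
    """Average age per condition, a first look at how condition relates to age."""
    ages = defaultdict(list)
    for r in rows:
        ages[r["condition"]].append(r["age"])
    return {cond: mean(vals) for cond, vals in ages.items()}

def kmeans_1d(values, k=2, iterations=20):
    """Plain k-means on one numeric feature; returns (labels, centers)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = mean(members)
    return labels, centers

print(top_conditions(patients))
print(mean_age_by_condition(patients))
print(kmeans_1d([p["age"] for p in patients], k=2))
```

In a real workflow, the exploration would run on the synthetic data, while the aggregates and the model training would be submitted through Sarus so that only privacy-safe results come back.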

Sarus and MongoDB have worked together to integrate Sarus into the Leafy-Hospital demo. If you are interested in obtaining more information, please contact MongoDB or Sarus.

Conclusion

By leveraging the power of Sarus and MongoDB, healthcare organizations can unlock the full potential of their data, driving innovation and improving patient outcomes. This solution enables secure, efficient, and effective AI and analytics while maintaining the highest standards of data protection. With Sarus and MongoDB, healthcare organizations can confidently share their valuable data with researchers, fostering advancements in medical research and ultimately benefiting patients worldwide.

About the author

Josselin Pomat

Customer success manager & Product


©2023 Sarus Technologies.
All rights reserved.