Sarus Activate: take action on private data without revealing it

Sarus Activate lets data scientists analyze and act on private data without viewing it, bringing privacy by design to workflows across industries.

Privacy
AI
Analytics
Synthetic Data
Differential Privacy
Marina Kozlova

We are excited to introduce Sarus Activate, a feature that enables data scientists to take action on private data without ever seeing it.

When researchers design an algorithm to predict the risk of patients dropping out of their treatment, they want to send notifications to those at risk. When data scientists build a criminal activity detection model, they want to report suspicious transactions to compliance. When marketers build targeting strategies based on look-alike modeling, they want to load promising audiences into the marketing software.

But what if they cannot access the private dataset at all?

Now, they can use the Activate feature to push the selected user- or transaction-level information to an external database without ever seeing the data. The relevant team (e.g. compliance) takes it from there. This feature unlocks endless possibilities for end-to-end privacy-preserving workflows!

Use case: tracking and reporting suspicious transactions

Banks are required by law to report suspicious transactions to the authorities. Designing the data analysis and ML workflows that identify such transactions is a job for the AML data science team. Reporting them is the responsibility of the compliance team. The transaction data therefore has to be made available to the data scientists for analysis while highly sensitive information, such as the client ID and the transaction details, stays protected. The compliance team, on the other hand, needs access to this identifying information in order to investigate and, if necessary, report suspicious transactions to the relevant authorities.

Let’s see how the new Sarus Activate feature enables just that!

Identifying suspicious transactions

With Sarus, data scientists can explore the transaction dataset, test various approaches, and extract the transactions that require further investigation by the compliance department. They do all of this without ever seeing a row of source data, thanks to synthetic data and differentially-private calculations.

import sarus
from sarus import Client

client = Client(url="https://admin.sarus.tech/gateway", email="analyst@example.com")

# Choose the dataset & table of interest
remote_dataset = client.dataset(slugname="aml_finance")
df_tran = remote_dataset.table(['aml_finance', 'private', 'aml_transactions_2m']).as_pandas()

# Explore the dataset thanks to safe synthetic data generated by Sarus
df_tran.head()
# Explore the dataset thanks to privacy-safe differentially-private calculations
df_tran.use_chip.value_counts()
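
This post does not describe how Sarus computes its differentially-private results. As a rough illustration of what a differentially-private count can look like in general, here is a minimal sketch of the textbook Laplace mechanism in plain Python. The category names, the `epsilon` value, and the helper names are assumptions made for this example, and the sketch assumes each individual contributes at most one row.

```python
import random
from collections import Counter


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_value_counts(values, categories, epsilon: float, rng: random.Random) -> dict:
    """Return noisy counts for a fixed, publicly known list of categories.

    Assumes each individual contributes at most one value, so adding or
    removing one person changes exactly one count by 1 (sensitivity 1).
    """
    true_counts = Counter(values)
    scale = 1.0 / epsilon
    return {c: true_counts.get(c, 0) + laplace_noise(scale, rng) for c in categories}


# Toy column with hypothetical category names (not taken from the real dataset)
use_chip = ["Swipe"] * 60 + ["Chip"] * 30 + ["Online"] * 10
rng = random.Random(0)
noisy = dp_value_counts(use_chip, ["Swipe", "Chip", "Online"], epsilon=1.0, rng=rng)
for category, count in noisy.items():
    print(f"{category}: {count:.1f}")
```

A smaller `epsilon` means more noise and stronger protection; a larger one gives counts closer to the true values.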

In our example, the data scientist takes a very simple approach: flag every transaction whose amount exceeds the dataset mean by more than 1.6 standard deviations. The data scientist can retrieve only a synthetic version of the resulting list.

# Calculating the mean and the standard deviation for the transaction amount in the dataset
std = df_tran["amount"].std()
mean = df_tran["amount"].mean()

# Extracting the SQL view of interest
query = f"""
SELECT user_id, merchant_city, mcc
FROM aml_finance.private.aml_transactions_2m
WHERE amount > ({mean} + 1.6 * {std})
"""

suspicious_transactions = remote_dataset.sql(query).as_pandas()

# Viewing a synthetic version of the retrieved transactions
suspicious_transactions.head()
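
To make the threshold rule concrete outside of Sarus, here is a minimal standard-library sketch of the same "mean plus 1.6 standard deviations" filter. The amounts and user IDs are made-up toy values, and the code only illustrates the rule itself, not how Sarus evaluates it.

```python
import statistics

# Toy transactions (hypothetical values, not from the article's dataset)
transactions = [
    {"user_id": 1, "amount": 12.0},
    {"user_id": 2, "amount": 25.0},
    {"user_id": 3, "amount": 18.0},
    {"user_id": 4, "amount": 40.0},
    {"user_id": 5, "amount": 33.0},
    {"user_id": 6, "amount": 27.0},
    {"user_id": 7, "amount": 15.0},
    {"user_id": 8, "amount": 950.0},
]

amounts = [t["amount"] for t in transactions]
mean = statistics.mean(amounts)
# Sample standard deviation (n - 1 denominator), the same default as pandas' .std()
std = statistics.stdev(amounts)
threshold = mean + 1.6 * std

# Keep only the transactions whose amount exceeds the threshold
flagged = [t for t in transactions if t["amount"] > threshold]
print([t["user_id"] for t in flagged])  # only user_id 8 is above the threshold
```

The factor 1.6 controls how aggressive the rule is: lowering it flags more transactions for review, raising it flags fewer.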

Reporting suspicious transactions

Once the suspicious transactions are identified, it is time to report them to the compliance team. The compliance department needs specific individual information about each suspicious transaction: client ID, amount, type of transaction, etc.

Activate allows just that. The data scientist applies the new operation sarus.push_to_table() to the list of suspicious activities. Although Sarus ensures that the data scientist can view only a synthetic version of the list (see the code example above), their privacy policy can be configured to allow the entire workflow to run on the source data and to send the individual-level result to a specific database accessible only to the compliance team.

# Check which data connections are available for pushing the results
client.list_writable_dataconnections()

# Push the retrieved transactions to the table accessible only to the compliance department
sarus.push_to_table(
    suspicious_transactions,
    data_connection_name="push_activate",
    table="suspicious_schema.suspicious_transactions",
)

Output:
Whitelisted

No user-level data was ever shared with the data scientist, and with just one click the compliance department can access the list of suspicious transactions and all relevant data to take further action.
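
The separation of roles can be pictured with a small local stand-in. The sketch below uses an in-memory SQLite database in place of the compliance database; the `push_rows` helper, the table name, and the sample rows are all hypothetical and are not part of the Sarus API. The point is only the shape of the interaction: the caller gets a status back, while the rows themselves are readable solely through the destination connection.

```python
import sqlite3

# Hypothetical destination table; fixed here so it is never built from user input
TABLE = "suspicious_transactions"


def push_rows(conn: sqlite3.Connection, rows: list) -> str:
    """Write rows to the destination table and return only a status string.

    Stand-in for a push operation: the caller receives a confirmation,
    never the rows themselves.
    """
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {TABLE} "
        "(user_id INTEGER, merchant_city TEXT, mcc INTEGER)"
    )
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?)", rows)
    conn.commit()
    return f"pushed {len(rows)} rows"


# In-memory database standing in for the compliance team's database
compliance_db = sqlite3.connect(":memory:")

# Made-up rows with the same columns as the SQL query above
status = push_rows(compliance_db, [(101, "Paris", 5411), (102, "Lyon", 4829)])
print(status)

# Compliance side: read the individual-level rows from their own connection
received = compliance_db.execute(
    f"SELECT user_id, merchant_city, mcc FROM {TABLE} ORDER BY user_id"
).fetchall()
print(received)
```

In a real deployment the analyst would not hold the destination connection at all; here both sides share one process only to keep the example runnable.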

We used a very simple SQL query to identify suspicious transactions, but the same goal can be achieved with more complex data science workflows, for instance by training an ML model to predict whether a transaction should be reported.

Unlocking endless possibilities

Sarus Activate is a powerful tool that ensures data privacy while enabling actionable insights. Whether in healthcare, finance, or marketing, this feature allows organizations to leverage sensitive data responsibly and securely.

Explore the possibilities with Sarus Activate and transform how your organization handles private data with confidence and integrity. Stay tuned for more updates and innovative features designed to protect and empower your data science initiatives.

For more information on how Sarus Activate can benefit your organization or to schedule a demo, please contact us.

About the author

Marina Kozlova

Customer Success Manager
