By default, the Sarus application is configured so that all retrieved results are anonymous.
To reach this high standard, Sarus only allows statistical and aggregate information to be retrieved. However, privacy research and real-world cases show that individuals can be re-identified from such statistical and aggregate information, in particular when it is combined with auxiliary information.
Sarus offers an additional layer of data protection to mitigate the risk of re-identification: Differential Privacy.
Differential Privacy is a rigorous mathematical definition of privacy that quantifies risk and guarantees that no significant information related to a specific person can be distinguished in the query result. It irreversibly prevents re-identification of individuals by singling out, linkability, or inference, no matter what additional information an attacker may possess.
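To make the idea concrete, here is a minimal sketch of one textbook Differential Privacy technique, the Laplace mechanism, applied to a counting query. It is an illustration only and is not a description of how Sarus implements Differential Privacy. The function name `dp_count`, the `epsilon` parameter, and the sample records are all hypothetical. The sketch assumes that adding or removing one person changes the count by at most 1, so noise drawn from a Laplace distribution with scale 1/epsilon is enough to mask any single person's contribution.

```python
import random


def dp_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records matching `predicate`.

    Illustrative Laplace mechanism (hypothetical helper, not a Sarus API).
    A count has sensitivity 1, so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # The difference of two independent Exponential(rate=epsilon) draws
    # follows a Laplace distribution with mean 0 and scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


if __name__ == "__main__":
    # Made-up records, for illustration only.
    people = [{"age": age} for age in range(100)]
    noisy = dp_count(people, lambda p: p["age"] >= 50, epsilon=1.0)
    print(f"noisy count of people aged 50 or more: {noisy:.2f}")
```

A smaller `epsilon` adds more noise and gives stronger protection, while a larger `epsilon` gives more accurate results with weaker protection. This trade-off is what is meant by saying that Differential Privacy quantifies risk.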
Sarus also makes a fake (synthetic) dataset available for each data project. This synthetic dataset is also generated with Differential Privacy and is therefore completely anonymous.
Although the definition of “anonymous information” may vary depending on the legislation, using Differential Privacy guarantees that no information related to a specific person can be found in the query result. Therefore, such a result satisfies even the strictest definition of “anonymous”.
Note that the data owner can, as an exception, allow query results produced without Differential Privacy to be retrieved (see #2 above). In such a case, the data owner shall assess the corresponding risks and decide on a case-by-case basis. Sarus provides guidelines and training on the application configuration.