This interactive questionnaire helps you identify the key considerations in your challenge and explore how your answers to them lead to particular PETs that can be part of the solution.
1. What is the type of problem?
For the Machine Learning type of challenge, we consider problems where an actual model is involved, either for training or for evaluation. Other forms of statistical analysis are not considered here, as they are examined separately.
Set Intersection can be either a subproblem of any of the other problem types or a problem in its own right. An example of the latter is when organizations wish to match their datasets without planning to perform a specific analysis. In that case, only the Set Intersection route should be traversed. By contrast, when additional analysis is intended, the tree should be traversed twice: once for Set Intersection and once for that analysis (Machine Learning, Statistical Analysis or Synthetic Data Generation).
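To make the matching scenario concrete, the following is a minimal plaintext sketch of what a set intersection computes. It assumes both organizations hold records keyed by a shared identifier; the function name `match_records` and the sample identifiers are hypothetical. It applies no PET at all, so it only illustrates the result that a privacy-preserving approach would aim to produce without exposing the non-matching records.

```python
def match_records(ids_a, ids_b):
    """Return the identifiers that appear in both datasets, sorted."""
    # Plaintext intersection: both lists are visible to whoever runs this.
    return sorted(set(ids_a) & set(ids_b))


# Hypothetical identifiers held by two organizations.
hospital_ids = ["p-001", "p-002", "p-003", "p-004"]
insurer_ids = ["p-003", "p-004", "p-005"]

shared_ids = match_records(hospital_ids, insurer_ids)
print(shared_ids)
```

If further analysis is planned on the matched records, this step corresponds to the first of the two traversals of the tree.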
By Statistical Analysis, we refer to cases where one or more parties wish to compute a set of statistical metrics (e.g. counts, averages, standard deviations, quantiles, histograms, frequency plots) on their data and receive the results.
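As an illustration of the kind of metrics meant here, the sketch below computes a few of them with Python's standard library. The `ages` values and the `summary` layout are made up for the example, and the computation runs on plaintext data held by a single party, with no PET involved.

```python
import statistics
from collections import Counter

# Hypothetical plaintext values held by one party.
ages = [23, 35, 35, 41, 52, 60, 60, 60]

summary = {
    "count": len(ages),
    "mean": statistics.mean(ages),
    "stdev": statistics.stdev(ages),               # sample standard deviation
    "quartiles": statistics.quantiles(ages, n=4),  # three cut points
    "histogram": Counter(ages),                    # value -> frequency
}

for name, value in summary.items():
    print(name, value)
```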
Synthetic Data Generation refers to cases where one wishes to generate new data based on the distribution and characteristics of some other data, e.g. to create larger datasets for testing. We assume that the original data used to generate the synthetic data is sensitive. Otherwise, the data can be synthesized directly, with no need for a PET to protect the original data from potential reconstruction from the synthetic data.
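The sketch below shows the non-sensitive baseline case: fitting a simple distribution to the original data and sampling new values from it. The choice of a single Gaussian, the function name `synthesize` and the sample measurements are assumptions made for illustration only. No protection is applied, so this would not be appropriate for sensitive original data.

```python
import random
import statistics


def synthesize(original, n, seed=0):
    """Draw n synthetic values from a Gaussian fitted to the original data."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    mu = statistics.mean(original)
    sigma = statistics.stdev(original)   # needs at least two data points
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical non-sensitive measurements.
original = [4.8, 5.1, 5.0, 5.3, 4.9, 5.2]

# Generate a larger dataset for testing, as in the example above.
synthetic = synthesize(original, n=100)
print(len(synthetic), round(statistics.mean(synthetic), 2))
```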