Video: Respecting Privacy with Look-Alike Data Sets by Tim Garnsey

Presented at Data Science Sydney, April 2018.

Abstract: With companies like Cambridge Analytica in the news, people are understandably worried about how companies store and handle data about them. However, companies need sensitive and personally identifying data to run their services, and they may need to process it across many different systems and environments. This talk describes how to build “look-alike” data sets that share many of the statistical properties of the source data sets they are generated from but no longer contain sensitive data. Using such synthetically generated look-alikes in many development and testing environments lets the true source data be kept more securely in fewer locations.
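To give a rough feel for the idea, here is a minimal sketch in Python. It is not the method presented in the talk, which the abstract does not specify. It assumes the simplest possible approach: summarise each column independently (mean and standard deviation for numeric columns, value frequencies for categorical ones) and sample new rows from those summaries. The function names `fit_column`, `sample_column` and `make_look_alike`, and the toy `source` rows, are hypothetical and exist only for illustration.

```python
import random
import statistics
from collections import Counter


def fit_column(values):
    """Summarise one column: mean/stdev for numbers, frequencies otherwise."""
    if all(isinstance(v, (int, float)) for v in values):
        return ("numeric", statistics.mean(values), statistics.stdev(values))
    counts = Counter(values)
    return ("categorical", list(counts.keys()), list(counts.values()))


def sample_column(model, n, rng):
    """Draw n synthetic values from a fitted column summary."""
    if model[0] == "numeric":
        _, mean, stdev = model
        # Assumption: a normal distribution is a good enough fit.
        return [rng.gauss(mean, stdev) for _ in range(n)]
    _, categories, weights = model
    return rng.choices(categories, weights=weights, k=n)


def make_look_alike(rows, n, seed=0):
    """Build n synthetic rows whose per-column statistics resemble `rows`."""
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    models = {c: fit_column([r[c] for r in rows]) for c in columns}
    samples = {c: sample_column(models[c], n, rng) for c in columns}
    return [{c: samples[c][i] for c in columns} for i in range(n)]


# Toy source data (hypothetical); real source rows would stay locked away.
source = [
    {"age": 34, "plan": "basic"},
    {"age": 45, "plan": "premium"},
    {"age": 29, "plan": "basic"},
    {"age": 52, "plan": "premium"},
    {"age": 41, "plan": "basic"},
    {"age": 38, "plan": "basic"},
]

look_alike = make_look_alike(source, n=1000)
```

This sketch only preserves per-column statistics: correlations between columns are lost, and categorical values are copied from the source, so it would not be safe for columns such as names or email addresses. A practical look-alike generator would need to handle both of those points.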
