Introduction
Evaluating the performance of database systems is crucial when vendors or researchers are developing new technologies. But such evaluation tasks rely heavily on actual data and query workloads, which are often unavailable to researchers due to privacy restrictions. To overcome this barrier, we propose a framework for the release of a synthetic database that accurately models selected performance properties of the original database. We improve on prior work on synthetic database generation by providing a formal, rigorous guarantee of privacy. Accuracy is achieved by generating synthetic data from a carefully selected set of statistical properties of the original data, chosen to balance privacy loss against relevance to the given query workload. An important contribution of our framework is an extension of standard differential privacy to multiple tables.
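To make the idea of generating synthetic data from privately released statistics concrete, here is a minimal sketch for one numeric column of a single table. It is not the framework described above: it assumes the released statistic is a histogram over fixed, data-independent bin edges, perturbed with the standard Laplace mechanism, and it does not address workload-aware selection of statistics or the multi-table extension. The function names `noisy_histogram` and `sample_synthetic`, the bin edges, and the value of epsilon are all hypothetical choices made for illustration.

```python
import numpy as np


def noisy_histogram(values, edges, epsilon, rng):
    """Release bin counts under epsilon-differential privacy (Laplace mechanism).

    Assumption: `edges` is fixed in advance and does not depend on the data.
    """
    counts, _ = np.histogram(values, bins=edges)
    # Adding or removing one row changes exactly one bin count by 1,
    # so the L1 sensitivity of the histogram is 1 and the noise scale is 1/epsilon.
    noisy = counts + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=counts.shape)
    # Clipping is post-processing, so it does not weaken the privacy guarantee.
    return np.clip(noisy, 0.0, None)


def sample_synthetic(noisy_counts, edges, n, rng):
    """Draw n synthetic values using only the noisy counts, never the raw data."""
    total = noisy_counts.sum()
    if total <= 0:
        # Degenerate case: all counts were clipped to zero, fall back to uniform bins.
        probs = np.full(len(noisy_counts), 1.0 / len(noisy_counts))
    else:
        probs = noisy_counts / total
    bin_idx = rng.choice(len(noisy_counts), size=n, p=probs)
    # Place each synthetic value uniformly inside its chosen bin.
    return rng.uniform(edges[bin_idx], edges[bin_idx + 1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    edges = np.linspace(0.0, 100.0, 11)                     # hypothetical public domain [0, 100]
    original = rng.normal(50.0, 15.0, 5000).clip(0.0, 100.0)  # stand-in for a private column
    noisy = noisy_histogram(original, edges, epsilon=1.0, rng=rng)
    synthetic = sample_synthetic(noisy, edges, n=5000, rng=rng)
    print(np.histogram(synthetic, bins=edges)[0])
```

In this sketch a smaller epsilon adds more noise to every released count. This is one simple instance of the trade-off between privacy loss and accuracy that motivates selecting only the statistics most relevant to the query workload.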