Abstract
Data profiling is an important task for understanding data semantics and an essential pre-processing step in many tools. Due to privacy constraints, data is often partitioned into silos with different access controls. Discovering functional dependencies (FDs) usually requires access to all data partitions to find constraints that hold on the whole dataset. Simply applying general secure multi-party computation protocols incurs high computation and communication costs. This paper formulates the FD discovery problem in the secure multi-party scenario. We propose secure constructions for validating candidate FDs and present efficient cryptographic protocols to discover FDs over distributed partitions. Experimental results show that our solution is practically efficient compared with non-secure distributed FD discovery and can significantly outperform general-purpose multi-party computation frameworks. To the best of our knowledge, our work is the first to tackle this problem.
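To illustrate why FD validation needs a view of every partition, the following is a minimal, non-secure sketch in Python. It is not the paper's protocol and involves no cryptography; the helper `fd_holds`, the attribute names `zip` and `city`, and the two toy silos are hypothetical, chosen only to show that a candidate FD can hold inside each silo yet fail on their union.

```python
from typing import Hashable, Iterable, Sequence

# A row maps attribute names to hashable values (hypothetical toy schema).
Row = dict[str, Hashable]


def fd_holds(rows: Iterable[Row], lhs: Sequence[str], rhs: str) -> bool:
    """Return True if the candidate FD lhs -> rhs holds on the given rows.

    Plain in-memory check: every distinct lhs value combination must map
    to exactly one rhs value. No privacy protection is applied here.
    """
    seen: dict[tuple, Hashable] = {}
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        if key in seen and seen[key] != row[rhs]:
            return False  # same lhs values, different rhs value
        seen.setdefault(key, row[rhs])
    return True


# Two silos held by different parties (made-up data for illustration).
silo_a = [
    {"zip": "10001", "city": "New York"},
    {"zip": "60601", "city": "Chicago"},
]
silo_b = [
    {"zip": "10001", "city": "NYC"},
    {"zip": "94105", "city": "San Francisco"},
]

holds_a = fd_holds(silo_a, ["zip"], "city")
holds_b = fd_holds(silo_b, ["zip"], "city")
holds_union = fd_holds(silo_a + silo_b, ["zip"], "city")
```

In this toy data, `zip -> city` is consistent within each silo, but the two silos disagree on the city for the same zip code, so the check on the combined rows fails. Pooling the rows as done here is exactly what privacy constraints rule out, which is the gap the secure setting described in the abstract addresses.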
Original language | English (US) |
---|---|
Pages (from-to) | 184-196 |
Number of pages | 13 |
Journal | Proceedings of the VLDB Endowment |
Volume | 13 |
Issue number | 2 |
State | Published - 2020 |
Externally published | Yes |
Event | 46th International Conference on Very Large Data Bases, VLDB 2020 - Virtual, Japan (Duration: Aug 31 2020 → Sep 4 2020) |
Bibliographical note
Publisher Copyright: © VLDB Endowment.