Detectability in Diversity: Improved Canary Crafting for Privacy Auditing in One Run
Mathieu Dagréou, Aurélien Bellet
Key claim
New canary crafting method improves privacy auditing efficiency.
This paper introduces IBIS, an efficient method for crafting canaries for one-run privacy auditing. It combines influence functions with bilevel optimization to produce more detectable canaries, and therefore stronger privacy leakage estimates, at lower computational cost than previous canary crafting methods.
In plain English
The paper proposes a new method for crafting canaries that improves privacy auditing efficiency, extending existing methods.
The experimental results demonstrate stronger privacy leakage estimates with solid methodological support.
Deep reliability assessment
The methodology supports the claim that IBIS can produce more detectable one-run auditing canaries at much lower compute cost in the tested CIFAR-10/WRN16-4 and ResNet9-style settings. Broader claims about reliably tightening DP lower bounds across models, datasets, privacy regimes, or production-scale systems are less supported, especially because several DP epsilon estimates have high variance.
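To make the link between canary detectability and the reported epsilon lower bounds concrete, here is a minimal sketch of how a one-run audit can turn membership guesses into a lower bound. It assumes the pure-DP case (delta = 0), where the number of correct guesses is stochastically dominated by a Binomial(r, e^eps / (1 + e^eps)) variable over r guesses. The function names and the default confidence parameter `beta` are illustrative assumptions, and the paper's actual estimator may differ.

```python
import math


def binom_tail(n: int, k: int, eps: float) -> float:
    """P[Binomial(n, p) >= k] with p = e^eps / (1 + e^eps), summed in log space."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    # Numerically stable logs of p and 1 - p.
    log_p = -math.log1p(math.exp(-eps))
    log_q = -math.log1p(math.exp(eps))
    total = 0.0
    for i in range(k, n + 1):
        log_comb = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        total += math.exp(log_comb + i * log_p + (n - i) * log_q)
    return min(total, 1.0)


def eps_lower_bound(correct: int, guesses: int, beta: float = 0.05) -> float:
    """Largest eps (up to bisection tolerance) rejected at confidence 1 - beta.

    Assumption: pure eps-DP, so observing `correct` or more correct guesses
    has probability at most binom_tail(guesses, correct, eps).
    """
    if binom_tail(guesses, correct, 0.0) >= beta:
        return 0.0  # the guesses are consistent with no leakage at all
    lo, hi = 0.0, 50.0  # the tail probability is increasing in eps
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_tail(guesses, correct, mid) < beta:
            lo = mid  # eps = mid is still rejected
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    for correct in (50, 70, 90, 100):
        print(correct, round(eps_lower_bound(correct, 100), 3))
```

Under this simplification, more detectable canaries yield more correct guesses and hence a larger bound, while the bound stays sensitive to a few guesses when the number of canaries is small, which is one plausible source of the variance noted above.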
Reproducibility
Yes. The paper says code is attached to the submission, documented, and will be made public upon acceptance; experiments use standard datasets such as CIFAR-10, with details provided in Section D.
Discussion questions
1. The core assumption is that canary interference is primarily captured by representation-space similarity or cross-influence; when might this proxy fail, especially for highly non-linear or foundation-model training dynamics?
2. For builders auditing real systems, does optimizing artificial canaries reveal meaningful privacy risk for natural user data, or mainly a worst-case stress test of memorization?
3. What empirical result would falsify the paper’s thesis: for example, if diverse high-self-influence canaries still underperform random flipped-label canaries on larger architectures, different datasets, or stronger DP-SGD settings?
Key figure
Figure 1 shows that influence-based preselection improves over random canary selection and gives IBIS a stronger initialization, with regularization helping most clearly in the non-private setting.
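As a rough illustration of what score-based preselection with a diversity constraint could look like, the sketch below greedily keeps the highest-scoring candidates while skipping any candidate whose representation is too similar to one already kept. This is a hypothetical example, not the paper's algorithm: the function `preselect`, the cosine-similarity threshold `max_sim`, and the assumption that self-influence scores and feature vectors are precomputed are all introduced here for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def preselect(scores, features, k, max_sim=0.9):
    """Greedily pick up to k candidate indices.

    scores:   hypothetical precomputed per-candidate scores (e.g. self-influence)
    features: hypothetical per-candidate representation vectors
    max_sim:  a candidate is skipped if its cosine similarity to any
              already-chosen candidate exceeds this threshold
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = []
    for i in order:
        if len(chosen) == k:
            break
        if all(cosine(features[i], features[j]) <= max_sim for j in chosen):
            chosen.append(i)
    return chosen


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.1]
    features = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [1.0, 1.0]]
    print(preselect(scores, features, k=2))
```

Setting `max_sim=1.0` effectively disables the diversity filter and reduces the procedure to plain top-k selection by score, which makes it easy to compare the two behaviors on the same candidates.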