Notes

Introduction

  • Federated learning was initially presented as a fully privacy-preserving decentralized learning paradigm, since raw data is never shared between the participants and the server.

 

  • Soon after its proposal, attacks designed to extract information from the transmitted local updates were developed, the most notable being Deep Leakage from Gradients1, which, however, only works efficiently on small-sized inputs.

  • The authors propose a new type of attack, Generative Gradient Leakage, which leverages external data on the server side to train a generative model used to regularize a gradient inversion attack.

 

  • They show promising results when attacking federated frameworks trained on ImageNet and CelebA, reconstructing large images from a single gradient.

Overview

 

Threat model

  • Honest-but-curious server: it behaves normally, but tries its best to infer private information from what it can access, namely the models and the (possibly degraded) local updates.
  • Participants apply a variety of local defence mechanisms,
  • The server has access to external data from the same distribution as the participants' data.

Problem definition

Notations

  • \(x\in\mathbb{R}^{h\times w}\) an image with associated class label \(c\),
  • \(F(x)=\nabla_{\theta}\mathcal{L}(f_{\theta}(x),c)\) the gradient of \(x\) with respect to parameters \(\theta\in\mathbb{R}^p\) of a network \(f_{\theta}\),
  • \(y=\mathcal{T}(F(x))\) a gradient degraded by the defence transformation \(\mathcal{T}\),
  • \(\omega: \mathbb{R}^{h\times w}\rightarrow \mathbb{R}\) an image prior defining the likelihood of an image,
  • \(\mathcal{D}: \mathbb{R}^p\times\mathbb{R}^p\rightarrow \mathbb{R}\) a distance metric between gradients,
  • \(G: \mathbb{R}^k\rightarrow \mathbb{R}^{h\times w}\) a generative network trained on external data server-side,
  • \(\mathcal{R}(G;z)\) a regularization term, penalizing latent vector \(z\) when deviating from the prior distribution.

Previous formulation1

\[x^* = \underset{x\in\mathbb{R}^{h\times w}}{argmin\ } \underbrace{\mathcal{D}(y, F(x))}_\text{gradient matching loss} + \underbrace{\lambda \omega(x)}_\text{regularization}\]

Novel formulation

\[z^* = \underset{z\in\mathbb{R}^{k}}{argmin\ } \underbrace{\mathcal{D}(y, \mathcal{T}(F(G(z))))}_\text{gradient matching loss} + \underbrace{\lambda \mathcal{R}(G; z)}_\text{regularization}\]
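
A minimal PyTorch sketch of this objective (all names are hypothetical assumptions): `G` is the pretrained generator, `f` the attacked model, `y_flat` the received gradient flattened into one vector, `label` the inferred class index as a tensor, `transform` the estimated defence transformation, and `distance`/`regularizer` any of the choices described in the Method section below.

```python
import torch

def ggl_objective(z, G, f, y_flat, label, transform, distance, regularizer, lam):
    """Evaluate D(y, T(F(G(z)))) + lambda * R(G; z) for a candidate latent vector z."""
    x = G(z)                                               # candidate image from the generator
    loss = torch.nn.functional.cross_entropy(f(x), label)  # local training loss with the inferred label
    grads = torch.autograd.grad(loss, list(f.parameters()), create_graph=True)
    grads = transform(grads)                               # estimated defence transformation T
    grads_flat = torch.cat([g.flatten() for g in grads])
    return distance(y_flat, grads_flat) + lam * regularizer(z)
```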

Method

Analytical label inference

For networks with a fully connected classification head using sigmoid or ReLU activation functions, trained with cross-entropy loss on one-hot labels, the label can be inferred as the index of the negative entry of the gradient with respect to this layer's weights.
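
A minimal sketch of this inference for a single-image gradient, assuming access to the last fully connected layer's weight gradient:

```python
import torch

def infer_label(fc_weight_grad: torch.Tensor) -> int:
    """Infer the label from the last FC layer's weight gradient (shape [num_classes, feature_dim]).

    With non-negative features (ReLU/sigmoid) and cross-entropy on a one-hot label,
    only the row of the ground-truth class has negative entries.
    """
    row_sums = fc_weight_grad.sum(dim=1)        # one value per class
    return int(torch.argmin(row_sums).item())   # the row with negative entries
```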

Gradient Transformation Estimations

Gradient Clipping: Given a perturbation \(\mathcal{T}(w, S) = w/\text{max}(1,\frac{||w||_2}{S})\), estimate the clipping bound \(S\) as the \(l_2\) norm of the received gradient at each layer,

Gradient Sparsification: Given a perturbation \(\mathcal{T}(w,p) = w \odot \mathcal{M}\), with \(\mathcal{M}\) a mask with pruning rate \(p\), estimate \(\mathcal{M}\) from the zero entries of the received gradient.
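
A sketch of both estimations, assuming the received gradient is given as a list of per-layer tensors (names hypothetical):

```python
import torch

def estimate_clipping_bound(received_grads):
    """Estimate the clipping bound S as the l2 norm of each received layer gradient."""
    return [g.norm(p=2) for g in received_grads]

def estimate_sparsification_mask(received_grads):
    """Estimate the pruning mask M from the zero entries of the received gradient."""
    return [(g != 0).float() for g in received_grads]
```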

Gradient Matching Losses

Squared \(l_2\) norm

\[\mathcal{D}(y, \tilde{y}) = ||y - \tilde{y}||^2_2\]

Cosine similarity

\[\mathcal{D}(y, \tilde{y}) = 1 - \frac{\langle y,\tilde{y}\rangle}{||y||_2||\tilde{y}||_2}\]
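
Both distances, sketched for gradients flattened into single vectors:

```python
import torch

def squared_l2(y, y_hat):
    """Squared l2 distance between (flattened) gradients."""
    return ((y - y_hat) ** 2).sum()

def cosine_distance(y, y_hat, eps=1e-8):
    """One minus the cosine similarity between (flattened) gradients."""
    return 1 - torch.dot(y, y_hat) / (y.norm(p=2) * y_hat.norm(p=2) + eps)
```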

Regularization Terms

KL-based regularization (with \(\mu_i\) and \(\sigma_i\) the element-wise mean and standard deviation)

\[\mathcal{R}(G;z)=-\frac{1}{2}\sum_{i=1}^k(1+\text{log}(\sigma_i^2)-\mu_i^2-\sigma_i^2)\]

Norm-based regularization

\[\mathcal{R}(G;z) = (||z||^2_2-k)^2\]
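
A sketch of both terms; `mu` and `sigma` stand for the element-wise statistics mentioned above, and the latent dimension \(k\) is taken as the size of \(z\):

```python
import torch

def kld_regularization(mu, sigma):
    """KL divergence from the standard normal prior: -1/2 * sum(1 + log(sigma^2) - mu^2 - sigma^2)."""
    return -0.5 * torch.sum(1 + torch.log(sigma ** 2) - mu ** 2 - sigma ** 2)

def norm_regularization(z):
    """Penalize latent vectors whose squared norm deviates from its expected value k."""
    k = z.numel()
    return (z.pow(2).sum() - k) ** 2
```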

Optimization Strategies

Gradient-based

  • Adam (see the sketch after this list)

Gradient-free

  • Bayesian optimization with trust regions (TuRBO)
  • Covariance Matrix Adaptation Evolution Strategy (CMA-ES)
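
A minimal sketch of the gradient-based option, optimizing the latent vector with Adam over a generic `objective(z)` callable (e.g. the GGL objective sketched in the problem definition); the learning rate and latent dimension are illustrative assumptions:

```python
import torch

def attack_with_adam(objective, latent_dim=128, steps=2500, lr=0.1):
    """Minimize objective(z) over the latent space with Adam (gradient-based option)."""
    z = torch.randn(1, latent_dim, requires_grad=True)  # random latent initialization
    optimizer = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = objective(z)      # e.g. the GGL objective sketched earlier
        loss.backward()          # gradients flow through G and f back to z
        optimizer.step()
    return z.detach()
```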

Experiments

Federated learning and attack setup

Datasets

  • Binary Gender Classification with CelebA, 32×32,
  • 1000-class Image Classification with ImageNet ILSVRC 2012, 224×224.

Models

  • Federated model: ResNet-18,
  • Generative model: DCGAN for CelebA, BigGAN for ImageNet.

Data split

  • GAN trained on training sets of CelebA (162k) and ImageNet,
  • FL task performed on the validation sets.

Evaluation metrics

  • Mean Square Error - Image Space (MSE-I),
  • Peak Signal-to-Noise Ratio (PSNR) (see the sketch after this list),
  • Learned Perceptual Image Patch Similarity (LPIPS) (measured by a VGG network),
  • Mean Square Error - Representation Space (MSE-R) (the feature vector before the final classification layer)
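
A short sketch of the two pixel-space metrics, assuming images scaled to \([0, 1]\); LPIPS and MSE-R additionally require a trained network:

```python
import torch

def mse_i(x, x_rec):
    """MSE in image space between the original and reconstructed images."""
    return torch.mean((x - x_rec) ** 2)

def psnr(x, x_rec, max_val=1.0):
    """PSNR in dB, assuming pixel values in [0, max_val]."""
    return 10 * torch.log10(max_val ** 2 / mse_i(x, x_rec))
```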

Defence schemes

  • Additive Noise: add Gaussian noise \(\epsilon \sim \mathcal{N}(0,\sigma^2I)\) to the gradients, with \(\sigma = 0.1\) (the first three defences are sketched after this list),
  • Gradient Clipping: clip the gradients to an \(l_2\) norm bound of \(S = 4\),
  • Gradient Sparsification: perform magnitude-based pruning on the gradients to reach 90% sparsity,
  • Soteria: gradients are generated from the perturbed representation, with a pruning rate of 80%.
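
A sketch of the first three defences as per-layer gradient transformations (Soteria, which perturbs the intermediate representation, is omitted):

```python
import torch

def additive_noise(grad, sigma=0.1):
    """Add Gaussian noise N(0, sigma^2 I) to a layer gradient."""
    return grad + sigma * torch.randn_like(grad)

def gradient_clipping(grad, S=4.0):
    """Clip the gradient to an l2 norm of at most S."""
    return grad / max(1.0, grad.norm(p=2).item() / S)

def gradient_sparsification(grad, sparsity=0.9):
    """Magnitude-based pruning: zero out the smallest entries until the target sparsity is reached."""
    k = int(sparsity * grad.numel())
    if k == 0:
        return grad
    threshold = grad.abs().flatten().kthvalue(k).values
    return grad * (grad.abs() > threshold).float()
```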

Results

Losses and regularizations

  • The squared \(l_2\) norm with KLD regularization is used in the subsequent experiments.

Optimizers

  • The number of updates is set to 2500, 1000, and 800 for Adam, BO, and CMA-ES, respectively.
  • Adam works better for CelebA (aligned images), but struggles on ImageNet, being highly dependent on the initialization.
  • CMA-ES works better for ImageNet, finding good latent vectors in the high-dimensional latent space of the trained GAN.

Comparison to SOTA and effect of defences

  • GGL clearly outperforms previous reconstruction attacks on all metrics and under most defence mechanisms,
  • It is limited by the trained GAN: it requires a large amount of public or “leaked” data, and it hard-constrains the reconstructed image to be generatable by the GAN.

Conclusion

  • The authors show promising results attacking federated frameworks trained on ImageNet and CelebA, reconstructing large images from a single gradient.

  • This work must be seen as a valuable proof of concept: the assumptions are extremely strong, limiting its applicability in real-world contexts, but the idea is quite appealing and opens a whole new spectrum of gradient inversion techniques.

References