Publications

Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling

Published in arXiv, 2023

Differential privacy guarantees allow the results of a statistical analysis involving sensitive data to be released without compromising the privacy of any individual taking part. Achieving such guarantees generally requires the injection of noise, either directly into parameter estimates or into the estimation process. Instead of artificially introducing perturbations, sampling from Bayesian posterior distributions has been shown to be a special case of the exponential mechanism, producing consistent and efficient private estimates without altering the data generative process. The application of current approaches has, however, been limited by their strong bounding assumptions, which do not hold even for basic models such as simple linear regressors. To ameliorate this, we propose $\beta$D-Bayes, a posterior sampling scheme from a generalised posterior targeting the minimisation of the $\beta$-divergence between the model and the data generating process. This provides private estimation that is generally applicable without requiring changes to the underlying model, and consistently learns the data generating parameter. We show that $\beta$D-Bayes produces more precise estimates for the same privacy guarantees, and further facilitates differentially private estimation via posterior sampling for complex classifiers and continuous regression models such as neural networks for the first time.
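The "one posterior sample" idea behind this line of work can be illustrated with a toy conjugate model: the released estimate is a single draw from the posterior, so the randomness of the draw itself supplies the privacy noise. This is a minimal sketch, not the paper's $\beta$D-Bayes scheme; the data and prior parameters are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sensitive data: binary outcomes for 100 individuals.
data = rng.binomial(1, 0.3, size=100)

# Standard conjugate update: Beta(a0, b0) prior on the Bernoulli parameter.
a0, b0 = 1.0, 1.0
a_post = a0 + data.sum()
b_post = b0 + len(data) - data.sum()

# One Posterior Sampling: release a *single* posterior draw as the private
# estimate, rather than the posterior mean plus artificial noise.
private_estimate = rng.beta(a_post, b_post)
print(private_estimate)
```

Under the bounding assumptions discussed above, releasing one such sample is an instance of the exponential mechanism; the paper's contribution is replacing the log-likelihood with a $\beta$-divergence loss so those assumptions are no longer needed.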

Recommended citation: Jewson, J., Ghalebikesabi, S., & Holmes, C. (2023). Differentially Private Statistical Inference through $\beta$-Divergence One Posterior Sampling. arXiv preprint arXiv:2307.05194. https://arxiv.org/abs/2307.05194

A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods

Published in arXiv, 2023

We establish the first mathematically rigorous link between Bayesian, variational Bayesian, and ensemble methods. A key step towards this is to reformulate the non-convex optimisation problem typically encountered in deep learning as a convex optimisation in the space of probability measures. On a technical level, our contribution amounts to studying generalised variational inference through the lens of Wasserstein gradient flows. The result is a unified theory of various seemingly disconnected approaches that are commonly used for uncertainty quantification in deep learning, including deep ensembles and (variational) Bayesian methods. This offers a fresh perspective on the reasons behind the success of deep ensembles over procedures based on parameterised variational inference, and allows the derivation of new ensembling schemes with convergence guarantees. We showcase this by proposing a family of interacting deep ensembles with direct parallels to the interactions of particle systems in thermodynamics, and use our theory to prove the convergence of these algorithms to a well-defined global minimiser on the space of probability measures.

Recommended citation: Wild, V. D., Ghalebikesabi, S., Sejdinovic, D., & Knoblauch, J. (2023). "A Rigorous Link between Deep Ensembles and (Variational) Bayesian Methods." arXiv preprint arXiv:2305.15027. https://arxiv.org/abs/2305.15027

Differentially Private Diffusion Models Generate Useful Synthetic Images (Best Student Paper Award at FL’IJCAI 2023)

Published in arXiv, 2023

The ability to generate privacy-preserving synthetic versions of sensitive image datasets could unlock numerous ML applications currently constrained by data availability. Due to their astonishing image generation quality, diffusion models are a prime candidate for generating high-quality synthetic data. However, recent studies have found that, by default, the outputs of some diffusion models do not preserve training data privacy. By privately fine-tuning ImageNet pre-trained diffusion models with more than 80M parameters, we obtain SOTA results on CIFAR-10 and Camelyon17 in terms of both FID and the accuracy of downstream classifiers trained on synthetic data. We decrease the SOTA FID on CIFAR-10 from 26.2 to 9.8, and increase the accuracy from 51.0% to 88.0%. On synthetic data from Camelyon17, we achieve a downstream accuracy of 91.1% which is close to the SOTA of 96.5% when training on the real data. We leverage the ability of generative models to create infinite amounts of data to maximise the downstream prediction performance, and further show how to use synthetic data for hyperparameter tuning. Our results demonstrate that diffusion models fine-tuned with differential privacy can produce useful and provably private synthetic data, even in applications with significant distribution shift between the pre-training and fine-tuning distributions.

Recommended citation: Ghalebikesabi, S., Berrada, L., Gowal, S., Ktena, I., Stanforth, R., Hayes, J., ... & Balle, B. (2023). "Differentially Private Diffusion Models Generate Useful Synthetic Images." arXiv preprint arXiv:2302.13861. https://arxiv.org/abs/2302.13861

Bias Mitigated Learning from Differentially Private Synthetic Data: A Cautionary Tale

Published in Uncertainty in Artificial Intelligence 2022 (Oral), 2021

Increasing interest in privacy-preserving machine learning has led to new models for synthetic private data generation from undisclosed real data. However, mechanisms of privacy preservation introduce artifacts in the resulting synthetic data that have a significant impact on downstream tasks such as learning predictive models or inference. In particular, bias can affect all analyses as the synthetic data distribution is an inconsistent estimate of the real-data distribution. We propose several bias mitigation strategies using privatized likelihood ratios that have general applicability to differentially private synthetic data generative models. Through large-scale empirical evaluation, we show that bias mitigation provides simple and effective privacy-compliant augmentation for general applications of synthetic data. However, the work highlights that even after bias correction significant challenges remain on the usefulness of synthetic private data generators for tasks such as prediction and inference.

Recommended citation: S. Ghalebikesabi, H. Wilde, J. Jewson, S. Vollmer, A. Doucet, C. Holmes (2021). "Bias Mitigated Learning from Differentially Private Synthetic Data: A Cautionary Tale." arXiv preprint arXiv:2108.10934. https://arxiv.org/pdf/2108.10934.pdf

On Locality of Local Explanation Models

Published in NeurIPS 2021, 2021

Shapley values provide model-agnostic feature attributions for a model's outcome at a particular instance by simulating feature absence under a global population distribution. The use of a global population can lead to potentially misleading results when local model behaviour is of interest. Hence we consider the formulation of neighbourhood reference distributions that improve the local interpretability of Shapley values. By doing so, we find that the Nadaraya-Watson estimator, a well-studied kernel regressor, can be expressed as a self-normalised importance sampling estimator. Empirically, we observe that Neighbourhood Shapley values identify meaningful sparse feature relevance attributions that provide insight into local model behaviour, complementing conventional Shapley analysis. They also increase on-manifold explainability and robustness to the construction of adversarial classifiers.
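The observation that the Nadaraya-Watson estimator has the form of a self-normalised importance sampling estimator can be seen directly from its definition: the kernel evaluations act as unnormalised importance weights, and dividing by their sum gives the self-normalised estimate of E[y | x]. A minimal sketch with a Gaussian kernel (toy data and the bandwidth value are illustrative, not taken from the paper):

```python
import numpy as np

def nadaraya_watson(x_query, x_train, y_train, bandwidth=0.5):
    """Kernel regression estimate of E[y | x] at x_query.

    The kernel weights w_i play the role of unnormalised importance
    weights; normalising by their sum yields the self-normalised
    importance sampling form sum(w_i * y_i) / sum(w_i).
    """
    w = np.exp(-0.5 * ((x_train - x_query) / bandwidth) ** 2)
    return np.sum(w * y_train) / np.sum(w)

# Toy data: y = x^2 plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-2, 2, size=500)
y = x ** 2 + rng.normal(0, 0.1, size=500)

print(nadaraya_watson(0.0, x, y))
```

Shrinking the bandwidth concentrates the reference distribution on the neighbourhood of the query point, which is the sense in which neighbourhood-based weighting localises the resulting explanations.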

Recommended citation: S. Ghalebikesabi, L. Ter-Minassian, K. Díaz-Ordaz, C. Holmes (2021). "On Locality of Local Explanation Models." 35th Conference on Neural Information Processing Systems (NeurIPS 2021). https://arxiv.org/pdf/2106.14648.pdf

Identification of Underlying Disease Domains by Longitudinal Latent Factor Analysis for Secukinumab Treated Patients in Psoriatic Arthritis and Rheumatoid Arthritis Trials

Published in ACR Convergence 2021, 2021

Secukinumab is a fully human monoclonal antibody approved for the treatment of several related autoinflammatory diseases, including psoriasis, psoriatic arthritis (PsA) and ankylosing spondylitis. While a single clinical endpoint may be chosen to evaluate treatment effect, the natural extension is to capture a clinical trial's entire longitudinal response profile, made up of multifaceted signs and symptoms. The objective of this analysis is to characterize disease progression and treatment response to secukinumab across a wide range of clinical variables, thereby complementing traditional analyses of standard endpoints in PsA and rheumatoid arthritis (RA).

Recommended citation: Zhu X, Falck F, Ghalebikesabi S, Kormaksson M, Vandemeulebroecke M, Zhang C, Santos L, Hei Kwok C, West D, Mallon A, Martin R, Readie A, Gandhi K, Ligozio G, Nicholson G. (2021). "Identification of Underlying Disease Domains by Longitudinal Latent Factor Analysis for Secukinumab Treated Patients in Psoriatic Arthritis and Rheumatoid Arthritis Trials." ACR Convergence 2021. https://acrabstracts.org/abstract/identification-of-underlying-disease-domains-by-longitudinal-latent-factor-analysis-for-secukinumab-treated-patients-in-psoriatic-arthritis-and-rheumatoid-arthritis-trials/

Quasi-Bayesian nonparametric density estimation via autoregressive predictive updates

Published in Uncertainty in Artificial Intelligence 2023 (Spotlight), 2021

Bayesian methods are a popular choice for statistical inference in small-data regimes due to the regularization effect induced by the prior, which serves to counteract overfitting. In the context of density estimation, the standard Bayesian approach is to target the posterior predictive. In general, direct estimation of the posterior predictive is intractable and so methods typically resort to approximating the posterior distribution as an intermediate step. The recent development of recursive predictive copula updates, however, has made it possible to perform tractable predictive density estimation without the need for posterior approximation. Although these estimators are computationally appealing, they tend to struggle on non-smooth data distributions. This is largely due to the comparatively restrictive form of the likelihood models from which the proposed copula updates were derived. To address this shortcoming, we consider a Bayesian nonparametric model with an autoregressive likelihood decomposition and Gaussian process prior, which yields a data-dependent bandwidth parameter in the copula update. Further, we formulate a novel parameterization of the bandwidth using an autoregressive neural network that maps the data into a latent space, and is thus able to capture more complex dependencies in the data. Our extensions increase the modelling capacity of existing recursive Bayesian density estimators, achieving state-of-the-art results on tabular data sets.

Recommended citation: Ghalebikesabi, S., Holmes, C., Fong, E., & Lehmann, B. (2022). Quasi-Bayesian nonparametric density estimation via autoregressive predictive updates. Uncertainty in Artificial Intelligence (pp. 658-668). PMLR. https://arxiv.org/abs/2206.06462

Deep Generative Pattern-Set Mixture Models for Nonignorable Missingness Imputation

Published in Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS), 2021

We propose a variational autoencoder architecture to model both ignorable and nonignorable missing data using pattern-set mixtures as proposed by Little (1993). Our model explicitly learns to cluster the missing data into missingness pattern sets based on the observed data and missingness masks. Underpinning our approach is the assumption that the data distribution under missingness is probabilistically semi-supervised by samples from the observed data distribution. Our setup trades off the characteristics of ignorable and nonignorable missingness and can thus be applied to data of both types. We evaluate our method on a wide range of data sets with different types of missingness and achieve state-of-the-art imputation performance. Our model outperforms many common imputation algorithms, especially when the amount of missing data is high and the missingness mechanism is non-ignorable.

Recommended citation: S. Ghalebikesabi, R. Cornish, L. Kelly, C. Holmes (2021). "Deep Generative Pattern-Set Mixture Models." Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS). http://proceedings.mlr.press/v130/ghalebikesabi21a/ghalebikesabi21a.pdf