Di Wang
PHD candidate
Department of Computer Science and Engineering
State University of New York at Buffalo
Email :
dwang45 "at" buffalo.edu,
shao3wangdi "at" berkeley.edu
Google Scholar LinkedIn
I am a fourth year PHD student in the
Department of Computer Science and Engineering at
The State University of New York (SUNY) at Buffalo under supervision of
Dr. Jinhui Xu . Before that I got
my Master degree in Mathematics at
University of Western Ontario in 2015, and I got my Bachelor degree in Mathematics and Applied
Mathematics at
Shandong University in 2014.
Generally speaking, I am interested in Private Data Analysis, Machine Learning and Algorithmic Fairness.
Specifically, my research contains differential privacy, private machine learning, privacypreserving data mining and crowdsourcing, robust estimation, large scale/ high dimension optimization, adversarial machine learning, fairness in machine learning.
Main type of problems I am working on are (Locally) Differentially Private Empirical Risk Minimization , Differentially Private High Dimensional Statistics and Private Crowdscourcing.
 May'19 Three papers have been accepted to IJCAI 2019!
 April'19 Two papers have been accepted to ICML 2019!
 December'18: One paper has been accepted to ALT 2019!

December'18: I have revieved UB CSE Best Graduate Research Award 2018!
 My most recent resume (last updated in May, 2019) can be found here.
 On Differentially Private Empirical Risk Minimization with Heavytailed Data. Abstract▼
Di Wang and Jinhui Xu .
 Learning Halfspaces in Noninterative Local Differential Privacy Model with Public Unlabeled Data. Abstract▼
Di Wang*, Huanyu Zhang* and Jinhui Xu . (* equal contributions)
 Inferring Ground Truth From Crowdcourcing Data Under Local Differential Privacy. Abstract▼
Di Wang*, Mengdi Huai*, Chenglin Miao, Jinhui Xu , Aidong Zhang(* equal contribution)
 Pairwise Learning with Differential Privacy Guarantee. Abstract▼
Mengdi Huai*, Di Wang*, Chenglin Miao, Jinhui Xu , Aidong Zhang(* equal contribution)
 Differentially Private Facility Location Problem Revisited. Abstract▼
Yunus Esencayi, Marco Gaboardi, Shi Li and Di Wang (alphabetic order)
 Estimating GLM in Noninterative Local Differential Privacy Model with Public Unlabeled Data. Abstract▼
Di Wang*, Huanyu Zhang*, Marco Gaboardi and Jinhui Xu . (* equal contributions)
 Robust Estimation of High Dimensional Mixture Model via Trimmed EM Algorithm. Abstract▼
In this paper, we study the problem of estimating latent variable models with arbitrarily corrupted samples in the high dimensional case, {\em i.e.,} $d\gg n$, where the underlying parameter is sparse. Specifically, we propose a method called Trimmed (Gradient) Expectation Maximization which attaches a trimming gradients and hard thresholding step to the Expectation step (Estep) and Maximization step (Mstep), respectively. Particularly, under some mild assumptions, with an appropriate initialization, we show that the algorithm is corruptionfree and converges to the (near) optimal statistical rate geometrically when the fraction of corruption samples satisfies $\alpha\leq O(\frac{1}{\sqrt{n}})$. Moreover, we implement our general framework to three canonical examples: mixture of Gaussians, mixture of regressions and linear regression with missing covariates. Experiments also support our theoretical analysis.
Di Wang*, Xiangyu Guo*, Shi Li and Jinhui Xu . (* equal contributions)
 Differentially Private High Dimensional Sparse Covariance Matrix Estimation Abstract▼
In this paper, we study the problem of estimating the covariance matrix under differential privacy, where the underlying covariance matrix is assumed to be sparse and of high dimensions. We propose a new method, called DPThresholding, to achieve a nontrivial $\ell_2$norm based error bound,
which is significantly better than the existing ones from adding noise directly to the empirical covariance matrix. We also extend the $\ell_2$norm based error bound to a general $\ell_w$norm based one for any $1\leq w\leq \infty$, and show that they share the same upper bound asymptotically. Our approach can be easily extended to local differential privacy. Experiments on the synthetic datasets show consistent results with our theoretical claims.
Di Wang and Jinhui Xu.
Submitted.
 On the Emprical Risk Minimization In the Noninteractive Local Differential Privacy Model Abstract▼
In this paper, we study the Empirical Risk Minimization (ERM) problem in the noninteractive Local Differential Privacy (LDP) model. We first show that if the loss function is $(\infty, T)$smooth, by using the Bernstein polynomial approximation we can avoid a dependency of the sample complexity, to achieve error $\alpha$, on the exponential of the dimensionality $p$ with base $1/\alpha$ ({\em i.e.,} $\alpha^{p}$).
This answers a question from (Smith et.al., 2017). Then, we propose playerefficient algorithms with $1$bit communication complexity and $O(1)$ computation cost for each player. The error bound of these algorithms is asymptotically the same as the original one.
With some additional assumptions, we also give an algorithm which is more efficient for the server.
Based on different types of polynomial approximations, we propose (efficient) noninteractive locally differential private algorithms for learning the set of kway marginal queries and the set of smooth queries.
Moreover, we study the case of $1$Lipschitz generalized linear convex loss functions and show that there is an $(\epsilon, \delta)$LDP algorithm whose sample complexity for achieving error $\alpha$ is only linear in the dimensionality $p$ and quasipolynomial in other terms. To prove this, we first show that the conclusion holds for the hinge loss function. Then, we extend the result to any $1$Lipschitz generalized linear convex loss functions by showing that every such a function can be approximated by a linear combination of hinge loss functions and some linear functions. Our results use a polynomial of inner product approximation technique. Then we apply our technique to the Euclidean median problem and show that its sample complexity needs only to be quasipolynomial in $p$, which is the first result with a subexponential sample complexity in $p$ for nongeneralized linear loss functions.
Di Wang, Marco Gaboardi, Adam Smith and Jinhui Xu .
Submitted.
 Faster Large Scale Constrained Linear Regression via TwoStep Preconditioning Abstract▼
In this paper, we study the large scale constrained linear regression problem and propose a twostep preconditioning method, which is based on some recent developments on random projection, sketching techniques and convex optimization methods. Combining the method with (accelerated) minibatch SGD, we can achieve an approximate solution with a time complexity lower than that of
the stateoftheart techniques for the low precision case. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity.
Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
Di Wang and Jinhui Xu .
Submitted.
 Gradient Complexity and Nonstationary Views of Differentially Private Empirical Risk Minimization Abstract▼
In this paper we study the Differentially Private Empirical Risk Minimization (DPERM) problem in different settings from time complexity and nonstationary views. For the time complexity view; for smooth (strongly) convex loss function with or without (non)smooth regularization, we give algorithms that achieve either optimal or near optimal utility bounds with less gradient complexity compared with previous work. For ERM with smooth convex loss function in highdimensional ($p\gg n$) setting, we give an algorithm which achieves the upper bound with less gradient complexity than previous ones. At last, we generalize the expected excess empirical risk from convex loss functions to nonconvex ones satisfying the PolyakLojasiewicz condition. For the nonstationary view, we study DPERM with nonconvex loss functions and give several upper bounds for the utility in different settings. We first consider the problem in lowdimensional space. For DPERM with nonsmooth regularizer, we generalize an existing work by measuring the utility using $\ell_2$ norm of the projected gradient. Also, we extend the error bound measurement, for the first time, from empirical risk to population risk by using the expected $\ell_2$ norm of the gradient. We then investigate the problem in high dimensional space, and show that by measuring the utility with FrankWolfe gap, it is possible to bound the utility by the Gaussian Width of the constraint set, instead of the dimensionality $p$ of the underlying space. We further demonstrate that the advantages of this result can be achieved by the measure of $\ell_2$ norm of the projected gradient. We also show that the utility of some special nonconvex loss functions can be reduced to a level ({\em i.e.,} depending only on $\log p$) similar to that of convex loss functions.
Di Wang and Jinhui Xu .
Submitted.
 Tight Lower Bound of Sparse Covariance Matrix Estimation in the Local Differential Privacy Abstract▼
In this paper, we study the sparse covariance matrix estimation problem in the local differential privacy model, and give a lower bound of $\Omega(\frac{s^2\log p}{n\epsilon^2})$ on the $\epsilon$ noninteractive private minimax risk in the metric of squared spectral norm, where $s$ is the row sparsity of the underlying covariance matrix, $n$ is the sample size, and $p$ is the dimensionality of the data.
We show that the lower bound is actually tight, as it matches a previous upper bound.
Our main technique for achieving this lower bound is
a general framework, called {\bf General Private Assouad Lemma}, which is a considerable generalization of the previous private Assouad lemma and can be used as a general method for bounding the private minimax risk of matrixrelated estimation problems.
Di Wang and Jinhui Xu .
28th International Joint Conference on Artificial Intelligence (IJCAI 2019).
 Locally Differentially Private Principal Component Analysis Abstract▼
In this paper, we study the Principal Component Analysis (PCA) problem under the (distributed) noninteractive local differential privacy model. For the low dimensional case ({\em i.e.,} $p \ll n$), we show the optimal rate of $\Theta(\frac{kp}{n\epsilon^2})$ (omitting the eigenvalue terms) for the private minimax risk of the $k$dimensional PCA using the squared subspace distance as the measurement, where $n$ is the sample size and $\epsilon$ is the privacy parameter. For the high dimensional ({\em i.e.,} $p\gg n$) row sparse case, we first give a lower bound of $\Omega(\frac{ks\log p }{n\epsilon^2})$ on the private minimax risk, where $s$ is the underlying sparsity parameter. Then we provide an efficient algorithm to achieve a near optimal upper bound of $O(\frac{s^2\log p}{n\epsilon^2})$. Experiments on both synthetic and real world datasets confirm the theoretical guarantees of our algorithms.
Di Wang and Jinhui Xu .
28th International Joint Conference on Artificial Intelligence (IJCAI 2019).
 Privacyaware Synthesizing for Crowdsourced Data Abstract▼
The prevalence of the World Wide Web enables the data collectors to easily share their data that are collected from a large crowd of users with the public. Although releasing crowdsourced data on the web brings many benefits to the data analyzers to conduct statistical analysis, it may violate crowd users' data privacy, which has been a major obstacle to release the crowdsourced data. A potential way to address this problem is to employ the traditional differential privacybased mechanisms and perturb the data with some noise before publishing them. However, such noise perturbation mechanisms offer little data utility because crowdsourced datasets contain massive amounts of conflicting data records with large domains. In particular, this degraded utility results from two respects: firstly, the originally collected crowdsourced data usually contain conflicting data; secondly, the noise needed to guarantee differential privacy may be proportional to the number of data records or the domain of the input data, which renders the released crowdsourced data useless. To address the above challenges, we propose a novel privacyaware synthesizing method for crowdsourced data. In this method, the data collectors first learn the underlying distributions of the crowdsourced data through taking each user's fine grained reliability degrees into account. Then, a set of candidate synthetics are sampled from the learned distributions. Finally, these sampled candidate synthetics are subjected to privacy tests, and only those candidate synthetics which pass privacy tests are allowed to be safely released. The proposed method not only provides strong privacy protection for individual users but also generates high utility synthetic data. The desirable performance of the proposed method is verified via theoretical analysis and extensive experiments conducted on both realworld and synthetic datasets.
Mengdi Huai, Di Wang, Chenglin Miao, Jinhui Xu , Aidong Zhang.
28th International Joint Conference on Artificial Intelligence (IJCAI 2019).
 Differentially Private Empirical Risk Minimization with Nonconvex Loss Functions Abstract▼
We study the problem of Empirical Risk Minimization (ERM) with (smooth) nonconvex loss functions under the differentialprivacy (DP) model. Existing approaches for this problem mainly adopt gradient norms to measure the error, which in general cannot guarantee the quality of the solution. To address this issue,
we first study the expected excess empirical (or population) risk, which was primarily used as the utility to measure the quality for convex loss functions. Specifically, we show that
the excess empirical (or population) risk can be upper bounded by $\tilde{O}(\frac{d\log (1/\delta)}{\log n\epsilon^2})$ in the $(\epsilon, \delta)$DP settings, where $n$ is the data size and $d$ is the dimensionality of the space.
The $\frac{1}{\log n}$ term in the empirical risk bound can be further improved to $\frac{1}{n^{\Omega(1)}}$ (when $d$ is a constant) by a highly nontrivial analysis on the timeaverage error.
To obtain more efficient solutions, we also consider the connection between achieving differential privacy and finding approximate local minimum.
Particularly, we show that when the size $n$ is large enough, there are $(\epsilon, \delta)$DP algorithms which can find an approximate local minimum of the empirical risk with high probability in both the constrained and nonconstrained settings.
These results indicate that one can escape saddle points privately.
Di Wang, Changyou Chen and Jinhui Xu .
36th International Conference on Machine Learning (ICML 2019).
 On the Sparse Linear Regression Under Local Differential Privacy Abstract▼
In this paper, we study the sparse linear regression problem under the Local Differential Privacy (LDP) model. We first show that polynomial dependency on the dimensionality $p$ of the space is unavoidable for the estimation error in both noninteractive and sequential interactive local models, if the privacy of the whole dataset needs to be preserved. Similar limitations also exist for other types of error measurements and in the relaxed local models. This indicates that differential privacy in high dimensional space is unlikely achievable for the problem. With the understanding of this limitation, we then present two algorithmic results. The first one is
a sequential interactive LDP algorithm for the low dimensional sparse case, called Locally Differentially Private Iterative Hard Thresholding (LDPIHT), which achieves a near optimal upper bound. This algorithm is actually rather general and can be used to solve quite a few other problems, such as (Local) DPERM with sparsity constraints and sparse regression with nonlinear measurements. The second one is for the restricted (high dimensional) case where only the privacy of the responses (labels) needs to be preserved. For this case,
we show that the optimal rate of the error estimation can be made logarithmically depending on $p$ (i.e., $\log p$) in the local model,
where an upper bound is obtained by a labelprivacy version of LDPIHT. Experiments on real world and synthetic datasets confirm our theoretical analysis.
Di Wang and Jinhui Xu .
36th International Conference on Machine Learning (ICML 2019). Selected as Long Talk.
 Estimating Sparse Covariance Matrix Under Differential Privacy via Thresholding Abstract▼
In this paper, we study the problem of estimating the covariance matrix under differential privacy, where the underlying covariance matrix is assumed to be sparse and of high dimensions. We propose a new method, called DPThresholding, to achieve a nontrivial $\ell_2$norm based error bound,
which is significantly better than the existing ones from adding noise directly to the empirical covariance matrix. Experiments on the synthetic datasets show consistent results with our theoretical claims.
Di Wang, Jinhui Xu and Yang He.
53rd Annual Conference on Information Sciences and Systems (CISS 2019).
 Noninteractive Locally Private Learning of Linear Models via Polynomial Approximations Abstract▼
In this paper, we study the Empirical Risk Minimization problem in the noninteractive Local Differential Privacy (LDP) model. First, we show that for the hinge loss function, there is an $(\epsilon, \delta)$LDP algorithm whose sample complexity for achieving an error of $\alpha$ is only linear in the dimensionality $p$ and quasipolynomial in other terms. Then, we extend the result to any $1$Lipschitz generalized linear convex loss functions by showing that every such function can be approximated by a linear combination of hinge loss functions and some linear functions. Finally, we apply our technique to the Euclidean median problem and show that its sample complexity needs only to be quasipolynomial in $p$, which is the first result with a subexponential sample complexity in $p$ for nongeneralized linear loss functions. Our results are based on a technique, called polynomial of inner product approximation, which may be applicable to other problems.
Di Wang, Adam Smith and Jinhui Xu .
The 30th International Conference on Algorithmic Learning Theory (ALT 2019).
 Differentially Private Empirical Risk Minimization with Smooth Nonconvex Loss Functions: A Nonstationary View. Abstract▼
In this paper, we study the Differentially Private Empirical Risk Minimization (DPERM) problem with nonconvex loss functions and give several upper bounds for the utility in different settings. We first consider the problem in lowdimensional space. For DPERM with nonsmooth regularizer, we generalize an existing work by measuring the utility using $\ell_2$ norm of the projected gradient. Also, we extend the error bound measurement, for the first time, from empirical risk to population risk by using the expected $\ell_2$ norm of the gradient. We then investigate the problem in high dimensional space, and show that by measuring the utility with FrankWolfe gap, it is possible to bound the utility by the Gaussian Width of the constraint set, instead of the dimensionality $p$ of the underlying space. We further demonstrate that the advantages of this result can be achieved by the measure of $\ell_2$ norm of the projected gradient. A somewhat surprising discovery is that although the two kinds of measurements are quite different, their induced utility upper bounds are asymptotically the same under some assumptions. We also show that the utility of some special nonconvex loss functions can be reduced to a level ({\em i.e.,} depending only on $\log p$) similar to that of convex loss functions. Finally, we test our proposed algorithms on both synthetic and real world datasets and the experimental results confirm our theoretical analysis.
Di Wang and Jinhui Xu .
ThirtyThird AAAI Conference on Artificial Intelligence (AAAI 2019). Selected as Oral Presentation.
 Empirical Risk Minimization in Noninteractive Local Differential Privacy Revisited.
Abstract▼
In this paper, we revisit the Empirical Risk Minimization problem in the noninteractive local model of differential privacy. In the case of constant or low dimensions ($p\ll n$), we first show that if the loss function is $(\infty, T)$smooth, we can avoid a dependence of the sample complexity, to achieve error $\alpha$, on the exponential of the dimensionality $p$ with base $1/\alpha$ ({\em i.e.,} $\alpha^{p}$),
which answers a question in \cite{smith2017interaction}. Our approach is based on polynomial approximation. Then, we propose playerefficient algorithms with $1$bit communication complexity and $O(1)$ computation cost for each player. The error bound is asymptotically the same as the original one. With some additional assumptions, we also give an efficient algorithm for the server.
In the case of high dimensions ($n\ll p$),
we show that if the loss function is a convex generalized linear function, the error can be bounded by using the Gaussian width of the constrained set, instead of $p$, which improves the one in
Smith et al.
Our techniques can be extended to some related problems, such as $k$way marginal queries and smooth queries.
Di Wang, Marco Gaboardi and Jinhui Xu .
Conference on Neural Information Processing Systems (NIPS/NeurIPS), 2018.
 Differentially Private Sparse Inverse Covariance Estimation.
Abstract▼
In this paper, we give the first study of sparse inverse covariance estimation problem under differential privacy. Firstly, we propose an $\epsilon$differentially private algorithm via output perturbation, which is based on the sensitivity of the optimization problem and Wishart mechanism. Based on the idea of that, we propose a general covariance perturbation method, and then for $\epsilon$differential privacy, we analyze Laplacian and Wishart mechanisms, for $(\epsilon,\delta)$differential privacy we analyze Gaussian and Wishart mechanisms. Moreover, we extend the covariance perturbation algorithm to distributed setting and local differential privacy. Experiments on synthetic and benchmark datasets are also support our theoretical analysis.
Di Wang, Mengdi Huai and Jinhui Xu .
2018 6th IEEE Global Conference on Signal and Information Processing (2018 GlobalSip). Selected as Oral Presentation.
 Large Scale Constrained Linear Regression Revisited: Faster Algorithms via Preconditioning.
Abstract▼
In this paper, we revisit the largescale constrained linear regression problem and propose faster methods based on some recent developments in sketching and optimization.
Our algorithm combines minibatch SGD with a new method called twostep preconditioning to achieve an $\epsilon$accuracy solution for the low precision case, and has a lower time complexity than the stateoftheart techniques. Our idea can also be extended to the high precision case, which gives an alternative implementation to the Iterative Hessian Sketch (IHS) method with significantly improved time complexity.
Experiments on benchmark and synthetic datasets suggest that our methods indeed outperform existing ones considerably in both the low and high precision cases.
Di Wang and Jinhui Xu .
ThirtySecond AAAI Conference on Artificial Intelligence (AAAI 2018). Selected as Oral Presentation.
 Differentially Private Empirical Risk Minimization Revisited: Faster and More General.
Abstract▼
In this paper we study the differentially private Empirical Risk Minimization (ERM) problem in different settings. For smooth (strongly) convex loss function with or without (non)smooth regularization, we give algorithms that achieve either optimal or near optimal utility bounds with less gradient complexity compared with previous work. For ERM with smooth convex loss function in highdimensional ($p\gg n$) setting, we give an algorithm which achieves the upper bound with less gradient complexity than previous ones. At last, we generalize the expected excess empirical risk from convex loss functions to nonconvex ones satisfying the PolyakLojasiewicz condition and give a tighter upper bound on
the utility than the one in \cite{DBLP:journals/corr/ZhangZMW17}.
Di Wang, Minwei Ye and Jinhui Xu .
Conference on Neural
Information Processing Systems (NIPS/NeurIPS), 2017.
 High Dimensional Sparse Linear Regression under Local Differential Privacy: Power and Limitations Abstract▼
In this paper, we study high dimensional sparse linear regression under the Local Differential Privacy (LDP) model, and give both negative and positive results. On the negative side, we show that polynomial dependency on the dimensionality $p$ of the space is unavoidable in the estimation error under the noninteractive local model, if the privacy of the whole dataset needs to be preserved. Similar limitations also exist for other types of error measurements and in the (sequential) interactive local or relaxed local models. This indicates that differential privacy in high dimensional space is unlikely achievable for the problem. On the positive side, we show that the optimal rate of the error estimation can be made logarithmically depending on $p$ (i.e., $\log p$) under the local model, if only the privacy of the responses (labels) is to be preserved, where the upper bound is obtained by a new method called Differentially Private Iterative Hard Thresholding (DPIHT), which is interesting in its own right. Our result can also be extended to some nonlinear monotone measurement models while keeping the responses locally private.
Di Wang, Adam Smith and Jinhui Xu .
NIPS 2018 Workshop on Privacy Preserving Machine Learning.

Instructor
 CSE 437/537 Introduction to Machine Learning, Summer 2019 @SUNY at Buffalo.
 Teaching assistant:
 CSE 437/537 Introduction to Machine Learning, Spring 2018 @SUNY at Buffalo.
 CSE 431/531 Analysis of Algorithm, Fall 2017, Spring 2017, Fall 2016, Spring 2016 @SUNY at Buffalo.

CSE 115 Introduction to Computer Science for Majors I, Fall 2015 @ @SUNY at Buffalo.
 MATH 1229A Methods of Matrix Algebra, Summer 2015, Spring 2015 @ UWO.
 ATH 1225B Methods of Calculus, Fall 2014 @ UWO.
 Reviewer
NeuIPS2019, ICDCS 2019, ICCV 2019, CVPR 2019, ICML 2019, AISTATS 2019, KDD 2018, AAAI 2017 2018, CompIMAGE 2018, IWCIA 2017
ACM Computing Surveys, IEEE Transactions on Information Forensics and Security, IEEE Transactions on Pattern Analysis and Machine Intelligence, Theoretical Computer Science, Information Processing Letters
 Best CSE Graduate Research Award in 2018, SUNY at Buffalo
 ICML Travel Award, 2019

NIPS Travel Award, 2018, 2017

Western Graduate Research Scholarship, Western University, 20142015
 Algebraic Geometry Summer School Scholarship, ENCU, Shanghai, 2013