One of the most fundamental problems in learning theory is to view input data as random samples from an unknown distribution and to make statistical inferences about that underlying distribution. In this talk, we focus on a notable example of such a statistical task: testing properties of distributions. The goal is to design an algorithm that, using as few samples as possible from a distribution, distinguishes whether the distribution has a given property or is $\epsilon$-far in $\ell_1$-distance from every distribution that has the property. We explore several questions in the framework of distribution testing, such as: (i) Is the distribution uniform, or is it far from being uniform? (ii) Are a pair of random variables independent or correlated? (iii) Is the distribution monotone? Moreover, we discuss extensions of the standard testing framework to more practical settings. For instance, we consider the case where the sensitivity of the input samples (e.g., patients' medical records) requires the design of statistical tests that ensure the privacy of individuals. We address this case by designing differentially private testing algorithms for several testing questions with (nearly) optimal sample complexities.
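To make the testing model concrete, here is a minimal sketch of a classical collision-based uniformity tester, one standard approach from the property-testing literature and not necessarily the algorithm discussed in the talk. The empirical collision rate estimates $\|p\|_2^2$, which equals $1/n$ for the uniform distribution and is at least $(1+\epsilon^2)/n$ for any distribution that is $\epsilon$-far from uniform in $\ell_1$-distance; the specific threshold, sample size, and function name below are illustrative assumptions.

```python
import random
from collections import Counter

def collision_uniformity_test(sample, n, eps):
    """Accept (return True) if the sample looks uniform over {0, ..., n-1}.

    Counts pairwise collisions among the samples; the collision rate is an
    unbiased estimate of ||p||_2^2. Uniform p gives rate ~ 1/n, while any p
    that is eps-far from uniform in l1 gives rate >= (1 + eps^2)/n, so a
    threshold of (1 + eps^2/2)/n (illustrative choice) separates the cases.
    """
    m = len(sample)
    collisions = sum(c * (c - 1) // 2 for c in Counter(sample).values())
    rate = collisions / (m * (m - 1) / 2)
    return rate <= (1 + eps**2 / 2) / n

# Demo with a fixed seed: n = 10, eps = 0.5.
rng = random.Random(0)
uniform_sample = [rng.randrange(10) for _ in range(2000)]
point_mass = [0] * 2000  # l1-distance 1.8 from uniform, so 0.5-far
print(collision_uniformity_test(uniform_sample, 10, 0.5))  # expect True
print(collision_uniformity_test(point_mass, 10, 0.5))      # expect False
```

A sufficiently large sample is needed for the collision rate to concentrate; the theory shows roughly $O(\sqrt{n}/\epsilon^2)$ samples suffice, far fewer than the $\Omega(n)$ needed to learn the distribution itself.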