Research
Working Papers
Reproducible Aggregation of Sample-Split Statistics
(with Joseph P. Romano)
Revision requested by American Economic Review
Last updated: November 2023
[ Abstract | arXiv ]
Statistical inference is often simplified by sample-splitting. This simplification comes at the cost of the introduction of randomness not native to the data. We propose a simple procedure for sequentially aggregating statistics constructed with multiple splits of the same sample. The user specifies a bound and a nominal error rate. If the procedure is implemented twice on the same data, the nominal error rate approximates the chance that the results differ by more than the bound. We analyze the accuracy of the nominal error rate and illustrate the application of the procedure to several widely applied statistical methods.
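To make the idea concrete, here is a minimal Python sketch of sequential aggregation. It is not the procedure analyzed in the paper: the stopping rule (a normal approximation to the difference between two independent running means), the half-and-half splitting scheme, and the names `aggregate_splits`, `statistic`, `min_splits`, and `max_splits` are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def aggregate_splits(data, statistic, bound, error_rate,
                     rng, min_splits=10, max_splits=10_000):
    """Average a split-sample statistic over fresh random splits.

    Stops once a normal approximation suggests that two independent runs
    would differ by more than `bound` with probability about `error_rate`.
    The stopping rule is an assumed heuristic, not the paper's rule.
    """
    z = NormalDist().inv_cdf(1 - error_rate / 2)
    count, mean, m2 = 0, 0.0, 0.0
    half = len(data) // 2
    while count < max_splits:
        # Draw a fresh random split of the same sample.
        shuffled = rng.sample(data, len(data))
        value = statistic(shuffled[:half], shuffled[half:])
        # Welford update of the running mean and sum of squared deviations.
        count += 1
        delta = value - mean
        mean += delta / count
        m2 += delta * (value - mean)
        if count >= min_splits:
            split_var = m2 / (count - 1)
            # The difference of two independent means of `count` splits
            # has variance of roughly 2 * split_var / count.
            if z * math.sqrt(2 * split_var / count) <= bound:
                break
    return mean, count
```

For example, `aggregate_splits(data, lambda a, b: sum(a) / len(a), bound=0.5, error_rate=0.05, rng=random.Random(0))` averages the first-fold mean over random splits until the running average appears stable to within 0.5.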
Journal Articles
Semiparametric Estimation of Long-Term Treatment Effects
(with Jiafeng Chen)
Journal of Econometrics. 237 (2). December 2023.
[ Abstract | arXiv | Software ]
Long-term outcomes of experimental evaluations are necessarily observed after long delays. We develop semiparametric methods for combining the short-term outcomes of experiments with observational measurements of short-term and long-term outcomes, in order to estimate long-term treatment effects. We characterize semiparametric efficiency bounds for various instances of this problem. These calculations facilitate the construction of several estimators. We analyze the finite-sample performance of these estimators with a simulation calibrated to data from an evaluation of the long-term effects of a poverty alleviation program.
Confidence Intervals for Seroprevalence
(with Thomas J. DiCiccio, Joseph P. Romano, and Azeem M. Shaikh)
Statistical Science. 37 (3). August 2022.
[ Abstract | arXiv ]
This paper concerns the construction of confidence intervals in standard seroprevalence surveys. In particular, we discuss methods for constructing confidence intervals for the proportion of individuals in a population infected with a disease using a sample of antibody test results and measurements of the test's false positive and false negative rates. We begin by documenting erratic behavior in the coverage probabilities of standard Wald and percentile bootstrap intervals when applied to this problem. We then consider two alternative sets of intervals constructed with test inversion. The intervals in the first set are approximate, using either an asymptotic or a bootstrap approximation to the finite-sample distribution of a chosen test statistic. We consider several choices of test statistic, including maximum likelihood estimators and generalized likelihood ratio statistics. We show with simulation that, at empirically relevant parameter values and sample sizes, the coverage probabilities for these intervals are close to their nominal level and are approximately equi-tailed. The intervals in the second set are shown to contain the true parameter value with probability at least equal to the nominal level, but can be conservative in finite samples.
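For reference, the kind of standard Wald interval whose coverage the paper finds erratic can be sketched as follows. The abstract does not give a formula, so this sketch assumes the common Rogan-Gladen point estimator with a delta-method variance and three independent binomial samples. The function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist


def wald_seroprevalence_interval(positives, n_tested,
                                 true_positives, n_known_positive,
                                 true_negatives, n_known_negative,
                                 level=0.95):
    """Wald interval for prevalence, corrected for test error rates.

    Assumes the Rogan-Gladen estimator and a delta-method variance.
    """
    p = positives / n_tested                   # raw positive rate in the survey
    sens = true_positives / n_known_positive   # 1 - false negative rate
    spec = true_negatives / n_known_negative   # 1 - false positive rate
    denom = sens + spec - 1
    if denom <= 0:
        raise ValueError("test must perform better than chance")
    prevalence = (p + spec - 1) / denom
    # Delta method: sum of squared partial derivatives times binomial variances.
    variance = (
        p * (1 - p) / n_tested
        + prevalence ** 2 * sens * (1 - sens) / n_known_positive
        + (1 - prevalence) ** 2 * spec * (1 - spec) / n_known_negative
    ) / denom ** 2
    half_width = NormalDist().inv_cdf(0.5 + level / 2) * math.sqrt(variance)
    # Endpoints are not clipped to [0, 1]; Wald intervals can leave that range.
    return prevalence, prevalence - half_width, prevalence + half_width
```

With 50 positives out of 500 tested, 90 of 100 known positives detected, and 196 of 200 known negatives cleared, the point estimate is 0.08 / 0.88, or about 0.091.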
Uncertainty in the Hot Hand Fallacy: Detecting Streaky Alternatives to Random Bernoulli Sequences
(with Joseph P. Romano)
The Review of Economic Studies. Featured Article. 89 (2). March 2022.
[ Abstract | arXiv | Online Appendix ]
We study a class of permutation tests of the randomness of a collection of Bernoulli sequences and their application to analyses of the human tendency to perceive streaks of consecutive successes as overly representative of positive dependence (the hot hand fallacy). In particular, we study permutation tests of the null hypothesis of randomness (i.e., that trials are i.i.d.) based on test statistics that compare the proportion of successes that directly follow k consecutive successes with either the overall proportion of successes or the proportion of successes that directly follow k consecutive failures. We characterize the asymptotic distributions of these test statistics and their permutation distributions under randomness, under a set of general stationary processes, and under a class of Markov chain alternatives, which allow us to derive their local asymptotic power. The results are applied to evaluate the empirical support for the hot hand fallacy provided by four controlled basketball shooting experiments. We establish that substantially larger data sets are required to derive an informative measurement of the deviation from randomness in basketball shooting. In one experiment, for which we were able to obtain data, multiple testing procedures reveal that one shooter exhibits a shooting pattern significantly inconsistent with randomness, supplying strong evidence that basketball shooting is not random for all shooters all of the time. However, we find that the evidence against randomness in this experiment is limited to this shooter. Our results provide a mathematical and statistical foundation for the design and validation of experiments that directly compare deviations from randomness with human beliefs about deviations from randomness, and thereby constitute a direct test of the hot hand fallacy.
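One of the test statistics described above, the proportion of successes directly following k consecutive successes minus the overall proportion of successes, can be illustrated with a small Monte Carlo permutation test. This is a sketch for a single sequence only, not the paper's procedure for a collection of sequences. The function names, the one-sided p-value, and the convention of scoring an undefined statistic as minus infinity are assumptions made for illustration.

```python
import random


def streak_statistic(seq, k):
    """Success rate right after k straight successes, minus the overall rate.

    Returns -inf when no trial follows k straight successes
    (an assumed convention that keeps the statistic defined).
    """
    follow, hits, run = 0, 0, 0
    for x in seq:
        if run >= k:          # this trial directly follows k successes
            follow += 1
            hits += x
        run = run + 1 if x == 1 else 0
    if follow == 0:
        return float("-inf")
    return hits / follow - sum(seq) / len(seq)


def permutation_p_value(seq, k, draws, rng):
    """One-sided Monte Carlo permutation p-value against streaky alternatives."""
    observed = streak_statistic(seq, k)
    shuffled = list(seq)
    exceed = 0
    for _ in range(draws):
        rng.shuffle(shuffled)  # uniform random reordering of the trials
        if streak_statistic(shuffled, k) >= observed:
            exceed += 1
    # Adding one to numerator and denominator counts the observed ordering.
    return (1 + exceed) / (1 + draws)
```

For example, `permutation_p_value([1] * 10 + [0] * 10, k=2, draws=999, rng=random.Random(1))` tests a sequence of ten successes followed by ten failures.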