Advances in Inference for Minimization and Other Covariate-Adaptive Randomization Methods
Abstract
Covariate-adaptive randomization (CAR) is widely used in randomized controlled experiments to balance treatment allocation over important prognostic covariates. Despite its popularity, data collected under CAR designs are correlated, which complicates subsequent statistical inference. In this thesis, we address several issues concerning inference under CAR designs.
Firstly, valid inference with data collected under covariate-adaptive randomization often requires knowledge of the limiting covariance matrix of within-stratum imbalances. While this limit is explicitly known for most CAR methods, it is not for Pocock and Simon's minimization method, for which the existence of the limit has only recently been established. The limit can be estimated by Monte Carlo methods if the distribution of the stratification factors is known; in practice, however, this assumption may not hold, rendering such estimation invalid. In the first work, we replace the usually unknown distribution with an estimator, such as the empirical distribution, in the Monte Carlo approach and establish consistency of the resulting covariance estimator. As an application, we use the proposed covariance estimator to adjust existing robust tests for treatment effects with survival data in simulation studies. The results show that the adjusted tests achieve a size close to the nominal level, and that, unlike under other designs, the unadjusted robust tests may suffer asymptotic size inflation under the minimization method.
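The plug-in Monte Carlo idea described above can be illustrated with a minimal sketch. The code below is not the thesis's implementation; it assumes two treatment arms coded +1/−1, a standard biased-coin version of Pocock and Simon's minimization with an illustrative bias probability `p = 0.85`, and it resamples covariate profiles from the empirical distribution of an observed sample to estimate the covariance of normalized within-stratum imbalances. All function names are hypothetical.

```python
import numpy as np

def pocock_simon_assign(covariates, p=0.85, rng=None):
    """Sequentially assign treatments (+1/-1) by a biased-coin version of
    Pocock and Simon's minimization: each arrival is sent, with probability
    p, to the arm that reduces the total marginal imbalance summed over
    the arrival's own factor levels."""
    rng = np.random.default_rng() if rng is None else rng
    n, k = covariates.shape
    # imb[j][level]: (# treated - # control) at each level of factor j
    imb = [dict() for _ in range(k)]
    t = np.empty(n, dtype=int)
    for i in range(n):
        d = sum(imb[j].get(covariates[i, j], 0) for j in range(k))
        if d == 0:
            t[i] = 1 if rng.random() < 0.5 else -1
        else:
            pref = -int(np.sign(d))          # arm that reduces imbalance
            t[i] = pref if rng.random() < p else -pref
        for j in range(k):
            imb[j][covariates[i, j]] = imb[j].get(covariates[i, j], 0) + t[i]
    return t

def mc_covariance(sample_covariates, n=200, reps=500, seed=0):
    """Monte Carlo estimate of the covariance matrix of the normalized
    within-stratum imbalances D_n(s)/sqrt(n), drawing covariate profiles
    from the EMPIRICAL distribution of the observed sample (the plug-in
    step replacing the unknown true distribution)."""
    rng = np.random.default_rng(seed)
    strata = np.unique(sample_covariates, axis=0)
    stats = []
    for _ in range(reps):
        idx = rng.integers(0, len(sample_covariates), size=n)
        z = sample_covariates[idx]
        t = pocock_simon_assign(z, rng=rng)
        d = np.zeros(len(strata))
        for s in range(len(strata)):
            mask = np.all(z == strata[s], axis=1)
            d[s] = t[mask].sum() / np.sqrt(n)
        stats.append(d)
    return np.cov(np.array(stats), rowvar=False)
```

For two binary stratification factors, `mc_covariance` returns a 4×4 matrix, one row and column per stratum; consistency of this estimator is what the first work establishes when the empirical distribution replaces the true one.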
Secondly, we consider inference on the average treatment effect, a critical estimand in practice, under CAR designs. Several valid inferential procedures have been proposed based on different working models for covariate adjustment. We focus on the approach that uses advanced machine learning methods to reduce the estimators' asymptotic variance. In the literature, a critical condition on the in-sample prediction error of these fitting methods is required to control a potential bias. We show, via a numerical example, that this condition can fail to hold, so that estimation without sample splitting and cross-fitting can be biased. We then propose a cross-fitting procedure that fulfills the condition and thus yields asymptotically valid estimation. We corroborate the theory with several simulation studies.
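The general shape of a cross-fitting procedure for covariate-adjusted ATE estimation can be sketched as follows. This is not the thesis's specific procedure: it assumes a two-arm design with allocation probability 1/2, uses an AIPW-style combination of outcome models, and takes a generic `fit` routine (here, an illustrative least-squares fitter) standing in for the machine learning method. The key point it illustrates is that each unit's predictions come from models trained on the other folds, which is what keeps in-sample prediction error from biasing the estimator.

```python
import numpy as np

def cross_fit_ate(X, t, y, fit, K=5, seed=0):
    """Cross-fitted covariate-adjusted ATE estimator (a sketch).
    For each of K folds, outcome models for each arm are trained on the
    OTHER folds only; predictions on the held-out fold are therefore
    out-of-sample, avoiding the in-sample-error bias discussed above."""
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = rng.permutation(n) % K
    mu1, mu0 = np.empty(n), np.empty(n)
    for k in range(K):
        train, test = folds != k, folds == k
        m1 = fit(X[train & (t == 1)], y[train & (t == 1)])
        m0 = fit(X[train & (t == 0)], y[train & (t == 0)])
        mu1[test], mu0[test] = m1(X[test]), m0(X[test])
    pi = 0.5  # assumed allocation probability under a balanced design
    # AIPW-style combination of cross-fitted outcome models
    return np.mean(mu1 - mu0
                   + t * (y - mu1) / pi
                   - (1 - t) * (y - mu0) / (1 - pi))

def ols_fit(X, y):
    """Illustrative working model: least squares with an intercept,
    returning a prediction function for new covariates."""
    Xb = np.c_[np.ones(len(X)), X]
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return lambda Xn: np.c_[np.ones(len(Xn)), Xn] @ beta
```

In this sketch `ols_fit` can be swapped for any machine learning fitter with the same interface; the cross-fitting wrapper is what supplies the prediction-error condition that the second work shows can otherwise fail.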

