SPEAKER: Peter McCullagh (University of Chicago)

TITLE: Sampling bias in logistic models

LOCATION: McGill, Burnside Hall, 805 Sherbrooke O., 1B39

DATE: Friday, November 7, 2008

TIME: 3:30 p.m.

ABSTRACT:

This talk is concerned with regression models for the effect of covariates on correlated binary and correlated polytomous responses. In a generalized linear mixed model, correlations are induced by a random effect, additive on the logistic scale, so that the joint distribution $p_{\bfx}(\bfy)$ obtained by integration depends on the covariate values $\bfx$ on the sampled units. The thrust of this talk is that the conventional formulation is inappropriate for most natural sampling schemes in which the sampled units arise from a random process. The conventional analysis incorrectly predicts parameter attenuation due to the random effect, thereby giving a misleading impression of the magnitude of treatment effects. The error in the conventional analysis is a subtle consequence of selection bias that arises from random sampling of units. This talk will describe a non-standard but mathematically natural formulation in which the units are auto-generated by an explicit process and sampled following a well-determined plan. For a quota sample in which the covariate configuration $\bfx$ is pre-specified, the model distribution coincides with $p_{\bfx}(\bfy)$ in the GLMM. However, if the sample units are selected at random, either by sequential recruitment or by simple random sampling from the available population, the conditional distribution $p(\bfy \given \bfx)$ is different from $p_\bfx(\bfy)$. By contrast with conventional models, conditioning on~$\bfx$ is not equivalent to stratification by~$\bfx$. The implications for likelihood computations and estimating equations will be discussed.
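
As a rough illustration of the two distributions being contrasted, the binary case can be sketched as follows. The notation beyond $\bfx$, $\bfy$ and $p_{\bfx}(\bfy)$ is assumed for this sketch and does not come from the abstract: $x_i$ and $y_i$ are the covariate and response on unit $i$, $\alpha$ and $\beta$ are regression parameters, and $\eta_i$ is the random effect on unit $i$, with joint distribution $F$. With the random effect additive on the logistic scale, the GLMM joint distribution for $n$ units with fixed covariate configuration $\bfx$ would take the form

\[
p_{\bfx}(\bfy) = \int \prod_{i=1}^{n}
  \frac{\exp\{y_i(\alpha + \beta x_i + \eta_i)\}}{1 + \exp(\alpha + \beta x_i + \eta_i)}
  \, dF(\eta_1, \ldots, \eta_n),
  \qquad y_i \in \{0, 1\}.
\]

When the units are sampled at random, $\bfx$ is itself random, and the conditional distribution is defined through the joint distribution of covariates and responses generated by the process and sampling plan,

\[
p(\bfy \given \bfx) = \frac{p(\bfx, \bfy)}{p(\bfx)},
\]

which, according to the abstract, need not equal $p_{\bfx}(\bfy)$.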