Names and addresses of presenters:
Geert Verbeke, Biostatistical Centre, K.U.Leuven
Kapucijnenvoer 35, B-3000 Leuven, Belgium
Tel: +32-16-336891, Sec: +32-16-336892, Fax: +32-16-337015
Geert Molenberghs, Center for Statistics, Limburgs Universitair Centrum
Universitaire Campus, Building D, B-3590 Diepenbeek, Belgium
Tel: +32-11-268238, Sec: +32-11-268202, Fax: +32-11-268299
Abstract of the course
Starting from a brief introduction to the linear mixed model for continuous longitudinal data, extensions will be formulated to model categorical outcomes, including counts and binary data. Based on Verbeke and Molenberghs (2005), several families of models will be discussed and compared, from an interpretational as well as a computational point of view. First, models will be discussed for the full marginal distribution of the outcome vector. This allows model fitting to be based on maximum likelihood principles, which immediately provides inferential tools for all parameters in the models. The main disadvantage of such models is that they require complete specification of all higher-order interactions, which is often based on unrealistic assumptions and often leads to computational problems, especially in examples with many repeated measurements per subject. Therefore, alternatives have been formulated in the statistical literature. First, following the reasoning behind linear mixed models, a full marginal model can be obtained from a random-effects approach, in which the association between repeated measurements within the same subject is assumed to be generated by underlying unobserved random effects. Alternatively, semiparametric methods can be used that no longer require full specification of the likelihood, but only of the first moments, or of the first and second moments. This leads to the so-called generalized estimating equations. For both approaches, estimation and inference will be discussed and illustrated in detail, and it will be argued extensively that the two approaches yield parameters with completely different interpretations. Advantages and disadvantages of both will be discussed.
Finally, when analysing longitudinal data, one is often confronted with missing observations, i.e., scheduled measurements that have not been made, for a variety of (known or unknown) reasons. It will be shown that, if no appropriate measures are taken, missing data can cause seriously biased results and interpretational difficulties. Methods to properly analyse incomplete data, under flexible assumptions, are presented. Key concepts of sensitivity analysis are introduced.
Without putting too much emphasis on software, some examples will be given of how the different approaches can be implemented within the SAS software package.
Throughout the course, it will be assumed that the participants are familiar with basic statistical modelling, including linear models (regression and analysis of variance) and generalized linear models (logistic and Poisson regression). Prerequisite knowledge also includes general estimation and testing theory (maximum likelihood, likelihood ratio).
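The difference in interpretation between random-effects (subject-specific) and marginal (population-averaged) parameters, mentioned in the abstract, can be illustrated numerically. The Python sketch below is not part of the course material. It assumes a logistic model with a normally distributed random intercept, uses hypothetical parameter values (beta0, beta1, sigma), and averages the subject-specific probabilities over the random intercept with Gauss-Hermite quadrature. The resulting marginal log odds ratio is then compared with the commonly quoted approximation beta1 / sqrt(1 + c^2 sigma^2), with c = 16 sqrt(3) / (15 pi).

```python
import numpy as np


def marginal_prob(eta, sigma, n_nodes=50):
    """Population-averaged P(Y = 1) for a random-intercept logistic model.

    The subject-specific probability expit(eta + b) is averaged over
    b ~ N(0, sigma^2) using Gauss-Hermite quadrature.
    """
    # Nodes and weights for integrals of the form int exp(-x^2) f(x) dx.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = np.sqrt(2.0) * sigma * x          # change of variables to N(0, sigma^2)
    p = 1.0 / (1.0 + np.exp(-(eta + b)))  # subject-specific probabilities
    return np.sum(w * p) / np.sqrt(np.pi)


def logit(p):
    return np.log(p / (1.0 - p))


# Hypothetical values, chosen only for illustration.
beta0, beta1, sigma = -1.0, 1.0, 2.0

# Subject-specific log odds ratio for a one-unit change in the covariate.
subject_specific_log_or = beta1

# Marginal log odds ratio implied by the same model.
m0 = marginal_prob(beta0, sigma)
m1 = marginal_prob(beta0 + beta1, sigma)
marginal_log_or = logit(m1) - logit(m0)

# Approximate attenuation of the coefficient when moving to the marginal scale.
c = 16.0 * np.sqrt(3.0) / (15.0 * np.pi)
approx_marginal_log_or = beta1 / np.sqrt(1.0 + c**2 * sigma**2)

print("subject-specific log OR:", subject_specific_log_or)
print("marginal log OR (quadrature):", marginal_log_or)
print("marginal log OR (approximation):", approx_marginal_log_or)
```

Under these assumptions, the marginal log odds ratio is closer to zero than the subject-specific coefficient, and the gap widens as sigma grows; with sigma = 0 the two coincide.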
Four sessions of 1.5 hours are planned as follows:
The targeted audience includes applied statisticians and biomedical researchers in industry, public health organizations, contract research organizations, and academia.
Learning outcomes and instructional methods:
As a result of the course, participants should be able to perform a basic analysis for a particular longitudinal data set at hand. Based on a selection of exploratory tools, the nature of the data, and the research questions to be answered in the analyses, they should be able to construct an appropriate statistical model, to fit the model within the SAS framework, and to interpret the obtained results. Further, participants should be aware not only of the possibilities and strengths of a particular selected approach, but also of its drawbacks in comparison to other methods.
The course will be explanatory rather than mathematically rigorous. Emphasis is on giving sufficient detail in order for participants to have a general overview of frequently used approaches, with their advantages and disadvantages, while giving reference to other sources where more detailed information is available. Also, it will be explained in detail how the different approaches can be implemented in the SAS package, and how the resulting outputs should be interpreted.
Geert Verbeke is Professor in Biostatistics at the Biostatistical Centre of the Katholieke Universiteit Leuven in Belgium. He wrote his dissertation, as well as a number of methodological papers, on various aspects of linear mixed models for longitudinal data.
Geert Molenberghs is Professor in Biostatistics at the Limburgs Universitair Centrum in Belgium. He has published methodological work on repeated categorical data and on the analysis of nonresponse in clinical and epidemiological studies.
Both presenters are editors and authors of three books on the use of linear mixed models for the analysis of longitudinal data (Springer Lecture Notes 1997, Springer Series in Statistics 2000, Springer Series in Statistics 2005), and they have taught several (short) courses on the topic in universities as well as in industry. They received the 2002 and 2004 CE awards for courses taught at the Joint Statistical Meetings in New York and Toronto.