How to Analyse Survey Data: Methods, Software and Applications
Venue: Highfield Campus, University of Southampton, UK
Presenters: Prof Danny Pfeffermann, Prof Patrick Sturgis, Dr Moshe Feder and Dr Dave Holmes
Dates of Course: Wednesday 19th - Friday 21st September 2012
This course has already run. Please check the course listings for a future course.
Summary of Course:
Survey data are frequently used for analytic inference on statistical models holding for the population from which the sample is taken. Familiar examples include the analysis of labour market dynamics from labour force surveys, comparisons of pupils' achievements from educational surveys and estimation of causal relationships between risk factors and disease prevalence from health surveys. Survey data differ, however, from other data sets in four major respects:
1. The samples are selected with known selection probabilities, which allows the distribution over all possible sample selections to be used as the basis for inference.
2. The selection probabilities may be related to the model outcome variable, in which case the model holding for the sample is different from the target population model.
3. Survey data are almost inevitably subject to nonresponse, which again may distort the population model if the response propensity is associated with the outcome.
4. The sample data are often clustered, implying that observations in the same cluster are correlated.
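The second point can be illustrated with a small simulation. The sketch below is not taken from the course materials: the population model, the two selection probabilities and the function name `informative_sample_demo` are all assumptions chosen for illustration, and Python is used here only for convenience (the course labs use R). Units with larger outcomes are given a higher chance of selection, so the unweighted sample mean overstates the population mean, while weighting each observation by the inverse of its selection probability (a Hájek-type estimator) roughly recovers it.

```python
import random


def informative_sample_demo(n_pop=20000, seed=1):
    """Compare unweighted and inverse-probability-weighted sample means
    under a selection scheme that depends on the outcome (assumed setup)."""
    rng = random.Random(seed)

    # Hypothetical population: outcome y ~ Normal(mean 10, sd 2).
    population = [rng.gauss(10.0, 2.0) for _ in range(n_pop)]

    # Assumed informative design: units with y above 10 are four times
    # as likely to be selected as the rest.
    def selection_prob(y):
        return 0.20 if y > 10.0 else 0.05

    # Poisson sampling: each unit is included independently with its own
    # known selection probability.
    sample = []
    for y in population:
        pi = selection_prob(y)
        if rng.random() < pi:
            sample.append((y, pi))

    pop_mean = sum(population) / n_pop
    # Ignores the design, so it is pulled towards the over-sampled units.
    unweighted = sum(y for y, _ in sample) / len(sample)
    # Hajek-type estimator: weight each sampled unit by 1 / pi.
    weighted = sum(y / pi for y, pi in sample) / sum(1.0 / pi for _, pi in sample)
    return pop_mean, unweighted, weighted


if __name__ == "__main__":
    pop_mean, unweighted, weighted = informative_sample_demo()
    print(f"population mean:       {pop_mean:.3f}")
    print(f"unweighted sample mean: {unweighted:.3f}")
    print(f"weighted sample mean:   {weighted:.3f}")
```

Weighting is only one of the possible remedies; it corrects the estimated mean here but says nothing about nonresponse or clustering, which need separate treatment.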
In this course we shall discuss and illustrate these problems and consider a large number of alternative approaches to address them. Available and new computer programs used to implement these approaches will be discussed, with examples using simulated and real data sets.
This course is based on the outcomes of a two-year research project conducted by the four presenters and funded by ESRC grant No. RES-062-23-2316.
- Provide participants with an understanding of the unique features of survey data and why they need to be modeled differently.
- Familiarise participants with existing and new approaches developed to deal with these problems and discuss the pros and cons of each approach
- Discuss goodness of fit statistics and other measures that can be used for choosing one or more appropriate approaches for a given application
- Review available and new computer programs that can be used for implementing the various approaches and discuss their main features*
- Illustrate the application of the new approaches and the use of computer programs using simulated and real data sets.
(*) The course will include several lab sessions.
This course will include the following topics:
- Problems associated with the modeling of survey data
- Design-based approaches to deal with these problems
- Parametric model-based methods
- Semi-parametric and empirical likelihood methods
- Model diagnostics and comparisons between different approaches
- Review basic features of existing and new computer programs
- Illustrate application of the various approaches using several lab sessions
The course is aimed at social science researchers and statisticians at all career stages in the academic, government and private sectors who wish to obtain an understanding of a range of possible approaches for the analysis of complex survey data, how to apply them and how to choose between them using both theoretical and empirical decision rules.
Participants are expected to have basic knowledge of statistical modeling and, in particular, of the basic steps of fitting linear regression and logistic regression models. The course will contain several lab sessions to enable the participants to work through examples. Most of the lab sessions will use the R software, but no prior knowledge of R is required. (For course participants new to the package, an exercise sheet will be provided prior to the course to enable them to work through examples.)
Participants will receive written course notes.
Danny Pfeffermann is Professor of Statistics at Southampton Statistical Sciences Research Institute (S3RI), University of Southampton, and at the Hebrew University of Jerusalem, Israel, and is a consultant for the US Bureau of Labor Statistics. His main research areas are analytic inference from complex sample surveys, seasonal adjustment and trend estimation, small area estimation, observational studies and not-missing-at-random nonresponse. He is a Fellow of the American Statistical Association and the 2011 recipient of the Waksberg award for outstanding contributions to survey methodology. Danny has published many articles on the modelling of survey data and is the co-editor of the new two-volume Handbook of Statistics on "Sample Surveys". He was Associate Editor of the Journal of Statistical Planning and Inference and of Biometrika, and is presently Associate Editor of Survey Methodology. Danny served for two years as the president of the Israel Statistical Society and is the president-elect of the International Association of Survey Statisticians (IASS).
Patrick Sturgis is Professor of Research Methodology and Director of the ESRC National Centre for Research Methods. He has a BA in Psychology from the University of Liverpool and a Master of Science and PhD in social psychology from the London School of Economics. His research interests are in the areas of survey methodology, statistical modelling, public opinion and political behaviour, public understanding of science and technology, social capital, and social mobility. He is Principal Investigator of the Wellcome Monitor Study and President of the European Survey Research Association (ESRA).
Moshe Feder is a Research Fellow at Southampton Statistical Sciences Research Institute (S3RI), University of Southampton. Formerly, he was a Senior Research Statistician at the Research Triangle Institute in the USA, where he acquired extensive experience in modelling a diverse range of data, developing new methodologies and adapting existing methods. He is an elected member of the International Statistical Institute and an author of scientific publications on topics including survey data analysis, time series analysis, response reliability and modelling of birth outcomes. He has also headed a number of statistics and analytics teams. Moshe has an extensive background in both mathematics and computing.
David Holmes is Principal Experimental Officer in the Division of Social Statistics and Demography at the University of Southampton. He has a BSc in Mathematics from the University of Exeter, an MSc in Applied Statistics and a PhD in Social Statistics from the University of Southampton. His research interests are in the area of survey methodology, and he currently provides advice to Nexus (formerly the Tyne and Wear Passenger Transport Executive) on the design and estimation methods used for a continuous passenger monitoring survey carried out in Tyne and Wear. He has also worked on many statistical methodology projects as part of S3RI's contract with the Office for National Statistics.
£30 per day for UK-registered students. £60 per day for staff from UK academic institutions (including research centres), ESRC-funded researchers, staff of UK-registered charitable organisations and all other participants. The fee for all other participants (usually £220 per day) is reduced because the course is funded by ESRC grant No. RES-062-23-2316. The course fee includes course materials, lunches and morning and afternoon refreshments. Travel and accommodation are to be arranged and paid for by the participant.
Location and Accommodation:
The course will be held at the Southampton Statistical Sciences Research Institute, Building 39, University of Southampton, Southampton, SO17 1BJ. Participants will need to make their own accommodation arrangements. Further information on accommodation and course location is available here.
The course will begin with coffee and registration at 9.30 a.m. on the first day and finish at 5.00 p.m. on the last day.
For participants who wish to do background reading, the following references may be useful. Please note that although reading is optional, participants who have little statistical background in the analysis of survey data are strongly advised to look at some of these references.
Binder, D. and Roberts, G. (2009). Design and model based inference for model parameters. In Handbook of Statistics 29B; Sample Surveys: Inference and Analysis (Eds., D. Pfeffermann and C.R. Rao), Amsterdam: North Holland, 33-54.
Chambers, R.L. and Skinner, C.J. (Eds.) (2003). Analysis of survey data. New York: Wiley.
Godambe, V.P. and Thompson, M.E. (2009). Estimating functions and survey sampling. In Handbook of Statistics 29B; Sample Surveys: Inference and Analysis (Eds., D. Pfeffermann and C.R. Rao), Amsterdam: North Holland, 83-101.
Pfeffermann, D. (1993). The role of sampling weights when modeling survey data. International Statistical Review, 61, 317-337.
Pfeffermann, D. (2011). Modelling of complex survey data: Why model? Why is it a problem? How can we approach it? Survey Methodology, 37, 115-136.