Citation

Project Citation: 

Pankowska, Paulina. Correcting for measurement error in categorical, longitudinal data using hidden Markov models. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2020-07-28. https://doi.org/10.3886/E120363V1

Persistent URL:  http://doi.org/10.3886/E120363V1

Project Description

Project Title:  Correcting for measurement error in categorical, longitudinal data using hidden Markov models
Summary:  This project focuses on the problem of measurement error and investigates the feasibility of using hidden Markov models (HMMs) to correct for such error in categorical, longitudinal data. We first illustrate how measurement error poses a substantial threat to the validity and accuracy of estimates. We then demonstrate the need to use multiple-indicator HMM specifications, which can account for the nonignorable presence of systematic/dependent errors. Finally, we show that the use of such extended models is feasible: even though such HMMs require record linkage, linkage error is largely not a problem. Furthermore, while their implementation is complex and time-consuming, it can be simplified because error parameters can be re-used for a number of years.
Funding Sources:  Statistics Netherlands (CBS)
Contributor(s):  Consultant to project: Daniel Oberski, Utrecht University; Consultant to project: Dimitris Pavlopoulos, Vrije Universiteit Amsterdam; Consultant to project: Bart Bakker, Statistics Netherlands (CBS)
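
As a rough illustration of the multiple-indicator idea mentioned in the Summary, the sketch below computes the likelihood of two observed category sequences (say, one from a survey and one from a register) that both measure the same latent state. It is a minimal forward-algorithm sketch, not the project's model: it assumes the two indicators err independently given the latent state, whereas the project studies extensions with systematic/dependent errors. The function name and all probabilities are hypothetical.

```python
def forward_likelihood(initial, transition, emission_a, emission_b, obs_a, obs_b):
    """Likelihood of two observed sequences under a two-indicator HMM.

    Assumes the two indicators are conditionally independent given the
    latent state (a simplification made for this sketch).
    """
    n_states = len(initial)
    # alpha[s] = P(observations up to time t, latent state at t is s)
    alpha = [
        initial[s] * emission_a[s][obs_a[0]] * emission_b[s][obs_b[0]]
        for s in range(n_states)
    ]
    for t in range(1, len(obs_a)):
        alpha = [
            sum(alpha[r] * transition[r][s] for r in range(n_states))
            * emission_a[s][obs_a[t]]
            * emission_b[s][obs_b[t]]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state example (e.g. two contract types); values are made up.
INITIAL = [0.7, 0.3]
TRANSITION = [[0.95, 0.05], [0.10, 0.90]]    # rows: from-state, columns: to-state
SURVEY_ERROR = [[0.92, 0.08], [0.05, 0.95]]  # rows: true state, columns: reported
REGISTER_ERROR = [[0.97, 0.03], [0.12, 0.88]]

likelihood = forward_likelihood(
    INITIAL, TRANSITION, SURVEY_ERROR, REGISTER_ERROR,
    obs_a=[0, 0, 1], obs_b=[0, 1, 1],
)
print(likelihood)
```

In this simplified setup, disagreement between the two sources at a time point is explained by misclassification in at least one of them, which is what lets the error rates be separated from true transitions.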

Scope of Project

Subject Terms:  measurement error; data linkage; hidden Markov models; latent class modeling
Linked Resource(s):  Dutch Labour Force Survey (LFS); Dutch Employment Register (ER) (The ER is an administrative dataset that combines information from various sources but consists predominantly of tax-related data provided to the Dutch Tax Authorities by employers. It is managed by the Dutch Employee Insurance Agency (UWV) and contains monthly information on all insured employees in the Netherlands, covering individual-level characteristics such as wages, benefits, and labor relations.)
Geographic Coverage:  The Netherlands
Time Period(s):  2007–2010
Universe:  Individuals on the Dutch labour market aged 25 to 55.
Data Type(s):  administrative records data; survey data

Methodology

Response Rate:  According to colleagues at Statistics Netherlands, the overall response rate in the LFS was around 61% in 2009 and 53% in 2010. However, because our analysis used a sub-sample of the LFS data, restricted to (i) individuals between 25 and 55 years of age and (ii) those who could be linked to the ER data, we do not know the response rates for our sample. Statistics Netherlands has also indicated that the LFS is subject to relatively high panel attrition, which also leads to selectivity, but the exact rates are unknown.

Although the ER cannot officially be subject to drop-out, because submitting reports is obligatory for all employers, 2,619 observations (out of a total of 133,290, roughly 2%) are missing.
Sampling:  The LFS is a sample survey.

The ER covers all individuals who are employed in the Netherlands.

Data Source:  Statistics Netherlands (CBS)
Weights:  In the LFS, the observations are weighted in two steps. First, inclusion weights are assigned to the observations; these correct for biased inclusion probabilities caused by the sampling method. Second, the final weights are constructed by adjusting for sex, age, country of origin, official place of residence, and some other regional classifications; these are used to reduce non-response bias.

In our analyses, including the weights did not significantly affect the results, so we decided to exclude them. This might not be the case in other applications, in particular when the weights vary substantially across respondents.
Unit(s) of Observation:  Individuals

File:  Linkage_paper_code_simulation_results.zip (285.2 KB, application/zip)

This material is distributed exactly as it arrived from the data depositor. ICPSR has not checked or processed this material. Users should consult the investigator(s) if further information is desired.