Data analysis

The multiple imputation method: a case study involving secondary data analysis

Salimah R Walani Director of global health programs, March of Dimes Foundation, White Plains, NY, US

Charles M Cleland Senior research scientist, New York University College of Nursing, NY, US

Aim To illustrate the use of the multiple imputation method to replace missing data, using a secondary data analysis study as an example.

Background Most large public datasets have missing data, which need to be handled by researchers conducting secondary data analysis studies. Multiple imputation is a technique widely used to replace missing values while preserving the sample size and sampling variability of the data.

Data source The 2004 National Sample Survey of Registered Nurses.

Review methods The authors created a model to impute missing values using the chained equation method. They used imputation diagnostics procedures and conducted regression analysis of imputed data to determine the differences between the log hourly wages of internationally educated and US-educated registered nurses.
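The chained equation idea can be sketched in a few lines. The example below is a simplified illustration and not the authors' model: it assumes purely numeric variables, uses ordinary least squares for every conditional model, and adds residual noise rather than drawing regression parameters from their posterior, as a full implementation would. The function name `chained_imputation` and the synthetic data are hypothetical.

```python
import numpy as np

def chained_imputation(X, n_iter=10, seed=0):
    """Simplified chained-equation imputation for a numeric 2-D array with NaNs."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # Initialise every missing cell with the mean of its column.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Regress column j on all other columns (plus an intercept),
            # fitting only on rows where column j was actually observed.
            others = np.delete(filled, j, axis=1)
            design = np.column_stack([np.ones(len(others)), others])
            beta, *_ = np.linalg.lstsq(design[~miss_j], filled[~miss_j, j], rcond=None)
            resid = filled[~miss_j, j] - design[~miss_j] @ beta
            sigma = resid.std(ddof=design.shape[1])
            # Replace missing cells with prediction plus random noise, so the
            # imputed values keep some of the variability of the observed data.
            noise = rng.normal(0.0, sigma, size=miss_j.sum())
            filled[miss_j, j] = design[miss_j] @ beta + noise
    return filled

# Hypothetical synthetic data with about 10% of cells set to missing.
rng = np.random.default_rng(42)
data = rng.normal(size=(200, 3))
data[:, 2] += 0.8 * data[:, 0]
mask = rng.random(data.shape) < 0.1
data_missing = data.copy()
data_missing[mask] = np.nan

# Different seeds give different plausible completions of the same dataset.
imputed_sets = [chained_imputation(data_missing, seed=s) for s in range(5)]
```

Each pass cycles through the variables in turn, so every imputation model is conditioned on the most recent imputed values of the other variables. Repeating the procedure with different random seeds yields several completed datasets rather than a single one.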

Discussion The authors used multiple imputation procedures to replace missing values in a large dataset with 29,059 observations. Five multiply imputed datasets were created. Imputation diagnostics using time series and density plots showed that imputation was successful. The authors also present an example of using the multiply imputed datasets in a regression analysis to answer a substantive research question.
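The abstract does not state how results from the five datasets were combined. A common approach is Rubin's rules, in which the pooled estimate is the mean of the per-dataset estimates and the total variance is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below assumes that approach; the function name `pool_rubin` and the coefficient values are hypothetical and are not results from the study.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules."""
    q = np.asarray(estimates, dtype=float)   # per-dataset point estimates
    u = np.asarray(variances, dtype=float)   # per-dataset squared standard errors
    m = len(q)
    q_bar = q.mean()                         # pooled point estimate
    u_bar = u.mean()                         # within-imputation variance
    b = q.var(ddof=1)                        # between-imputation variance
    total_var = u_bar + (1 + 1 / m) * b      # total variance
    return q_bar, np.sqrt(total_var)

# Hypothetical regression coefficients and variances from five imputed datasets.
estimates = [0.10, 0.12, 0.11, 0.09, 0.13]
variances = [0.0004] * 5
pooled_estimate, pooled_se = pool_rubin(estimates, variances)
```

The pooled standard error is larger than any single-dataset standard error whenever the estimates differ across datasets, which is how the uncertainty introduced by the missing data is carried into the final inference.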

Conclusion Multiple imputation is a powerful technique for imputing missing values in large datasets while preserving the sample size and variance of the data. Even though the chained equation method involves complex statistical computations, recent innovations in software and computation have made it possible for researchers to apply this technique to large datasets.

Implications for research/practice The authors recommend that nurse researchers use multiple imputation methods for handling missing data to improve the statistical power and external validity of their studies.

Nurse Researcher. 22, 5, 13-19. doi: 10.7748/nr.22.5.13.e1319

Peer review

This article has been subject to double-blind peer review and checked using antiplagiarism software.

Conflict of interest

None declared

Received: 23 April 2014

Accepted: 24 June 2014

Keywords:

Chained equation - missing data - multiple imputation - regression analysis - secondary data analysis - statistical methods - validity

15 May 2015 / Vol 22 issue 5