Impact of Imputation of Missing Data on Estimation of Survival Rates: An Example in Breast Cancer

Mohammad Reza Baneshi; AR Talei

International Journal of Cancer Management

The Official Journal of Cancer Research Center (CRC), Shahid Beheshti University of Medical Sciences

Author(s):

Mohammad Reza Baneshi¹,*, AR Talei²

¹ Health School, Kerman University of Medical Sciences, Department of Biostatistics and Epidemiology, Kerman, Iran

² Shahid Faghihi Hospital, Shiraz University of Medical Sciences, Shiraz, Iran

International Journal of Cancer Management: Vol. 3, issue 3; e80700

Published online: Sep 30, 2010

Article type: Research Article

Received: Jan 12, 2010

Accepted: Jun 21, 2010

How to Cite: Baneshi M R, Talei A. Impact of Imputation of Missing Data on Estimation of Survival Rates: An Example in Breast Cancer. Int J Cancer Manag. 2010;3(3):e80700.

Abstract

Background: Multifactorial regression models are frequently used in medicine to estimate the survival rates of patients across risk groups. However, their results are not generalisable if the assumptions required for model development are not satisfied. Missing data are a common problem in pathology. The aim of this paper is to address the danger of excluding cases with missing data and to highlight the importance of imputing missing data before developing multifactorial models.

Methods: This study was performed on 310 breast cancer patients diagnosed in Shiraz (Southern Iran). A complete-case Cox regression model was fitted, and a prognostic index was calculated from it to categorise the patients into 3 risk groups. Missing data were then imputed 10 times using the Multivariate Imputation via Chained Equations (MICE) method. Modelling was repeated on the imputed data sets to assign patients to risk groups. The estimated actuarial Overall Survival (OS) rates from the complete-case analysis and from the imputed data sets were compared.
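The abstract does not say how results from the 10 imputed data sets were combined. One common approach is Rubin's rules, which average the per-data-set estimates and add the between-imputation variance to the average within-imputation variance. The sketch below assumes that approach. The function name `pool_rubin` and all example numbers are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one estimate across m imputed data sets using Rubin's rules.

    estimates: point estimate obtained from each imputed data set
    variances: squared standard error of each of those estimates
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least 2 imputations")
    pooled = mean(estimates)           # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical OS estimates for one risk group from 10 imputed data sets.
os_estimates = [0.80, 0.82, 0.79, 0.81, 0.83, 0.80, 0.78, 0.82, 0.81, 0.80]
os_variances = [0.0016] * 10           # assumed: each standard error is 0.04
estimate, std_error = pool_rubin(os_estimates, os_variances)
print(f"pooled OS = {estimate:.3f}, SE = {std_error:.3f}")
```

The pooled standard error is never smaller than the average within-imputation standard error, because the spread between imputations is added on top. In practice, survival proportions are often transformed before pooling; this sketch pools on the raw scale for simplicity.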

Results: Cases with at least one missing datum had significantly better survival than cases with complete data. Relative to the imputed data sets, the complete-case analysis underestimated the OS rate in all risk groups. In addition, its confidence intervals were wider, indicating a loss of precision due to the reduction in sample size and power.
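The link between sample size and interval width can be illustrated with a minimal sketch of an actuarial (life-table) survival estimate. It assumes Greenwood's formula for the standard error, which the abstract does not specify, and it uses hypothetical counts rather than the study data. The function name `actuarial_survival` is likewise illustrative.

```python
from math import sqrt


def actuarial_survival(intervals):
    """Actuarial (life-table) survival with Greenwood standard errors.

    intervals: list of (n_entering, deaths, censored), one tuple per
               follow-up interval, in time order.
    Returns a list of (cumulative_survival, standard_error), one per interval.
    """
    survival = 1.0
    greenwood_sum = 0.0
    results = []
    for n_entering, deaths, censored in intervals:
        # Actuarial adjustment: censored cases are counted as at risk
        # for half of the interval.
        n_effective = n_entering - censored / 2.0
        if n_effective <= deaths:
            raise ValueError("Greenwood's formula is undefined for this interval")
        survival *= 1.0 - deaths / n_effective
        greenwood_sum += deaths / (n_effective * (n_effective - deaths))
        results.append((survival, survival * sqrt(greenwood_sum)))
    return results


# Hypothetical life tables with the same event proportions (10% per interval).
full_sample = [(400, 40, 0), (360, 36, 0)]
reduced_sample = [(100, 10, 0), (90, 9, 0)]   # a quarter of the cases remain

for label, table in (("full", full_sample), ("reduced", reduced_sample)):
    surv, se = actuarial_survival(table)[-1]
    print(f"{label}: OS = {surv:.3f}, 95% CI half-width = {1.96 * se:.3f}")
```

With identical event proportions, keeping only a quarter of the cases doubles the standard error. This illustrates the precision loss only; it does not reproduce the bias that arises when the excluded cases differ in survival from the retained ones.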

Conclusion: The results highlight the danger of excluding cases with missing data. Imputation of missing data avoids biased estimates, increases the precision of estimates, and improves the generalisability of results to other similar populations.

Fulltext

Full text is available in PDF.
