Dichotomisation of Continuous Data: Review of Methods, Advantages, and Disadvantages

Mohammad Reza Baneshi; AR Talei

International Journal of Cancer Management

The Official Journal of Cancer Research Center (CRC), Shahid Beheshti University of Medical Sciences

Author(s):

Mohammad Reza Baneshi¹,*

AR Talei²

¹ Health School, Kerman University of Medical Sciences, Department of Biostatistics and Epidemiology, Kerman, Iran

² Shahid Faghihi Hospital, Shiraz University of Medical Sciences, Shiraz, Iran

International Journal of Cancer Management: Vol. 4, issue 1; e80724

Published online: Mar 30, 2011

Article type: Research Article

Received: Aug 29, 2010

Accepted: Dec 04, 2010

How to Cite: Baneshi MR, Talei A. Dichotomisation of Continuous Data: Review of Methods, Advantages, and Disadvantages. Int J Cancer Manag. 2011;4(1):e80724. doi:

Abstract

Background: In medical research, dichotomisation of continuous variables is a widely used approach. However, it has been argued that dichotomisation may waste information. The aim of this paper is to review the main methods for dichotomising continuous data, to address practical issues surrounding these methods, and to investigate whether dichotomisation is always a bad idea.

Methods: A total of 310 breast cancer patients were recruited. Information on three categorical variables and one continuous variable (age at diagnosis) was available. Missing data were imputed using the Multivariate Imputation by Chained Equations (MICE) method. A minimum P-value method was then applied to dichotomise age. Cox regression models were fitted using either the dichotomised or the continuous version of age, together with the other three variables. The models were compared in terms of discrimination ability, goodness of fit, and classification improvement.
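The minimum P-value idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the groups are compared with a two-sample log-rank test (the abstract does not name the test), the function names `logrank_p` and `min_p_cutpoint` and the `min_group` parameter are hypothetical, and the toy data are invented.

```python
import math


def logrank_p(times, events, group):
    """Two-sided log-rank P-value comparing group == 1 with group == 0."""
    obs_minus_exp = 0.0
    variance = 0.0
    # Loop over the distinct times at which at least one event was observed.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                        # at risk, both groups
        n1 = sum(1 for ti, g in zip(times, group) if ti >= t and g)  # at risk, group 1
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)  # events at t
        d1 = sum(1 for ti, e, g in zip(times, events, group)
                 if ti == t and e and g)                             # events at t, group 1
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / variance
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


def min_p_cutpoint(values, times, events, min_group=0.2):
    """Scan candidate cutpoints and return (cutpoint, P-value) with the smallest P."""
    n = len(values)
    best = (None, 1.0)
    for c in sorted(set(values))[:-1]:
        group = [1 if v > c else 0 for v in values]
        k = sum(group)
        # Skip splits that leave either group with too few patients.
        if k < min_group * n or n - k < min_group * n:
            continue
        p = logrank_p(times, events, group)
        if p < best[1]:
            best = (c, p)
    return best


# Invented toy data: age at diagnosis, follow-up in months, event indicator.
ages = [31, 36, 39, 42, 45, 47, 49, 52, 55, 58, 61, 66]
times = [14, 9, 20, 11, 16, 7, 40, 35, 44, 28, 50, 38]
events = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1]

cut, p = min_p_cutpoint(ages, times, events)
print(f"selected split: age > {cut}, P = {p:.4f}")
```

Because the split is chosen from many candidate tests, the smallest P-value is optimistically biased and should not be read as a conventional P-value.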

Results: For age, an optimal split at 47 years was found. This split was close to the age at menopause among women in Shiraz (48 years), so it had a biological interpretation. The stability of the optimal split was confirmed in a bootstrap study. The model using the dichotomised version of age showed higher discrimination ability and better goodness of fit. Furthermore, the dichotomised model assigned 14% of the patients who were alive to a more appropriate risk group.
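Discrimination ability of survival models is often summarised with a concordance index. The abstract does not state which measure was used, so the sketch below assumes Harrell's C purely for illustration; the function name `concordance_index`, the risk scores, and the toy data are hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell's C: share of usable pairs in which the higher-risk patient fails first."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / usable if usable else float("nan")


# Invented toy data: follow-up time, event indicator, and two sets of risk scores.
times = [5, 8, 12, 20, 25]
events = [1, 1, 0, 1, 0]
risk_continuous = [0.9, 0.4, 0.5, 0.3, 0.1]  # e.g. from a model with continuous age
risk_dichotomised = [1, 1, 0, 0, 0]          # e.g. from a model with a binary age term

c_cont = concordance_index(times, events, risk_continuous)
c_dich = concordance_index(times, events, risk_dichotomised)
print(f"C (continuous) = {c_cont:.3f}, C (dichotomised) = {c_dich:.3f}")
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so two models fitted to the same patients can be compared by their C values.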

Discussion: Dichotomisation of continuous data is a contentious issue. We have shown that dichotomisation might improve model performance when the split has a biological interpretation. More research is needed to identify the situations in which dichotomisation works well.


