Chen RY, Westfall AO, Hardin AM, Miller-Hardwick C, Stringer JSA, Raper JL, Vermund SH, Gotuzzo E, Allison J, Saag MS
A total lymphocyte count (TLC) of 1200 cells/mL has been used as a surrogate for a CD4 count of 200 cells/microL in resource-limited settings with varying results. We developed a more effective method based on a decision tree algorithm to classify subjects.
A decision tree was used to develop models with the variables TLC, hemoglobin, platelet count, gender, body mass index, and antiretroviral treatment status of subjects from the University of Alabama at Birmingham (UAB) observational database. Models were validated on data from the Birmingham Veterans Affairs Medical Center (BVAMC) and Zambia, with primary decision trees also generated from these data.
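The basic step that a decision tree repeats at each node, choosing a threshold on one variable that best separates the two classes, can be sketched as follows. This is a minimal illustration only: the abstract does not state which tree algorithm or split criterion was used, so the Gini impurity criterion (as in CART-style trees) is an assumption, and the function names and the toy TLC values and labels are made up for the example rather than taken from the study.

```python
from typing import List, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a list of 0/1 labels (0.0 for an empty or pure list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`.

    Candidate thresholds are the midpoints between consecutive distinct values.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (distinct[0], float("inf"))
    for lo, hi in zip(distinct, distinct[1:]):
        thr = (lo + hi) / 2.0
        left = [y for v, y in pairs if v <= thr]
        right = [y for v, y in pairs if v > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (thr, score)
    return best


# Hypothetical toy data: TLC values and whether CD4 was <= 200 (1) or not (0).
tlc = [800, 1000, 1100, 1500, 1600, 2000, 2400, 900]
low_cd4 = [1, 1, 1, 1, 0, 0, 0, 1]

threshold, impurity = best_split(tlc, low_cd4)
print(threshold, impurity)
```

A full tree would apply the same search to every candidate variable (TLC, hemoglobin, platelet count, and so on) and then recurse into each resulting subgroup.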
A total of 1189 patients from the UAB observational database were included. The UAB decision tree identified CD4 counts ≤200 cells/microL better than a TLC cut-point of 1200 cells/mL, based on the area under the receiver operating characteristic curve (P < 0.0001). When applied to data from the BVAMC and Zambia, the UAB-based decision tree performed better than the TLC cut-point of 1200 cells/mL (BVAMC: P < 0.0001; Zambia: P = 0.0009) but worse than a decision tree based on local data (BVAMC: P ≤ 0.0001; Zambia: P ≤ 0.0001).
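To illustrate how two yes/no classification rules can be compared by area under the curve, the sketch below uses the fact that a rule with a single operating point has an AUC equal to the mean of its sensitivity and specificity. Everything specific here is an assumption for illustration: the patient records are invented, the two-level tree rule and its thresholds (1700 for TLC, 11.0 for hemoglobin) are hypothetical and not the published tree, and no significance test is performed, since the abstract does not describe how the P values were obtained.

```python
from typing import List, Tuple


def sens_spec(predicted: List[bool], actual: List[bool]) -> Tuple[float, float]:
    """Sensitivity and specificity of binary predictions against true labels."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    return tp / (tp + fn), tn / (tn + fp)


def single_point_auc(predicted: List[bool], actual: List[bool]) -> float:
    """AUC of a classifier with one operating point: (sensitivity + specificity) / 2."""
    sens, spec = sens_spec(predicted, actual)
    return (sens + spec) / 2.0


# Invented records: (TLC, hemoglobin in g/dL, CD4 <= 200?)
records = [
    (800, 10.0, True),
    (1000, 12.5, True),
    (1500, 10.2, True),
    (1600, 9.8, True),
    (1400, 13.5, False),
    (2000, 14.0, False),
    (1100, 13.0, False),
    (2400, 12.0, False),
]
actual = [low for _, _, low in records]

# Rule A: the fixed TLC cut-point of 1200.
cutpoint_pred = [tlc <= 1200 for tlc, _, _ in records]

# Rule B: a hypothetical two-level tree that also uses hemoglobin.
tree_pred = [tlc <= 1200 or (tlc <= 1700 and hgb < 11.0) for tlc, hgb, _ in records]

auc_cutpoint = single_point_auc(cutpoint_pred, actual)
auc_tree = single_point_auc(tree_pred, actual)
print(auc_cutpoint, auc_tree)
```

On these invented records the tree rule recovers the two low-CD4 patients that the cut-point misses, which raises sensitivity without changing specificity.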
A decision tree algorithm based on local data identifies low CD4 cell counts better than one developed from a different population or a TLC cut-point of 1200 cells/mL.