Development of an HIV Risk Prediction Model Using Electronic Health Record Data from an Academic Health System in the Southern United States.

BACKGROUND

HIV pre-exposure prophylaxis (PrEP) is underutilized in the southern United States. Rapid identification of individuals at elevated risk of an HIV diagnosis using electronic health record (EHR)-based tools may increase PrEP uptake in the region.

METHODS

Using machine learning, we developed EHR-based models to predict incident HIV diagnosis as a surrogate for PrEP candidacy. We included patients from a southern medical system with encounters between October 2014 and August 2016, and trained the models to predict incident HIV diagnosis between September 2016 and August 2018. We obtained 74 EHR variables as potential predictors. We compared Extreme Gradient Boosting (XGBoost) with least absolute shrinkage and selection operator (LASSO) logistic regression models, and assessed performance, overall and among women, using the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC).
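
The two performance metrics named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' evaluation code: the function names `auroc` and `average_precision` and the toy labels and scores are hypothetical, and AUPRC is approximated here by average precision, which is one common estimator.

```python
def auroc(labels, scores):
    """Probability that a random positive scores higher than a random
    negative, counting ties as half a win (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive,
    used here as an estimate of AUPRC. Tied scores keep their input order."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: 2 positives among 8 patients.
toy_labels = [0, 0, 1, 0, 1, 0, 0, 0]
toy_scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05, 0.30, 0.15]

toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(f"AUROC = {toy_auroc:.3f}, average precision = {toy_ap:.3f}")
```

The pairwise AUROC loop is quadratic in the number of patients, which is fine for a toy example but would need a rank-based implementation for a cohort of the size studied here.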

RESULTS

Of 998,787 eligible patients, 162 had an incident HIV diagnosis, of whom 49 were women. The XGBoost model outperformed the LASSO model for the total cohort, achieving an AUROC of 0.89 and an AUPRC of 0.01. In the female-only cohort, the XGBoost model achieved an AUROC of 0.78 and an AUPRC of 0.00025. The most predictive variables for the overall cohort were race, sex, and male partner. The strongest positive predictors for the female-only cohort were history of pelvic inflammatory disease, drug use, and tobacco use.
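
As an interpretive aid that is not part of the source analysis, the overall AUPRC can be set against the outcome prevalence, since an uninformative classifier is expected to reach an AUPRC of roughly the prevalence. The sketch below uses only the counts reported above; the variable names are hypothetical. The abstract does not report the number of eligible women, so the same comparison cannot be made for the female-only cohort from these figures alone.

```python
# Counts and AUPRC as reported for the total cohort.
eligible_patients = 998_787
incident_diagnoses = 162
reported_auprc = 0.01

# Outcome prevalence, which approximates the AUPRC of a random classifier.
prevalence = incident_diagnoses / eligible_patients
per_100k = prevalence * 100_000

# How many times the reported AUPRC exceeds that chance-level baseline.
lift_over_chance = reported_auprc / prevalence

print(f"Prevalence: {per_100k:.1f} per 100,000 patients")
print(f"Reported AUPRC is about {lift_over_chance:.0f} times the prevalence")
```

On these figures the prevalence is roughly 16 per 100,000, so an AUPRC of 0.01, although small in absolute terms, sits well above the chance-level baseline.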

CONCLUSIONS

Our machine-learning models predicted incident HIV diagnoses effectively, including among women. This study establishes the feasibility of using these models to identify persons most suitable for PrEP in the South.

Abbreviation
Clin Infect Dis
Publication Date
2022-09-19
PubMed ID
36125084
Medium
Print-Electronic
Authors
Burns CM, Pung L, Witt D, Gao M, Sendak M, Balu S, Krakower D, Marcus JL, Okeke NL, Clement ME