A hybrid approach to identify subsequent breast cancer using pathology and automated health information data.

PURPOSE

Many cancer registries do not capture recurrence; thus, outcome studies have often relied on time-intensive and costly manual chart reviews. Our goal was to build an effective and efficient method to reduce the number of chart reviews needed when identifying subsequent breast cancer (BC) using pathology and electronic health records. We evaluated our methods in an independent sample.

METHODS

We developed methods for identifying subsequent BC (recurrence or second primary) using a cohort of 17,245 women diagnosed with early-stage BC from 2 health plans. We combined information from pathology report reviews with an automated data algorithm, which was used to identify subsequent BC for those lesions without pathologic confirmation. Test characteristics were determined for a developmental set (N=175) and a test set (N=500).

RESULTS

Sensitivity and specificity of our hybrid approach were robust [96.7% (87.6%-99.4%) and 92.1% (85.1%-96.1%), respectively] in the developmental set. In the test set, the sensitivity, specificity, and negative predictive value were also high [96.9% (88.4%-99.5%), 92.4% (89.4%-94.6%), and 99.5% (98.0%-99.0%), respectively]. The positive predictive value was lower [65.6% (55.2%-74.8%)]. Chart review was required for 10.9% of the 17,245 women; 2,946 (17.0%) women developed subsequent BC over a 14-year period. The date of subsequent BC identified by the algorithm was concordant with full chart reviews.
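
The test characteristics reported above are all ratios taken from a 2x2 table of algorithm result against chart-review result. The following is a minimal sketch of that arithmetic, not the authors' code. The function names are hypothetical, the counts are illustrative values chosen only to be roughly in line with the reported test-set percentages (they do not come from the study), and the Wilson score interval is an assumption, since the abstract does not say which interval method was used.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed method, ~95% for z=1.96)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


def diagnostic_metrics(tp, fp, fn, tn):
    """Return each test characteristic as (estimate, (lower, upper))."""
    cells = {
        "sensitivity": (tp, tp + fn),  # true positives among all with subsequent BC
        "specificity": (tn, tn + fp),  # true negatives among all without subsequent BC
        "ppv": (tp, tp + fp),          # true positives among all flagged positive
        "npv": (tn, tn + fn),          # true negatives among all flagged negative
    }
    return {
        name: (num / den, wilson_interval(num, den))
        for name, (num, den) in cells.items()
    }


# Hypothetical counts for a 500-woman test set (illustrative only, not study data).
metrics = diagnostic_metrics(tp=63, fp=33, fn=2, tn=402)

for name, (estimate, (lower, upper)) in metrics.items():
    print(f"{name}: {estimate:.1%} ({lower:.1%}-{upper:.1%})")
```

With counts like these, a low prevalence of subsequent BC in the sample keeps the positive predictive value well below the sensitivity even when specificity is above 90%, which matches the pattern in the reported results.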

CONCLUSIONS

We developed an efficient and effective hybrid approach that reduced by approximately 90% the number of charts requiring manual review to determine subsequent BC occurrence and disease-free survival time.

Investigators

Suzanne Fletcher

Abbreviation

Med Care

Publication Date

2015-04-01

Page Numbers

380-5

Pubmed ID

25769058

Authors

Haque R, Shi J, Schottinger JE, Ahmed SA, Chung J, Avila C, Lee VS, Cheetham TC, Habel LA, Fletcher SW, Kwan ML