PURPOSE
The purpose of this review is to assist researchers in developing, using, and interpreting case-identifying algorithms in electronic healthcare databases.
METHODS
We review clinical characteristics of health outcomes, data settings and informatics, and epidemiologic and statistical methods as they pertain to the development and use of case-identifying algorithms.
RESULTS
We offer a framework for thinking critically about the use of electronic health insurance data and electronic health records to identify the occurrence of health outcomes. The accuracy of case ascertainment in database research depends on many factors, including clinical and behavioral aspects of the health outcome and details of database construction that affect the completeness and reliability of database content. Existing methods for diagnostic and screening tests, misclassification, validation studies, and predictive modelling can be usefully applied to improve case ascertainment in database research.
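As an illustration of how diagnostic-test methods carry over to a validation study, the following minimal sketch computes the standard accuracy measures of a case-identifying algorithm from a 2x2 table that compares algorithm output with a reference standard such as chart review. The function name and the counts are hypothetical and chosen only for illustration; they are not results from this review.

```python
def validation_metrics(tp, fp, fn, tn):
    """Accuracy measures for a case-identifying algorithm.

    tp: algorithm-positive, confirmed case by the reference standard
    fp: algorithm-positive, not a case by the reference standard
    fn: algorithm-negative, confirmed case by the reference standard
    tn: algorithm-negative, not a case by the reference standard
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of true cases the algorithm finds
        "specificity": tn / (tn + fp),  # share of non-cases correctly excluded
        "ppv": tp / (tp + fp),          # share of flagged records that are true cases
        "npv": tn / (tn + fn),          # share of unflagged records that are non-cases
    }


# Hypothetical validation sample of 1000 reviewed records.
metrics = validation_metrics(tp=80, fp=20, fn=20, tn=880)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the predictive values depend on outcome prevalence in the validation sample, so they may not transfer to a population in which the outcome is more or less common.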
CONCLUSIONS
Good case-identifying algorithms are based on a sound understanding of care-seeking behavior and patterns of clinical diagnosis and treatment in the study population, and of the construction and characteristics of the database. Researchers should use quantitative bias analyses to account for the performance characteristics of case-identifying algorithms and their impact on study results.
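One simple form of quantitative bias analysis can be sketched as follows. It assumes a cohort with a binary exposure, nondifferential outcome misclassification, and known sensitivity and specificity of the algorithm. The expected number of true cases is back-calculated from the observed count, and the risk ratio is recomputed. All function names and numbers are hypothetical; this is one possible approach under those assumptions, not a method prescribed by this review.

```python
def corrected_cases(observed_cases, total, sensitivity, specificity):
    """Back-calculate the expected number of true cases in a group.

    Uses: observed = Se * true + (1 - Sp) * (total - true), solved for true.
    """
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("Sensitivity + specificity must exceed 1.")
    return (observed_cases - (1 - specificity) * total) / denom


def corrected_risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp, se, sp):
    """Risk ratio after correcting both groups for outcome misclassification."""
    a = corrected_cases(cases_exp, n_exp, se, sp)
    b = corrected_cases(cases_unexp, n_unexp, se, sp)
    if a <= 0 or b <= 0:
        raise ValueError("Bias parameters imply a non-positive case count.")
    return (a / n_exp) / (b / n_unexp)


# Hypothetical cohort: 98 of 1000 exposed and 59 of 1000 unexposed are flagged.
observed_rr = (98 / 1000) / (59 / 1000)
adjusted_rr = corrected_risk_ratio(98, 1000, 59, 1000, se=0.80, sp=0.98)
print(f"observed RR: {observed_rr:.2f}, bias-adjusted RR: {adjusted_rr:.2f}")
```

In this hypothetical example the observed risk ratio is biased toward the null, as expected under nondifferential misclassification with imperfect specificity. In practice the bias parameters are themselves uncertain, so repeating the calculation over a plausible range of sensitivity and specificity values is more informative than a single correction.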