Facilitating cancer research using natural language processing of pathology reports

Hua Xu, Kristin Anderson, Victor R. Grann, Carol Friedman

Research output: Chapter in Book/Report/Conference proceeding > Chapter

30 Scopus citations


Many ongoing clinical research projects, such as cancer studies, involve manual capture of information in surgical pathology reports so that the information can be used to determine the eligibility of recruited patients for the study and to provide other information, such as cancer prognosis. Natural language processing (NLP) systems offer an automated alternative to manual coding, but pathology reports have certain features that are difficult for NLP systems. This paper describes how a preprocessor was integrated with an existing NLP system (MedLEE) in order to reduce modification to the NLP system and to improve performance. The work was done in conjunction with an ongoing clinical research project that assesses disparities and risks of developing breast cancer for minority women. The system was evaluated using manually coded data from the research project's database as a gold standard. The evaluation showed that the extended NLP system had a sensitivity of 90.6% and a precision of 91.6%. These results indicated that the system performed satisfactorily for capturing information for the cancer research project.
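
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how sensitivity and precision are commonly computed when system output is compared against a manually coded gold standard. It is not the authors' evaluation code: the function name, the (report, field, value) tuple layout, and the toy records are all hypothetical and chosen only for illustration.

```python
def sensitivity_and_precision(system, gold):
    """Compare extracted findings against a gold standard.

    Both arguments are sets of (report_id, field, value) tuples.
    This representation is an assumption made for the sketch.
    """
    true_positives = len(system & gold)
    # Sensitivity (recall): share of gold-standard findings the system captured.
    sensitivity = true_positives / len(gold) if gold else 0.0
    # Precision: share of system findings that match the gold standard.
    precision = true_positives / len(system) if system else 0.0
    return sensitivity, precision


# Hypothetical toy data, not taken from the study.
gold = {
    ("r1", "diagnosis", "ductal carcinoma"),
    ("r1", "laterality", "left"),
    ("r2", "diagnosis", "lobular carcinoma"),
    ("r2", "laterality", "right"),
}
system = {
    ("r1", "diagnosis", "ductal carcinoma"),
    ("r1", "laterality", "left"),
    ("r2", "diagnosis", "lobular carcinoma"),
    ("r2", "laterality", "left"),      # wrong value
    ("r2", "margin", "positive"),      # not in the gold standard
}

sens, prec = sensitivity_and_precision(system, gold)
print(f"sensitivity={sens:.2f} precision={prec:.2f}")
```

In this toy case three of the four gold-standard findings are recovered and three of the five system findings are correct, so the two metrics differ; the abstract's figures of 90.6% and 91.6% are the same kind of ratios computed over the project's own data.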

Original language: English (US)
Title of host publication: Studies in Health Technology and Informatics
Number of pages: 8
Edition: Pt 1
State: Published - 2004


  • Natural Language Processing
  • Pathology

