Image Descriptors for Weakly Annotated Histopathological Breast Cancer Data

Panagiotis Stanitsas, Anoop Cherian, Vassilios Morellas, Resha Tejpaul, Nikolaos Papanikolopoulos, Alexander Truskinovsky

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


Introduction: Cancerous Tissue Recognition (CTR) methodologies are continuously integrating advancements at the forefront of machine learning and computer vision, providing a variety of inference schemes for histopathological data. Histopathological data, in most cases, come in the form of high-resolution images, and thus methodologies operating at the patch level are more computationally attractive. Such methodologies capitalize on pixel-level annotations (tissue delineations) from expert pathologists, which are then used to derive labels at the patch level. In this work, we envision a digital connected health system that augments the capabilities of clinicians by providing powerful feature descriptors that may describe malignant regions.

Material and Methods: We start with a patch-level descriptor, termed Covariance-Kernel Descriptor (CKD), capable of compactly describing tissue architectures associated with carcinomas. To extend the recognition capability of the CKD to larger slide regions, we resort to a Multiple Instance Learning (MIL) framework. In that direction, we derive the Weakly Annotated Image Descriptor (WAID) as the parameters of classifier decision boundaries in the MIL framework. The WAID is computed on bags of patches corresponding to larger image regions for which binary labels (malignant vs. benign) are provided, thus obviating the need for tissue delineations.

Results: The CKD outperformed all the considered descriptors, reaching a classification accuracy (ACC) of 92.83% and an area under the curve (AUC) of 0.98. It captures higher-order correlations between features and achieved superior performance against a large collection of computer vision features on a private breast cancer dataset. The WAID outperformed all other descriptors on the Breast Cancer Histopathological database (BreakHis), where correctly classified malignant (CCM) instances reached 91.27% and 92.00% at the patient and image level, respectively, achieving state-of-the-art performance without resorting to a deep learning scheme.

Discussion: Our proposed derivation of the CKD and WAID can help medical experts accomplish their work accurately and faster than the current state of the art.
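
The abstract does not spell out how either descriptor is computed, so the following Python sketch is only an illustration of the two general ideas it mentions: a covariance-based patch descriptor, and a bag descriptor formed from the parameters of a classifier boundary. Everything in it is an assumption rather than the authors' method: the per-pixel features (color channels plus gradient magnitudes), the log-Euclidean vectorization, the ridge-regularized least-squares classifier, the random "background" set, and the function names `region_covariance`, `log_euclidean_vector`, and `bag_descriptor`.

```python
import numpy as np


def region_covariance(patch, eps=1e-6):
    """Covariance of per-pixel features for one H x W x C patch (generic sketch)."""
    patch = np.asarray(patch, dtype=np.float64)
    gray = patch.mean(axis=2)
    gy, gx = np.gradient(gray)  # derivatives along rows (y) and columns (x)
    # Assumed feature set: the C color channels plus absolute gradients.
    feats = np.dstack([patch, np.abs(gx), np.abs(gy)])
    feats = feats.reshape(-1, feats.shape[2])  # one row per pixel
    cov = np.cov(feats, rowvar=False)
    # A small ridge keeps the matrix strictly positive definite.
    return cov + eps * np.eye(cov.shape[0])


def log_euclidean_vector(cov):
    """Flatten a symmetric positive definite matrix via its matrix logarithm."""
    w, v = np.linalg.eigh(cov)
    log_cov = (v * np.log(w)) @ v.T  # V diag(log w) V^T
    return log_cov[np.triu_indices(cov.shape[0])]


def bag_descriptor(bag, background, lam=1.0):
    """Weights of a linear classifier separating one bag from a background set.

    Ridge-regularized least squares stands in for whichever classifier the
    paper actually uses; the bias term is regularized too, for simplicity.
    """
    X = np.vstack([bag, background])
    y = np.concatenate([np.ones(len(bag)), -np.ones(len(background))])
    X1 = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    A = X1.T @ X1 + lam * np.eye(X1.shape[1])
    return np.linalg.solve(A, X1.T @ y)


# Toy usage on random data: 8 RGB patches form one bag.
rng = np.random.default_rng(0)
patches = rng.random((8, 32, 32, 3))
bag = np.array([log_euclidean_vector(region_covariance(p)) for p in patches])
background = rng.normal(size=(20, bag.shape[1]))  # hypothetical reference set
w = bag_descriptor(bag, background)
print(bag.shape, w.shape)
```

With three color channels and two gradient features, each patch yields a 5 x 5 covariance matrix and a 15-dimensional vector, and the bag is summarized by a 16-dimensional weight vector (15 weights plus a bias). Only the bag needs a label in this setup, which matches the weak-annotation setting described above.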

Original language: English (US)
Article number: 572671
Journal: Frontiers in Digital Health
State: Published - Dec 7 2020

Bibliographical note

Publisher Copyright:
Copyright © 2020 Stanitsas, Cherian, Morellas, Tejpaul, Papanikolopoulos and Truskinovsky.


  • annotated data
  • connected health and computer vision
  • connected health for breast cancer
  • histopathological data
  • image descriptors

