Synthesized Image Training Techniques: On Improving Model Performance Using Confusion

Azeez Idris, Mohammed Khaleel, Wallapak Tavanapong, Piet C. De Groen

Research output: Contribution to journal › Article › peer-review

Abstract

The performance of supervised deep learning image classifiers has improved significantly with large labeled datasets and increased computing power. However, obtaining large labeled image datasets in areas such as medicine is expensive. This study seeks to improve model performance on limited labeled datasets by reducing confusion. We observed that misclassification (or confusion) tends to concentrate in specific pairs of classes. Thus, we developed a synthesized image training technique (SIT2), a novel confusion-based training framework that identifies pairs of classes with high confusion and synthesizes not-sure images from these pairs. The not-sure images are used in three new training strategies: (1) the not-sure strategy pretrains a model using not-sure images and the original training images, (2) the sure-or-not strategy pretrains with synthesized sure or not-sure images, and (3) the multi-label strategy pretrains with synthesized images but predicts the original class(es) of the synthesized images. Finally, the pretrained model is fine-tuned on the original dataset. An extensive evaluation was conducted on five medical and nonmedical datasets. Several improvements are statistically significant, which shows the promise of our confusion-based training framework.
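The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how not-sure images are synthesized or how targets are encoded, so the pixel-wise blend in `blend`, the label encodings in `strategy_targets`, and all function names here are assumptions made purely for illustration.

```python
import numpy as np

def top_confused_pairs(conf_mat, k=1):
    """Return the k unordered class pairs with the most mutual misclassifications."""
    cm = np.asarray(conf_mat, dtype=float)
    sym = cm + cm.T              # count errors in both directions
    np.fill_diagonal(sym, 0)     # ignore correct predictions
    n = cm.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(key=lambda p: sym[p], reverse=True)
    return pairs[:k]

def blend(img_a, img_b, alpha=0.5):
    """ASSUMPTION: synthesize a not-sure image as a pixel-wise convex combination."""
    return alpha * img_a + (1.0 - alpha) * img_b

def strategy_targets(class_a, class_b, n_classes):
    """ASSUMED target encodings for one synthesized image under the three strategies."""
    not_sure = n_classes                   # (1) extra "not-sure" class index
    sure_or_not = 1                        # (2) binary target: 1 = not-sure, 0 = sure
    multi_label = np.zeros(n_classes)      # (3) multi-hot over the original classes
    multi_label[[class_a, class_b]] = 1.0
    return not_sure, sure_or_not, multi_label

# Toy confusion matrix (rows = true class, columns = predicted class).
conf = [[50, 2, 8],
        [1, 45, 4],
        [9, 3, 40]]
pairs = top_confused_pairs(conf, k=1)
print(pairs)  # -> [(0, 2)]

a, b = pairs[0]
img_a = np.zeros((4, 4, 3))   # stand-ins for one image from each confused class
img_b = np.ones((4, 4, 3))
not_sure_img = blend(img_a, img_b)
targets = strategy_targets(a, b, n_classes=3)
```

In this toy matrix, classes 0 and 2 are confused 17 times in total (8 + 9), more than any other pair, so they are the pair selected for synthesis. A model pretrained on such images would then be fine-tuned on the original dataset, as the abstract describes.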

Original language: English (US)
Article number: 1856
Journal: ACM Transactions on Multimedia Computing, Communications and Applications
Volume: 21
Issue number: 1
State: Published - Dec 14 2024

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).

Keywords

  • Deep learning
  • learning from confusion
  • model confusion
  • transfer learning
