Many scientific domains gather enough labels to train machine learning algorithms through human-in-the-loop techniques provided by the Zooniverse.org citizen science platform. As the range of projects, task types, and data rates increases, accelerating model training becomes a key concern so that volunteer effort can be focused where it is most needed. Transfer Learning (TL) between Zooniverse projects holds promise as a solution. However, it remains an open challenge to understand how effective TL is when pretraining on large-scale generic image sets versus on images with similar characteristics, possibly from similar tasks. We apply a generative segmentation model to two Zooniverse project-based data sets: (1) identifying fat droplets in liver cells (FatChecker; FC) and (2) identifying kelp beds in satellite images (Floating Forests; FF), the latter through transfer learning from the first project. We compare its performance with that of a TL model based on the COCO image set, and then compare both with their baseline counterparts. We find that both the FC and COCO TL models perform better than the baseline cases when using > 75% of the original training sample size. The COCO-based TL model generally performs better than the FC-based one, likely because of its more generalized features. Our investigations provide insight into the use of TL approaches on multi-domain data hosted across different Zooniverse projects, which could help future projects accelerate task completion.
|Original language|English (US)|
|Journal|CEUR Workshop Proceedings|
|State|Published - 2022|
|Event|2022 International Conference on Information and Knowledge Management Workshops, CIKM-WS 2022 - Atlanta, United States|
|Duration|Oct 17 2022 → Oct 21 2022|
Bibliographical note
Funding Information:
The authors would like to thank the Zooniverse volunteers, without whom this work would not have been possible. RS, KM, LF, YZ, and LT acknowledge partial support from the National Science Foundation under grant numbers IIS 2006894 and OAC 1835530. RS, KM, LF, TP, MS, TC, and JS acknowledge partial support through Minnesota Partnership MNP IF#119.09.
We also thank Lucy Collinson and the Electron Microscopy Science Technology Platform (The Francis Crick Institute, London, UK) for their input into this project. This work was supported in part by the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001999), the UK Medical Research Council (FC001999), and the Wellcome Trust (FC001999). This project has been made possible in part by grant number 2020-225438 from the Chan Zuckerberg Initiative DAF, an advised fund of Silicon Valley Community Foundation (H.S.). This publication uses data generated via the Zooniverse.org platform, development of which is funded by generous support, including a Global Impact Award from Google, and by a grant from the Alfred P. Sloan Foundation.
© 2022 Copyright for this paper by its authors.
- focal Tversky loss
- generative adversarial neural networks
- patch-based discriminator
- transfer learning
- UNET generator
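
For readers unfamiliar with the first keyword, the focal Tversky loss can be illustrated with a short sketch. It assumes one commonly used formulation: a Tversky index TI = TP / (TP + alpha * FN + beta * FP), and a loss of (1 - TI) raised to the power gamma. The function name, the default values of `alpha`, `beta`, and `gamma`, and the choice of exponent convention are all illustrative assumptions; none of them is taken from the paper, and some references use 1/gamma as the exponent instead.

```python
def focal_tversky_loss(y_true, y_pred, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """Focal Tversky loss over flattened masks (illustrative sketch only).

    y_true: iterable of ground-truth labels in {0, 1}
    y_pred: iterable of predicted foreground probabilities in [0, 1]
    alpha, beta, gamma: hypothetical defaults, not values from the paper
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t * p for t, p in pairs)          # soft true positives
    fn = sum(t * (1 - p) for t, p in pairs)    # soft false negatives
    fp = sum((1 - t) * p for t, p in pairs)    # soft false positives

    # Tversky index: alpha > beta penalizes missed foreground more than false alarms.
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)

    # Focal term: the exponent reshapes how strongly poorly segmented cases
    # contribute. Clamp at 0 to guard against tiny negative rounding errors.
    return max(0.0, 1.0 - tversky) ** gamma


# A perfect mask gives zero loss; a fully missed mask gives a loss near 1.
perfect = focal_tversky_loss([1, 0, 1, 0], [1, 0, 1, 0])
missed = focal_tversky_loss([1, 1], [0, 0])
```

With `alpha` larger than `beta`, a missed foreground pixel costs more than a spurious one, which is one reason losses of this family are often chosen for small or sparse structures.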