Self supervised contrastive learning for digital histopathology

Unsupervised learning has been a long-standing goal of machine learning and is especially important for medical image analysis, where the learning can compensate for the scarcity of labeled datasets. A promising subclass of unsupervised learning is self-supervised learning, which aims to learn salie...

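Bibliographic Details
Main authors: Ozan Ciga, Tony Xu, Anne Louise Martel
Format: article
Language: EN
Published: Elsevier 2022
Subjects: Self supervised learning; Digital histopathology; Whole slide images; Unsupervised learning; Cybernetics; Electronic computers. Computer science
Online access: https://doaj.org/article/7734613d34b04ca98d851e4bd8392267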
id oai:doaj.org-article:7734613d34b04ca98d851e4bd8392267
record_format dspace
spelling oai:doaj.org-article:7734613d34b04ca98d851e4bd8392267
2021-11-14T04:36:01Z
2666-8270
10.1016/j.mlwa.2021.100198
https://doaj.org/article/7734613d34b04ca98d851e4bd8392267
2022-03-01T00:00:00Z
http://www.sciencedirect.com/science/article/pii/S2666827021000992
https://doaj.org/toc/2666-8270
Machine Learning with Applications, Vol 7, Iss , Pp 100198- (2022)
institution DOAJ
collection DOAJ
language EN
topic Self supervised learning
Digital histopathology
Whole slide images
Unsupervised learning
Cybernetics
Q300-390
Electronic computers. Computer science
QA75.5-76.95
description Unsupervised learning has been a long-standing goal of machine learning and is especially important for medical image analysis, where the learning can compensate for the scarcity of labeled datasets. A promising subclass of unsupervised learning is self-supervised learning, which aims to learn salient features using the raw input as the learning signal. In this work, we tackle the issue of learning domain-specific features without any supervision to improve multiple task performances that are of interest to the digital histopathology community. We apply a contrastive self-supervised learning method to digital histopathology by collecting and pretraining on 57 histopathology datasets without any labels. We find that combining multiple multi-organ datasets with different types of staining and resolution properties improves the quality of the learned features. Furthermore, we find using more images for pretraining leads to a better performance in multiple downstream tasks, albeit there are diminishing returns as more unlabeled images are incorporated into the pretraining. Linear classifiers trained on top of the learned features show that networks pretrained on digital histopathology datasets perform better than ImageNet pretrained networks, boosting task performances by more than 28% in F1 scores on average. Interestingly, we did not observe a consistent correlation between the pretraining dataset site or the organ versus the downstream task (e.g., pretraining with only breast images does not necessarily lead to a superior downstream task performance for breast-related tasks). These findings may also be useful when applying newer contrastive techniques to histopathology data. Pretrained PyTorch models are made publicly available at https://github.com/ozanciga/self-supervised-histopathology.
format article
author Ozan Ciga
Tony Xu
Anne Louise Martel
title Self supervised contrastive learning for digital histopathology
publisher Elsevier
publishDate 2022
url https://doaj.org/article/7734613d34b04ca98d851e4bd8392267