The PatchCamelyon benchmark dataset (PCAM)
Bas Veeling

pcam (10 files)
camelyonpatch_level_2_split_valid_y.h5.gz 3.04kB
camelyonpatch_level_2_split_valid_x.h5.gz 805.97MB
camelyonpatch_level_2_split_train_y.h5.gz 21.38kB
camelyonpatch_level_2_split_valid_meta.csv 1.85MB
camelyonpatch_level_2_split_train_x.h5.gz 6.42GB
camelyonpatch_level_2_split_train_mask.h5.gz 14.48MB
camelyonpatch_level_2_split_train_meta.csv 15.05MB
camelyonpatch_level_2_split_test_y.h5.gz 3.04kB
camelyonpatch_level_2_split_test_meta.csv 1.61MB
camelyonpatch_level_2_split_test_x.h5.gz 800.88MB
Type: Dataset
Tags:

Bibtex:
@article{,
title= {The PatchCamelyon benchmark dataset (PCAM)},
keywords= {},
author= {Bas Veeling},
abstract= {The PatchCamelyon benchmark is a new and challenging image classification dataset. It consists of 327,680 color images (96 x 96 px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating the presence of metastatic tissue. PCam provides a new benchmark for machine learning models: bigger than CIFAR-10, smaller than ImageNet, and trainable on a single GPU.

## Why PCam
Fundamental machine learning advancements are predominantly evaluated on straightforward natural-image classification datasets such as MNIST, CIFAR, and SVHN. Medical imaging is becoming one of the major applications of ML, and we believe it deserves a spot on the list of go-to ML datasets, both to challenge future work and to steer developments in directions that benefit this domain.

We think PCam can play a role in this. It packs the clinically relevant task of metastasis detection into a straightforward binary image classification task, akin to CIFAR-10 and MNIST. Models can easily be trained on a single GPU in a couple of hours and achieve competitive scores in the Camelyon16 tasks of tumor detection and whole-slide image (WSI) diagnosis. Furthermore, the balance between task difficulty and tractability makes it a prime candidate for fundamental machine learning research on topics such as active learning, model uncertainty, and explainability.

https://github.com/basveeling/pcam/raw/master/pcam.jpg
},
terms= {},
license= {},
superseded= {},
url= {https://github.com/basveeling/pcam}
}
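
The image and label files listed above are distributed as gzip-compressed HDF5 archives (`.h5.gz`), so they need to be decompressed before an HDF5 reader can open them. The following is a minimal sketch of that step using only the Python standard library. The helper name `decompress_gz` and its `dest_dir` parameter are hypothetical and not part of the dataset or its repository. Reading the resulting `.h5` files requires an HDF5 library, which this sketch does not cover.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dest_dir=None):
    """Decompress a .gz archive and return the path of the output file.

    Hypothetical helper for illustration: the output is written next to
    the archive unless dest_dir is given.
    """
    src = Path(src)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")

    out_dir = Path(dest_dir) if dest_dir is not None else src.parent
    out_dir.mkdir(parents=True, exist_ok=True)

    # Path.stem drops only the last suffix: "foo.h5.gz" -> "foo.h5".
    dest = out_dir / src.stem

    # Stream in chunks so a multi-gigabyte archive (the training images
    # are listed at 6.42GB compressed) never has to fit in memory.
    with gzip.open(src, "rb") as f_in, open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest
```

Assuming the torrent was downloaded into a local `pcam` folder, calling `decompress_gz("pcam/camelyonpatch_level_2_split_valid_y.h5.gz")` would write `camelyonpatch_level_2_split_valid_y.h5` into the same folder. Starting with the small label files is a cheap way to confirm the download is intact before unpacking the larger image archives.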
