The PatchCamelyon benchmark dataset (PCAM)
Bas Veeling

pcam (10 files)
camelyonpatch_level_2_split_valid_y.h5.gz 3.04kB
camelyonpatch_level_2_split_valid_x.h5.gz 805.97MB
camelyonpatch_level_2_split_train_y.h5.gz 21.38kB
camelyonpatch_level_2_split_valid_meta.csv 1.85MB
camelyonpatch_level_2_split_train_x.h5.gz 6.42GB
camelyonpatch_level_2_split_train_mask.h5.gz 14.48MB
camelyonpatch_level_2_split_train_meta.csv 15.05MB
camelyonpatch_level_2_split_test_y.h5.gz 3.04kB
camelyonpatch_level_2_split_test_meta.csv 1.61MB
camelyonpatch_level_2_split_test_x.h5.gz 800.88MB
Type: Dataset
Tags:

Bibtex:
@article{,
title= {The PatchCamelyon benchmark dataset (PCAM)},
keywords= {},
author= {Bas Veeling},
abstract= {The PatchCamelyon benchmark is a new and challenging image classification dataset. It consists of 327,680 color images (96 x 96 px) extracted from histopathologic scans of lymph node sections. Each image is annotated with a binary label indicating the presence of metastatic tissue. PCam provides a new benchmark for machine learning models: bigger than CIFAR-10, smaller than ImageNet, trainable on a single GPU.

## Why PCam
Fundamental machine learning advancements are predominantly evaluated on straightforward natural-image classification datasets. Think MNIST, CIFAR, SVHN. Medical imaging is becoming one of the major applications of ML, and we believe it deserves a spot on the list of go-to ML datasets, both to challenge future work and to steer developments in directions that are beneficial for this domain.

We think PCam can play a role in this. It packs the clinically relevant task of metastasis detection into a straightforward binary image classification task, akin to CIFAR-10 and MNIST. Models can easily be trained on a single GPU in a couple of hours, and achieve competitive scores in the Camelyon16 tasks of tumor detection and WSI diagnosis. Furthermore, the balance between task difficulty and tractability makes it a prime candidate for fundamental machine learning research on topics such as active learning, model uncertainty and explainability.

https://github.com/basveeling/pcam/raw/master/pcam.jpg
},
terms= {},
license= {},
superseded= {},
url= {https://github.com/basveeling/pcam}
}
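
The files listed above are gzip-compressed HDF5 archives, so a rough sketch of how one split might be unpacked and read could look like the following. This is not taken from the dataset's own documentation: the helper names (`archive_name`, `gunzip`, `load_split`) are hypothetical, the third-party `h5py` package is assumed to be installed, and the HDF5 dataset keys are assumed to be `x` for images and `y` for labels, mirroring the file names.

```python
import gzip
import shutil
from pathlib import Path

# File-name pattern copied from the file listing on this page.
PATTERN = "camelyonpatch_level_2_split_{split}_{kind}.h5.gz"


def archive_name(split: str, kind: str) -> str:
    """Return the archive name for a split ('train', 'valid', 'test') and kind ('x' or 'y')."""
    if split not in {"train", "valid", "test"}:
        raise ValueError(f"unknown split: {split!r}")
    if kind not in {"x", "y"}:
        raise ValueError(f"unknown kind: {kind!r}")
    return PATTERN.format(split=split, kind=kind)


def gunzip(src: Path) -> Path:
    """Decompress a .gz file next to itself and return the path of the result."""
    dst = src.with_suffix("")  # drops only the trailing ".gz"
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst


def load_split(folder: Path, split: str):
    """Load images and labels of one split as in-memory arrays (requires h5py)."""
    import h5py  # third-party; imported lazily so the helpers above work without it

    arrays = []
    for kind in ("x", "y"):
        h5_path = gunzip(folder / archive_name(split, kind))
        with h5py.File(h5_path, "r") as f:
            # ASSUMPTION: the dataset key inside each file matches the kind ("x" or "y").
            arrays.append(f[kind][:])
    return tuple(arrays)
```

With the torrent downloaded into a local `pcam` folder, `load_split(Path("pcam"), "valid")` would return the validation images and labels if the key assumption holds. For the training split, whose image archive is 6.42GB even when compressed, slicing the HDF5 dataset in batches is likely a better fit than reading it whole with `[:]`.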
