Yelp Restaurant Photo Classification Data
Yelp

Folder: yelp-restaurant-photo-classification-data (5 files)
train_photo_to_biz_ids.csv.tgz 1.17MB
train_photos.tgz 7.03GB
train.csv.tgz 7.29kB
test_photo_to_biz.csv.tgz 5.02MB
test_photos.tgz 7.10GB
Type: Dataset
Tags: yelp
Abstract:

At Yelp, there are lots of photos and lots of users uploading photos. These photos provide rich local business information across categories. Teaching a computer to understand the context of these photos is not an easy task. Yelp engineers work on deep learning image classification projects in-house, and you can read about them here.

In this competition, you are given photos that belong to a business and asked to predict the business attributes. There are 9 different attributes in this problem:

0: good_for_lunch
1: good_for_dinner
2: takes_reservations
3: outdoor_seating
4: restaurant_is_expensive
5: has_alcohol
6: has_table_service
7: ambience_is_classy
8: good_for_kids

These labels are annotated by the Yelp community. Your task is to predict these labels purely from the business photos uploaded by users.
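
Because a business can carry any subset of the 9 attributes, this is a multi-label problem. The following is a minimal sketch of one common way to represent such labels, as a 9-element 0/1 vector. The attribute names are copied from the list above; the helper names `encode_labels` and `decode_labels` and the example business are hypothetical and not part of the dataset.

```python
# Attribute names in index order, copied from the list above.
ATTRIBUTES = [
    "good_for_lunch",
    "good_for_dinner",
    "takes_reservations",
    "outdoor_seating",
    "restaurant_is_expensive",
    "has_alcohol",
    "has_table_service",
    "ambience_is_classy",
    "good_for_kids",
]


def encode_labels(label_ids):
    """Turn an iterable of attribute indices into a 9-element 0/1 vector."""
    vector = [0] * len(ATTRIBUTES)
    for index in label_ids:
        vector[index] = 1
    return vector


def decode_labels(vector):
    """Return the attribute names whose position in the vector is set."""
    return [name for name, flag in zip(ATTRIBUTES, vector) if flag]


# Hypothetical business tagged with attributes 1, 2 and 5.
example = encode_labels([1, 2, 5])
print(example)
print(decode_labels(example))
```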

Since Yelp is a community-driven website, there are duplicated images in the dataset. They are mainly due to:

users accidentally upload the same photo to the same business more than once (e.g., this and this)
chain businesses which upload the same photo to different branches

Yelp is including these as part of the competition, since these are challenges Yelp researchers face every day.
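
One simple way to spot such duplicates is to group files by a hash of their bytes. The sketch below assumes the duplicates are byte-identical copies; re-encoded or resized copies would not match and would need a perceptual comparison instead. The function names and the tiny demo files are hypothetical stand-ins, not part of the dataset.

```python
import hashlib
import os
import tempfile
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def find_duplicate_groups(paths):
    """Group paths whose contents are byte-identical; keep groups of 2+."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[file_digest(path)].append(path)
    return [sorted(group) for group in by_digest.values() if len(group) > 1]


# Demo with made-up files: a.jpg and b.jpg share the same bytes.
with tempfile.TemporaryDirectory() as tmp:
    contents = {"a.jpg": b"photo-1", "b.jpg": b"photo-1", "c.jpg": b"photo-2"}
    paths = []
    for name, data in contents.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(data)
        paths.append(path)
    groups = find_duplicate_groups(paths)
    duplicate_names = [[os.path.basename(p) for p in group] for group in groups]

print(duplicate_names)
```

Whether to drop, merge, or keep duplicates is a modelling choice; the page only states that they are present on purpose.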

File descriptions

train_photos.tgz - photos of the training set
test_photos.tgz - photos of the test set
train_photo_to_biz_ids.csv - maps the photo id to business id
test_photo_to_biz_ids.csv - maps the photo id to business id
train.csv - main training dataset. Includes the business ids and their corresponding labels.
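
The files above can be combined by joining on the business id. The following is a rough sketch using in-memory stand-ins for the two training CSV files. The column names `photo_id`, `business_id` and `labels`, and the space-separated label format, are assumptions not stated on this page; check the actual headers after extracting the archives. All rows shown are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for train_photo_to_biz_ids.csv (assumed column names).
PHOTO_TO_BIZ_CSV = """photo_id,business_id
101,1
102,1
103,2
"""

# Made-up stand-in for train.csv (assumed: space-separated label indices).
TRAIN_CSV = """business_id,labels
1,1 2 5
2,
"""


def load_photos_by_business(fileobj):
    """Map each business id to the list of its photo ids."""
    photos = defaultdict(list)
    for row in csv.DictReader(fileobj):
        photos[row["business_id"]].append(row["photo_id"])
    return dict(photos)


def load_labels(fileobj):
    """Map each business id to a list of integer label indices."""
    labels = {}
    for row in csv.DictReader(fileobj):
        labels[row["business_id"]] = [int(tok) for tok in row["labels"].split()]
    return labels


photos = load_photos_by_business(io.StringIO(PHOTO_TO_BIZ_CSV))
labels = load_labels(io.StringIO(TRAIN_CSV))

# One record per labelled business: its photos plus its label indices.
dataset = {
    biz: {"photos": photos.get(biz, []), "labels": labels[biz]}
    for biz in labels
}
print(dataset)
```

To use real data, replace the `io.StringIO(...)` arguments with files opened via `open(path, newline="")`.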


URL: https://www.kaggle.com/c/yelp-restaurant-photo-classification
License: No license specified, the work may be protected by copyright.

Bibtex:
@article{,
title= {Yelp Restaurant Photo Classification Data},
keywords= {yelp},
journal= {},
author= {Yelp},
year= {},
url= {https://www.kaggle.com/c/yelp-restaurant-photo-classification},
license= {},
abstract= {At Yelp, there are lots of photos and lots of users uploading photos. These photos provide rich local business information across categories. Teaching a computer to understand the context of these photos is not an easy task. Yelp engineers work on deep learning image classification projects in-house, and you can read about them here. 

In this competition, you are given photos that belong to a business and asked to predict the business attributes. There are 9 different attributes in this problem:

	0: good_for_lunch
	1: good_for_dinner
	2: takes_reservations
	3: outdoor_seating
	4: restaurant_is_expensive
	5: has_alcohol
	6: has_table_service
	7: ambience_is_classy
	8: good_for_kids
		
These labels are annotated by the Yelp community. Your task is to predict these labels purely from the business photos uploaded by users. 

Since Yelp is a community driven website, there are duplicated images in the dataset. They are mainly due to:

users accidentally upload the same photo to the same business more than once (e.g., this and this)
chain businesses which upload the same photo to different branches
Yelp is including these as part of the competition, since these are challenges Yelp researchers face every day. 

File descriptions

	train_photos.tgz - photos of the training set
	test_photos.tgz - photos of the test set
	train_photo_to_biz_ids.csv - maps the photo id to business id
	test_photo_to_biz_ids.csv - maps the photo id to business id
	train.csv - main training dataset. Includes the business id's, and their corresponding labels. },
superseded= {},
terms= {}
}
