Common Crawl corpus - training-parallel-commoncrawl.tgz (CS-EN, DE-EN, ES-EN, FR-EN, RU-EN)

Name DL Added Torrents Total Size
32 233.75GB 263 0
