Common Crawl corpus - training-parallel-commoncrawl.tgz (CS-EN, DE-EN, ES-EN, FR-EN, RU-EN)

Name    Torrents    Total Size
Text    32          233.75GB    219    0