Common Crawl corpus - training-parallel-commoncrawl.tgz (CS-EN, DE-EN, ES-EN, FR-EN, RU-EN)

Name    Torrents    Total Size
Text    32          233.75GB      206    0

