Common Crawl corpus - training-parallel-commoncrawl.tgz (CS-EN, DE-EN, ES-EN, FR-EN, RU-EN)

