OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz 480.28MB
Type: Dataset

Metadata:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link posted before 31 Dec 2018 that received at least 3 karma. The list is filtered to remove image sites, sites that are not scraper-friendly, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}

Citation:
eukaryote & jcpeterson. (2019). OpenWebText-urls-26M-filtered.xz [Data set]. Academic Torrents. https://academictorrents.com/details/f5161721b322bca66ed74da32b963c1066e64312
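
As a rough illustration of how a URL list like the one described in the abstract might be read, the sketch below streams an xz-compressed text file and tallies hostnames. It assumes the decompressed stream is plain UTF-8 text with one URL per line, which this page does not confirm. The helper names `iter_urls` and `count_domains` are hypothetical and are not part of the openwebtext repository.

```python
import lzma
from collections import Counter
from urllib.parse import urlsplit


def iter_urls(path):
    """Yield non-empty, stripped lines from an xz-compressed text file.

    Assumption: the archive decompresses to plain text, one URL per line.
    """
    with lzma.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            url = line.strip()
            if url:
                yield url


def count_domains(urls):
    """Count how many URLs point at each hostname, skipping unparseable ones."""
    counts = Counter()
    for url in urls:
        try:
            host = urlsplit(url).hostname
        except ValueError:
            # urlsplit rejects some malformed inputs (e.g. broken IPv6 brackets).
            continue
        if host:
            counts[host] += 1
    return counts
```

If that one-URL-per-line assumption holds, `count_domains(iter_urls("OpenWebText-urls-26M-filtered.xz")).most_common(10)` would list the most frequently linked hosts without loading the whole file into memory.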
