OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz 480.28MB
Type: Dataset

Metadata:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link posted before 31 December 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}

Citation:
eukaryote & jcpeterson. (2019). OpenWebText-urls-26M-filtered.xz [Data set]. Academic Torrents. https://academictorrents.com/details/f5161721b322bca66ed74da32b963c1066e64312
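
The abstract describes the download as an xz-compressed list of URLs. The sketch below shows one way such a list could be streamed without decompressing it to disk first. It assumes the archive decompresses to plain UTF-8 text with one URL per line, which this page does not confirm. The helper names `iter_urls` and `top_domains` are hypothetical and not part of the openwebtext repository.

```python
import lzma
from collections import Counter
from urllib.parse import urlsplit


def iter_urls(path):
    """Yield URLs from an xz-compressed text file.

    Assumption: one URL per line, UTF-8 encoded. Blank lines are skipped.
    """
    with lzma.open(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            url = line.strip()
            if url:
                yield url


def top_domains(path, n=10):
    """Return the n most frequent host names among the listed URLs."""
    counts = Counter(urlsplit(url).netloc.lower() for url in iter_urls(path))
    return counts.most_common(n)
```

Calling `top_domains("OpenWebText-urls-26M-filtered.xz")` would then give a quick view of which sites dominate the list, provided the one-URL-per-line assumption holds for the downloaded file.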
