OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz 480.28MB
Type: Dataset

Bibtex:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link from before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}

