OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz 480.28MB
Type: Dataset
Tags: WebText, Reddit, gpt2

Bibtex:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link from before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, sites that are not scraper-friendly, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}
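
The filtering described in the abstract can be illustrated with a minimal sketch. It assumes each link comes with a karma score, and the blocklist, the extension list, and the `keep_url` helper are hypothetical placeholders; the actual lists and code used to build this dataset are not given here.

```python
from urllib.parse import urlparse

# Hypothetical blocklists for illustration only; the real lists are not stated in this record.
BLOCKED_DOMAINS = {"imgur.com", "i.redd.it"}
MEDIA_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".mp4", ".mp3")
MIN_KARMA = 3  # threshold stated in the abstract


def keep_url(url: str, karma: int) -> bool:
    """Return True if the link passes the karma, domain, and extension checks."""
    if karma < MIN_KARMA:
        return False
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com alike
    if host in BLOCKED_DOMAINS:
        return False
    # Drop direct links to media files, compared case-insensitively.
    return not parsed.path.lower().endswith(MEDIA_EXTENSIONS)


# Made-up sample input: (url, karma) pairs.
links = [
    ("https://example.com/article", 5),
    ("https://imgur.com/abc", 10),
    ("https://example.com/photo.JPG", 7),
    ("https://example.org/post", 2),
]
kept = [url for url, karma in links if keep_url(url, karma)]
print(kept)
```

Only the first sample link survives: the others are dropped for a blocked domain, a media extension, and too little karma, respectively.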

