OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz (480.28 MB)
Type: Dataset
Tags: WebText, Reddit, gpt2

Bibtex:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link from before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}
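
The abstract describes the file as a filtered list of URLs in an xz-compressed archive. As a rough sketch of how such a list might be inspected, the Python snippet below streams the archive and tallies the most common domains. It assumes the decompressed file is plain UTF-8 text with one URL per line, which this listing does not confirm; the helper names `iter_urls` and `top_domains` are hypothetical and not part of the openwebtext repository.

```python
import lzma
from collections import Counter
from urllib.parse import urlparse


def iter_urls(path):
    """Yield non-empty, stripped lines from an xz-compressed text file.

    Assumption: one URL per line, UTF-8 encoded (not confirmed by the listing).
    """
    with lzma.open(path, mode="rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            url = line.strip()
            if url:
                yield url


def top_domains(urls, n=10):
    """Return the n most common host names (lowercased) among the given URLs."""
    counts = Counter(urlparse(u).netloc.lower() for u in urls)
    return counts.most_common(n)
```

Calling `top_domains(iter_urls("OpenWebText-urls-26M-filtered.xz"))` would stream the archive without decompressing it to disk first, which matters for a file of this size.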

