OpenWebText-urls-26M-filtered.xz
eukaryote and jcpeterson

OpenWebText-urls-26M-filtered.xz (480.28 MB)
Type: Dataset
Tags: WebText, Reddit, gpt2

Bibtex:
@article{,
title= {OpenWebText-urls-26M-filtered.xz},
journal= {},
author= {eukaryote and jcpeterson},
year= {},
url= {https://github.com/eukaryote31/openwebtext},
abstract= {Every outbound Reddit link posted before 31 Dec 2018 with at least 3 karma. The list is filtered to remove image sites, non-scraper-friendly sites, and other media files.},
keywords= {WebText, Reddit, gpt2},
terms= {},
license= {},
superseded= {}
}
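
The abstract describes two filtering criteria: a minimum karma of 3, and the removal of image sites and media files. The following is a minimal sketch of what such a filter might look like, not the project's actual code. The function name `keep_link`, the blocklist `BLOCKED_DOMAINS`, and the extension list `MEDIA_EXTENSIONS` are all hypothetical and chosen only for illustration; only the karma threshold comes from the abstract.

```python
from urllib.parse import urlparse

# Hypothetical blocklists for illustration; NOT the project's actual lists.
BLOCKED_DOMAINS = {"imgur.com", "i.redd.it"}
MEDIA_EXTENSIONS = (".jpg", ".png", ".gif", ".mp4", ".pdf")

MIN_KARMA = 3  # threshold stated in the abstract


def keep_link(url: str, karma: int) -> bool:
    """Return True if a link passes the karma and site/media filters."""
    if karma < MIN_KARMA:
        return False

    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com like example.com

    if host in BLOCKED_DOMAINS:
        return False

    # Drop direct links to media files, judged by the path's extension.
    if parsed.path.lower().endswith(MEDIA_EXTENSIONS):
        return False

    return True
```

For example, `keep_link("https://example.com/article", 3)` passes, while the same URL with a karma of 2, or any link whose host is in the blocklist, is rejected.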

