Wikipedia Training Data for Megatron-LM

wikipedia_bin (folder, 2 files)
wiki_text_sentence.bin 6.29GB
wiki_text_sentence.idx 1.55GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See the instructions at https://github.com/Lyken17/ML-Datasets for how to use it.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
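
The two files form a pair: a binary data file and its index, sharing the base name wiki_text_sentence. The sketch below only checks that both files are present in the downloaded folder and returns their shared path prefix. The helper name dataset_prefix is hypothetical, and the idea that the training script is pointed at this prefix rather than at either file is an assumption here; the linked instructions are the authority on actual usage.

```python
from pathlib import Path

# Base name shared by the two files in the wikipedia_bin folder.
BASE_NAME = "wiki_text_sentence"


def dataset_prefix(folder):
    """Return the shared path prefix of the .bin/.idx pair.

    Hypothetical helper: raises FileNotFoundError if either file is missing.
    """
    prefix = Path(folder) / BASE_NAME
    for ext in (".bin", ".idx"):
        path = Path(str(prefix) + ext)
        if not path.is_file():
            raise FileNotFoundError(path)
    # Assumption: downstream tooling takes this prefix without an extension.
    return str(prefix)
```

For example, dataset_prefix("wikipedia_bin") returns the path to wiki_text_sentence inside that folder once both files have finished downloading.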

