Wikipedia Training Data for Megatron-LM

wikipedia_bin/ (2 files)
  wiki_text_sentence.bin  6.29 GB
  wiki_text_sentence.idx  1.55 GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for instructions on how to use it.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}

