Wikipedia Training Data for Megatron-LM

wikipedia_bin (folder, 2 files)
  wiki_text_sentence.bin  6.29GB
  wiki_text_sentence.idx  1.55GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for usage instructions.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
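
The listing above pairs a .bin file with an .idx file of the same base name. Before starting a training run, it can be worth checking that both files downloaded completely. The sketch below is a minimal illustration, not part of Megatron-LM: the helper name find_dataset_prefix is hypothetical, and it assumes the two files sit together in one folder and that training is pointed at their shared path prefix (without the extension). Consult the linked instructions for the actual usage.

```python
from pathlib import Path


def find_dataset_prefix(folder, name="wiki_text_sentence"):
    """Return the shared path prefix if both <name>.bin and <name>.idx exist.

    Hypothetical helper for illustration only; it is not a Megatron-LM API.
    Raises FileNotFoundError naming whichever file is missing.
    """
    prefix = Path(folder) / name
    # Collect the extensions whose files are absent from the folder.
    missing = [ext for ext in (".bin", ".idx")
               if not Path(f"{prefix}{ext}").is_file()]
    if missing:
        names = ", ".join(name + ext for ext in missing)
        raise FileNotFoundError(f"missing {names} in {folder}")
    return str(prefix)
```

Calling find_dataset_prefix("wikipedia_bin") after the download finishes returns the prefix string when both files are present, and raises an error naming the missing file otherwise.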

