Wikipedia Training Data for Megatron-LM

wikipedia_bin/ (2 files)
  wiki_text_sentence.bin 6.29GB
  wiki_text_sentence.idx 1.55GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for instructions on how to use it.

Note: the author does not own the copyright to the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
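
The two files listed above, wiki_text_sentence.bin and wiki_text_sentence.idx, form a pair, and neither is useful without the other. A minimal sketch of a sanity check after download follows. It assumes the files keep their original names inside the download folder and that the training script wants the shared path prefix without an extension; check the linked instructions to confirm. The function name `indexed_dataset_prefix` is hypothetical and is not part of Megatron-LM.

```python
from pathlib import Path


def indexed_dataset_prefix(folder, name="wiki_text_sentence"):
    """Return the shared path prefix of the .bin/.idx pair in `folder`.

    Assumption: both files sit directly in `folder` under their
    original names. Raises FileNotFoundError if either is missing.
    """
    prefix = Path(folder) / name
    for ext in (".bin", ".idx"):
        path = Path(str(prefix) + ext)
        if not path.is_file():
            raise FileNotFoundError(f"missing {path}")
    # The prefix (no extension) identifies the pair as one dataset.
    return str(prefix)
```

Calling `indexed_dataset_prefix("wikipedia_bin")` returns the prefix only when both files are present, so an incomplete download fails early rather than partway into training setup.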

