Wikipedia Training Data for Megatron-LM

Folder: wikipedia_bin (2 files)
File: wiki_text_sentence.bin (6.29 GB)
File: wiki_text_sentence.idx (1.55 GB)
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for instructions on how to use it.

Note: the author does not own the copyright to the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
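
The two files form a pair that shares one base name, wiki_text_sentence. As a minimal sketch, assuming the training code is pointed at that shared path prefix rather than at either file individually (check the instructions linked in the abstract before relying on this), the following checks that both files are present after download. The helper name indexed_dataset_prefix is hypothetical and is not part of Megatron-LM.

```python
from pathlib import Path


def indexed_dataset_prefix(directory, name="wiki_text_sentence"):
    """Return the shared path prefix of the .bin/.idx pair.

    Hypothetical helper, not part of Megatron-LM. It only checks that both
    files exist; it does not validate their contents.
    """
    prefix = Path(directory) / name
    # Collect the extensions whose file is absent from the directory.
    missing = [ext for ext in (".bin", ".idx") if not Path(f"{prefix}{ext}").is_file()]
    if missing:
        names = ", ".join(name + ext for ext in missing)
        raise FileNotFoundError(f"missing {names} in {directory}")
    return str(prefix)
```

For example, indexed_dataset_prefix("wikipedia_bin") returns the path wikipedia_bin/wiki_text_sentence when both files are in place, and raises FileNotFoundError otherwise.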

