Wikipedia Training Data for Megatron-LM

Folder: wikipedia_bin (2 files)
File: wiki_text_sentence.bin (6.29 GB)
File: wiki_text_sentence.idx (1.55 GB)
Type: Dataset
Tags: BERT; NLP;
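
The listing above pairs a `.bin` data file with a `.idx` index file under one shared name. As a minimal sketch (not taken from the dataset's own instructions), the check below confirms that both halves of the pair are present after download. The function name `find_indexed_prefix` and the default folder layout are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Optional


def find_indexed_prefix(folder: str, stem: str = "wiki_text_sentence") -> Optional[str]:
    """Return the shared path prefix if both <stem>.bin and <stem>.idx exist.

    `folder` is assumed to be the downloaded `wikipedia_bin` directory;
    the helper itself is illustrative and not part of Megatron-LM.
    """
    prefix = Path(folder) / stem
    bin_file = Path(str(prefix) + ".bin")  # token data
    idx_file = Path(str(prefix) + ".idx")  # index into the data file
    if bin_file.is_file() and idx_file.is_file():
        return str(prefix)
    return None  # one or both files are missing


if __name__ == "__main__":
    prefix = find_indexed_prefix("wikipedia_bin")
    if prefix is None:
        print("Dataset incomplete: expected both .bin and .idx files.")
    else:
        print(f"Dataset prefix: {prefix}")
```

Megatron-LM data paths are commonly given as this shared prefix without an extension, but the linked instructions remain the authority on the exact invocation.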

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for usage instructions.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}

