Wikipedia Training Data for Megatron-LM

Folder: wikipedia_bin (2 files)
File: wiki_text_sentence.bin (6.29 GB)
File: wiki_text_sentence.idx (1.55 GB)
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for instructions on how to use it.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
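
Before following the usage instructions linked in the abstract, it may help to confirm that both listed files were downloaded together. The Python sketch below is only an illustration: the helper name `find_data_prefix` and the default directory are hypothetical, and the idea that the training scripts take the shared path without the `.bin`/`.idx` extension is an assumption to check against the linked instructions.

```python
from pathlib import Path

# File names as listed on this page.
EXPECTED_FILES = ("wiki_text_sentence.bin", "wiki_text_sentence.idx")


def find_data_prefix(root):
    """Return the shared path prefix of the .bin/.idx pair under `root`.

    Hypothetical helper: raises FileNotFoundError if either file is missing.
    """
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {root}: {', '.join(missing)}")
    # Both files share the stem "wiki_text_sentence"; drop the extension.
    return root / Path(EXPECTED_FILES[0]).stem


if __name__ == "__main__":
    import sys

    # Assumed default: the folder name shown in the file listing.
    root = sys.argv[1] if len(sys.argv) > 1 else "wikipedia_bin"
    try:
        print(find_data_prefix(root))
    except FileNotFoundError as exc:
        sys.exit(str(exc))
```

Run it with the download directory as its only argument; it prints the prefix when both files are present and exits with an error message otherwise.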


