Wikipedia Training Data for Megatron-LM

Folder: wikipedia_bin (2 files)
  wiki_text_sentence.bin  6.29 GB
  wiki_text_sentence.idx  1.55 GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed Wikipedia dataset for training with Megatron-LM (https://github.com/NVIDIA/Megatron-LM). See https://github.com/Lyken17/ML-Datasets for instructions on how to use it.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}
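
The dataset ships as a matched .bin/.idx pair that shares one base name. A minimal sketch for checking that the download is complete and recovering that shared base name is shown below. It assumes that the training scripts expect the path prefix without an extension, which is not stated on this page and should be confirmed against the instructions linked in the abstract. The helper name find_dataset_prefixes is hypothetical, as is the assumption that the files sit in a local folder named wikipedia_bin.

```python
from pathlib import Path


def find_dataset_prefixes(folder):
    """Return sorted path prefixes that have both a .bin and a .idx file.

    Hypothetical helper: it only inspects file names and does not
    read or validate the contents of either file.
    """
    folder = Path(folder)
    prefixes = []
    for bin_path in sorted(folder.glob("*.bin")):
        idx_path = bin_path.with_suffix(".idx")
        # Keep the prefix only when the matching index file is present.
        if idx_path.is_file():
            prefixes.append(str(bin_path.with_suffix("")))
    return prefixes


if __name__ == "__main__":
    # Assumes the torrent was downloaded into ./wikipedia_bin.
    for prefix in find_dataset_prefixes("wikipedia_bin"):
        print(prefix)
```

If both files listed above are present, the script prints a single prefix ending in wiki_text_sentence. A .bin file with no matching .idx file is skipped.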

