Wikipedia Training Data for Megatron-LM

folder wikipedia_bin (2 files)
file wiki_text_sentence.bin 6.29GB
file wiki_text_sentence.idx 1.55GB
Type: Dataset
Tags: BERT; NLP;

Bibtex:
@article{,
title= {Wikipedia Training Data for Megatron-LM},
journal= {},
author= {},
year= {},
url= {},
abstract= {A preprocessed dataset for https://github.com/NVIDIA/Megatron-LM training. Please see instructions in https://github.com/Lyken17/ML-Datasets for how to use it.

Note: the author does not own the copyright to any of the data.},
keywords= {BERT; NLP;},
terms= {},
license= {},
superseded= {}
}

