Abstract: 2,909,551 news articles from the SogouCA and SogouCS news corpora, in 5 categories. For each class, 90,000 training samples and 12,000 testing samples were selected. Note that the Chinese characters have been converted to Pinyin.
URL: https://arxiv.org/abs/1509.01626
License: No license specified, the work may be protected by copyright.
@article{,
title= {Sogou news},
keywords= {fastai},
journal= {},
author= {Xiang Zhang and others},
year= {2015},
url= {https://arxiv.org/abs/1509.01626},
license= {},
abstract= {2,909,551 news articles from the SogouCA and SogouCS news corpora, in 5 categories. For each class, 90,000 training samples and 12,000 testing samples were selected. Note that the Chinese characters have been converted to Pinyin.},
superseded= {},
terms= {}
}