OPUS Russian Open Speech To Text Dataset v1.01
Anna Slizhikova and Alexander Veysov and Dilyara Nurtdinova and Dmitry Voronin

folder ru_open_stt_opus (38 files)
manifests/tts_russian_addresses_rhvoice_4voices.csv 220.26MB
manifests/radio_v4_manifest.csv 515.81MB
manifests/radio_v4_add_manifest.csv 7.03MB
manifests/radio_pspeech_sample_manifest.csv 32.76MB
manifests/radio_2.csv 43.04MB
manifests/public_youtube700_val.csv 679.15kB
manifests/public_youtube700.csv 74.60MB
manifests/public_youtube1120_hq.csv 39.34MB
manifests/public_youtube1120.csv 141.83MB
manifests/public_speech_manifest.csv 132.35MB
manifests/public_series_1.csv 1.92MB
manifests/public_lecture_1.csv 660.11kB
manifests/private_buriy_audiobooks_2.csv 119.40MB
manifests/buriy_audiobooks_2_val.csv 744.95kB
manifests/asr_public_stories_2.csv 7.19MB
manifests/asr_public_stories_1.csv 4.84MB
manifests/asr_public_phone_calls_2.csv 60.34MB
manifests/asr_public_phone_calls_1.csv 26.39MB
manifests/asr_calls_2_val.csv 1.05MB
archives/tts_russian_addresses_rhvoice_4voices.tar.gz 13.86GB
archives/radio_v4_manifest.tar.gz 189.01GB
archives/radio_v4_add_manifest.tar.gz 3.04GB
archives/radio_pspeech_sample_manifest.tar.gz 12.27GB
archives/radio_2.tar.gz 26.45GB
archives/public_youtube700_val.tar.gz 469.33MB
archives/public_youtube700.tar.gz 13.09GB
archives/public_youtube1120_hq.tar.gz 5.31GB
archives/public_youtube1120.tar.gz 20.43GB
archives/public_speech_manifest.tar.gz 50.94GB
archives/public_series_1.tar.gz 319.23MB
archives/public_lecture_1.tar.gz 122.51MB
archives/private_buriy_audiobooks_2.tar.gz 27.74GB
archives/buriy_audiobooks_2_val.tar.gz 496.48MB
archives/asr_public_stories_2.tar.gz 1.50GB
archives/asr_public_stories_1.tar.gz 719.09MB
archives/asr_public_phone_calls_2.tar.gz 10.12GB
archives/asr_public_phone_calls_1.tar.gz 3.41GB
archives/asr_calls_2_val.tar.gz 805.25MB
Type: Dataset

Metadata:
@article{,
title= {OPUS Russian Open Speech To Text Dataset v1.01},
journal= {},
author= {Anna Slizhikova and Alexander Veysov and Dilyara Nurtdinova and Dmitry Voronin},
year= {},
url= {https://github.com/snakers4/open_stt/},
abstract= {v1.0-beta 

Arguably the largest public Russian STT dataset to date:
15 million utterances;
20,000 hours;
2.3 TB (as mono int16 .wav files).

For more information, please visit https://github.com/snakers4/open_stt/},
keywords= {Dataset, russian, asr, stt, TTS},
terms= {https://github.com/snakers4/open_stt/#license},
license= {CC-NC-BY},
superseded= {}
}
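
The headline figures in the abstract (15 million utterances, 20,000 hours, 2.3 TB of mono int16 audio) can be cross-checked with simple arithmetic. The sketch below is only a rough sanity check: it assumes "TB" means 10^12 bytes and ignores .wav header overhead, and the function and variable names are illustrative, not part of the dataset tooling.

```python
# Figures quoted in the abstract.
UTTERANCES = 15_000_000
HOURS = 20_000
TOTAL_BYTES = 2.3e12       # assumption: decimal terabytes (10**12 bytes)

BYTES_PER_SAMPLE = 2       # int16
CHANNELS = 1               # mono


def implied_sample_rate(total_bytes, hours,
                        bytes_per_sample=BYTES_PER_SAMPLE,
                        channels=CHANNELS):
    """Sample rate (Hz) implied by total size and duration of raw PCM audio."""
    seconds = hours * 3600
    return total_bytes / (seconds * bytes_per_sample * channels)


def mean_utterance_seconds(hours, utterances):
    """Average utterance length in seconds."""
    return hours * 3600 / utterances


rate_hz = implied_sample_rate(TOTAL_BYTES, HOURS)
avg_len_s = mean_utterance_seconds(HOURS, UTTERANCES)

print(f"implied sample rate: {rate_hz:.0f} Hz")
print(f"mean utterance length: {avg_len_s:.1f} s")
```

Under these assumptions the stated size and duration work out to roughly 16 kHz audio, with an average utterance of a little under five seconds. The sizes in the file list above are much smaller than 2.3 TB because they are for the compressed OPUS archives, not the uncompressed .wav data.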

Citation:
Slizhikova, A., Veysov, A., Nurtdinova, D., & Voronin, D. (2020). OPUS Russian Open Speech To Text Dataset v1.01 [Data set]. Academic Torrents. https://academictorrents.com/details/95b4cab0f99850e119114c8b6df00193ab5fa34f