This model converts speech from Korean-language 16 kHz telephony audio files into text. It accepts audio in MP4, WMV, WAV, MP3, and other popular formats and outputs text in JSON, XML, TEXT, or SRT format. The output includes punctuation, capitalization, timecodes, word confidence scores, and speaker diarization. The model can be used to transcribe news channels and other media and entertainment content, to create rich metadata from media archives, and to generate captions from audio and video sources.
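As an illustration of how word-level timecodes relate to SRT captions, the following minimal Python sketch groups timed words into numbered SRT cues. The input layout (a list of dicts with `word`, `start`, and `end` keys, times in seconds) is a hypothetical stand-in chosen for this example, not the model's documented JSON schema, and the helper names `srt_timestamp` and `words_to_srt` are likewise assumptions.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=7):
    """Group timed words into SRT cues of at most `max_words` words each.

    `words` is a list of dicts with hypothetical keys "word", "start", "end".
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        text = " ".join(w["word"] for w in chunk)
        # Each cue: sequence number, time range, caption text, blank line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


if __name__ == "__main__":
    sample = [
        {"word": "hello", "start": 0.0, "end": 0.5},
        {"word": "world", "start": 0.5, "end": 1.0},
    ]
    print(words_to_srt(sample, max_words=2))
```

Splitting on a fixed word count keeps the sketch short; a production captioner would more likely break cues at punctuation, speaker changes, or pauses.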
AppTek’s ASR models achieve approximately the same accuracy on real-world data as the top cloud service providers. When we build ASR systems for academic tasks under training and evaluation conditions comparable to those of other ASR teams in the community, we achieve state-of-the-art results on popular US English benchmark tasks such as LibriSpeech (5.5% word error rate on “test-other”) or Switchboard (11.7% on “Hub5 2000 eval”).
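Benchmark figures of this kind are conventionally reported as word error rate (WER): the minimum number of word substitutions, deletions, and insertions needed to turn the system output into the reference transcript, divided by the number of reference words. The sketch below computes it with a standard edit-distance recurrence. It is a generic illustration, not AppTek's evaluation tooling; official benchmark scoring also applies text normalization rules that are omitted here, and the function name `word_error_rate` is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.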
AppTek’s acoustic models are based on bidirectional recurrent neural networks with LSTM units. The models are trained using the RETURNN toolkit, a software package for neural sequence-to-sequence models developed jointly by RWTH Aachen University, Germany, and AppTek. The toolkit is built on the TensorFlow backend and allows flexible and efficient specification, training, and deployment of different neural models.
AppTek trains all ASR models on very large collections of annotated audio data. We compile the training data from a wide variety of sources in order to achieve a high level of generalization.