Automatic Speech Recognition - English 16 kHz - powered by Modzy MLOps platform for Enterprise and Edge AI

Model by AppTek

This model transcribes English speech from 16 kHz audio, including telephony recordings and other files, into text. It accepts WAV, MP3, PCM, and other popular audio formats and outputs JSON, XML, plain text, or SRT. The output includes punctuation, capitalization, timecodes, word confidence scores, and speaker diarization. Typical uses include transcribing telephone calls, meetings, interviews, and other recorded content sampled at 16 kHz.
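
Because the model is described as targeting 16 kHz audio, it may help to confirm a file's sample rate before submitting it. The following is a minimal sketch that uses only Python's standard `wave` module. The function names and the `EXPECTED_RATE_HZ` constant are assumptions made for illustration and are not part of any Modzy or AppTek API. It reads uncompressed WAV headers only, so MP3 or raw PCM input would need a different check.

```python
import wave

# Assumption for illustration: the listing describes a 16 kHz model,
# so 16000 Hz is treated here as the expected sample rate.
EXPECTED_RATE_HZ = 16_000


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def is_16khz_wav(path):
    """Return True if the WAV file at `path` is sampled at 16 kHz."""
    return wav_sample_rate(path) == EXPECTED_RATE_HZ
```

A caller could run `is_16khz_wav("call.wav")` (with a hypothetical file name) and resample or reject the file when it returns `False`.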