Automatic Speech Recognition – Italian 16 kHz

Model by AppTek

This model converts Italian speech from 16 kHz telephony audio files into text. It accepts MP4, WMV, WAV, MP3, and other popular formats, and outputs text in JSON, XML, TEXT, or SRT format. The output includes punctuation, capitalization, timecodes, word confidence scores, and speaker diarization. The model can be used to transcribe news channels and other media and entertainment content, to create rich metadata from media archives, and to generate captions from audio and video sources.
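To illustrate how timecoded, word-level output relates to SRT captions, here is a minimal Python sketch. The JSON schema of this model is not documented here, so the field names (`word`, `start`, `end`, `confidence`), the sample data, and the helper functions `srt_timestamp` and `words_to_srt` are all hypothetical and chosen only for illustration. The sketch does not call the model or any Modzy API.

```python
from typing import Dict, List

# Hypothetical word-level transcript. The field names and values are
# assumptions for illustration, not the model's documented output schema.
SAMPLE_WORDS: List[Dict] = [
    {"word": "Buongiorno", "start": 0.00, "end": 0.62, "confidence": 0.97},
    {"word": "a", "start": 0.62, "end": 0.70, "confidence": 0.91},
    {"word": "tutti", "start": 0.70, "end": 1.15, "confidence": 0.95},
]


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words: List[Dict], max_words: int = 4) -> str:
    """Group consecutive words into numbered SRT cues of at most max_words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w["word"] for w in chunk)
        start = srt_timestamp(chunk[0]["start"])
        end = srt_timestamp(chunk[-1]["end"])
        # Each cue: index line, time range line, text line.
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


if __name__ == "__main__":
    print(words_to_srt(SAMPLE_WORDS, max_words=2))
```

Grouping by a fixed word count is only one possible choice; a real captioning workflow might instead split on punctuation, speaker changes, or a maximum cue duration.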


Many models are available for limited use in the free Modzy Basic account.