Multi-Language OCR

Model by Open Source

This model converts scanned images of text, or text embedded in images, into electronic text. It accepts scanned images in multiple formats, including JPG, PNG, and many others, and the input can include tables. It produces output as PDF, TSV, plain text, and other formats. The model also supports user-supplied patterns and words, and it covers 107 writing systems (scripts) and languages. It does not process color images or recognize handwriting. Typical uses include recovering electronic text from printouts, archival paper documents, and web pages that contain only images of text.
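To make the inputs and outputs described above concrete, here is a minimal Python sketch of how a caller might assemble one OCR request. The description does not name the underlying engine or its interface, so this sketch assumes a Tesseract-style command line (`tesseract image outputbase -l langs [--user-words file] [--user-patterns file] configfile`). The function name `build_ocr_command`, the `OUTPUT_CONFIGS` table, and the file names are hypothetical and exist only for illustration.

```python
from pathlib import Path
from typing import List, Optional, Sequence, Union

PathLike = Union[str, Path]

# Output formats named in the description, mapped to the config names a
# Tesseract-style CLI uses to select them. ASSUMPTION: the model is driven
# through such a CLI; the description itself does not say so.
OUTPUT_CONFIGS = {"pdf": "pdf", "tsv": "tsv", "text": "txt"}


def build_ocr_command(
    image: PathLike,
    output_base: PathLike,
    languages: Sequence[str] = ("eng",),
    output_format: str = "text",
    user_words: Optional[PathLike] = None,
    user_patterns: Optional[PathLike] = None,
) -> List[str]:
    """Return the argument list for one OCR run (hypothetical helper)."""
    if output_format not in OUTPUT_CONFIGS:
        raise ValueError(f"unsupported output format: {output_format!r}")
    if not languages:
        raise ValueError("at least one language or script is required")

    # Several languages are combined with '+', for example "eng+deu".
    cmd = ["tesseract", str(image), str(output_base), "-l", "+".join(languages)]

    # Optional user-supplied word list and pattern file.
    if user_words is not None:
        cmd += ["--user-words", str(user_words)]
    if user_patterns is not None:
        cmd += ["--user-patterns", str(user_patterns)]

    # The trailing config name selects the output format.
    cmd.append(OUTPUT_CONFIGS[output_format])
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works even
    # when no OCR engine is installed.
    print(build_ocr_command("scan.png", "scan_out", ("eng", "deu"), "pdf"))
```

The helper only builds the argument list. Passing the result to `subprocess.run(cmd, check=True)` would execute it, provided an engine with this interface is actually installed.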

