This model translates text from Korean to English. It expects UTF-8 Korean text as input and outputs English text.
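The input requirement above can be checked before text is sent to the model. The following is a minimal sketch, not part of the model itself: the function name `is_utf8_korean` is hypothetical, and treating "Korean text" as "contains at least one character from the Hangul Syllables block" is an assumption made for illustration.

```python
def is_utf8_korean(raw: bytes) -> bool:
    """Return True if `raw` is valid UTF-8 and contains at least one Hangul syllable.

    Hypothetical helper: the model's real input validation is not documented here.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes in another encoding (for example EUC-KR) usually fail here.
        return False
    # Hangul Syllables block: U+AC00 through U+D7A3.
    return any("\uac00" <= ch <= "\ud7a3" for ch in text)
```

A check like this catches the common mistake of sending text in a legacy Korean encoding rather than UTF-8.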
This model is trained on the News-Commentary, QED, Bible, JW300, OpenSubtitles, Tatoeba, KPC, and Tanzil parallel corpora. It was tested on 2,000 sentences from the Korean Parallel Corpus and achieves a BLEU score of 10.22. BLEU is a commonly used metric for translation tasks that measures how similar model-generated text is to a gold-standard reference translation.
10.2% BLEU score: a method for assessing the quality of text that has been machine-translated from one language to another. The closer the machine translation is to expert human translation, the better the score.
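To make the metric concrete, here is a simplified sentence-level BLEU sketch. It assumes whitespace tokenization, a single reference, n-grams up to length 4, and no smoothing; the function names are illustrative. Reported scores such as the one above are normally computed over a whole test set with a standard tool, so this is not the procedure used to evaluate this model.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Simplified BLEU on a 0-100 scale for one candidate/reference pair."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        # Clipped matches: a candidate n-gram is credited at most as many
        # times as it appears in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing: any empty precision gives a score of 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

An exact match of four or more tokens scores 100, and a candidate that shares no words with the reference scores 0; partial matches fall in between.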
This model uses the Transformer architecture introduced by Google, which is currently the basis for many state-of-the-art translation models. The essence of the Transformer is an encoder-decoder architecture with attention. Multiple encoders are stacked on top of each other, each consisting of two parts: a self-attention layer, which lets the model consider the full sentence rather than only the word it is currently processing, and a feed-forward neural network. Word embeddings are fed through these encoder layers, and the result is then passed to the decoder layers.
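The self-attention step described above can be sketched in plain Python. This is an illustrative, single-head version of scaled dot-product attention and not this model's implementation: in a real Transformer the queries, keys, and values are learned linear projections of the embeddings, whereas the sketch takes them as given.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors.

    Returns (outputs, weights), where weights[i][j] is how strongly
    position i attends to position j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Score this position against every position in the sentence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights


# Toy "embeddings" for a three-token sentence, used as Q, K, and V alike.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
```

Each row of `weights` sums to 1, so every output vector is a mixture of all positions in the sentence, which is what lets the encoder take the whole sentence into account.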
In the decoder, the self-attention layer attends only to earlier positions, whereas the encoder attends in both directions. Otherwise the decoder layers work the same way, except that they decode the encoded input into the output language. Every unit is also followed by an add-and-normalize layer. In the standard Transformer, each attention layer uses eight attention heads; their outputs are concatenated, reduced to the correct size, and passed to the feed-forward network. Positional encoding is also used to account for the order of words in the input sequence.
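Two ideas from this paragraph, attention restricted to earlier positions and positional encoding, can be sketched as follows. The sinusoidal formula is the one from the original Transformer paper and is an assumption here, since the text does not say which positional encoding this model uses; the function names are illustrative.

```python
import math


def positional_encoding(num_positions: int, d_model: int):
    """Sinusoidal positional encodings: sine on even dimensions, cosine on odd."""
    table = []
    for pos in range(num_positions):
        row = []
        for i in range(d_model):
            # Each pair of dimensions (2k, 2k+1) shares one wavelength.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def causal_attention_weights(scores):
    """Turn raw scores[i][j] into attention weights where position i
    may only attend to positions j <= i (decoder-style masking)."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]  # later positions are masked out
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


pe = positional_encoding(num_positions=3, d_model=4)
masked = causal_attention_weights([[0.0, 0.0, 0.0]] * 3)
```

With uniform scores, the first position attends only to itself, the second splits its attention evenly across the first two positions, and no position places any weight on a later one.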
This model is trained on the News-Commentary, QED, Bible, JW300, OpenSubtitles, Tatoeba, KPC, and Tanzil parallel corpora for 200,000 steps on 4 GPUs, and was implemented using the open-source OpenNMT framework. The bulk of the training data is religious text, and the corpora can be found at OPUS.
This model was validated on 2,000 parallel sentences and achieves a BLEU score of 10.22%.
The input(s) to this model must adhere to the following specifications:
This model will output the following: