Named entity recognition (NER), also known as entity extraction, detects entities in English text and classifies them into four categories: persons, locations, organizations, and miscellaneous. The input to this model is English text, and the output labels each word in the text as one of these four categories or as ‘O’, indicating that the word is not an entity. This model can add a wealth of semantic knowledge to text content and helps with understanding the subject a given text covers.
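A minimal sketch of the input/output format described above. The sentence and labels are hard-coded for illustration only; the tagger itself is not reproduced here.

```python
# Hypothetical example of the per-word labeling this model produces.
# Labels: PER (person), LOC (location), ORG (organization),
# MISC (miscellaneous), or O (not an entity).
text = "George Washington went to Washington"
tokens = text.split()

# One label per word; these are hand-assigned to show the output shape.
labels = ["PER", "PER", "O", "O", "LOC"]

for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

Note that the same surface form ("Washington") can receive different labels depending on context, which is why the model must capture relationships between words rather than rely on a lookup.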
89.7% F1 Score, 98.2% Precision, and 90.6% Recall
This model was trained on the CoNLL-2003 training dataset, a collection of news wire articles from the Reuters Corpus. It obtains a precision of 98.15%, a recall of 90.61%, and an F1 score of 89.72% on the CoNLL-2003 validation dataset. The model’s strengths include its high precision of 95.19% in detecting person names and its ability to capture grammatical and semantic relationships between words in a piece of text. It is less accurate at detecting miscellaneous entities.
F1 is the harmonic mean of precision and recall, with a best value of 1. It measures the balance between the two metrics.
A higher precision score indicates that the majority of the labels the model predicts are correct.
A higher recall score indicates that the model finds the majority of the true entities it is supposed to find.
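The three metrics above can be computed from per-class true-positive, false-positive, and false-negative counts. The counts in this sketch are illustrative toy numbers, not the model’s actual confusion counts.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and their harmonic mean (F1)."""
    precision = tp / (tp + fp)          # fraction of predicted labels that are correct
    recall = tp / (tp + fn)             # fraction of true entities that are found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy counts for a single entity class (illustrative only).
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=20)
```

Because F1 is a harmonic mean, it always lies between precision and recall and is pulled toward the lower of the two.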
This model utilizes Google’s BERT (Bidirectional Encoder Representations from Transformers), which set new state-of-the-art results on several language-understanding benchmarks upon its release in 2018. It is based on a transformer architecture that is pre-trained bidirectionally and uses an attention mechanism. BERT is one of the pioneering models built on the ‘attention and pre-training is all you need’ philosophy, which extends transfer learning across different textual domains in natural language processing tasks. This model uses the TensorFlow deep learning framework.
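The attention mechanism at the core of the transformer can be sketched in a few lines. This is the standard scaled dot-product attention formula (softmax(QKᵀ/√dₖ)V) in NumPy, for illustration only; it is not the model’s actual implementation.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable row-wise softmax over the attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output row is a weighted average of the value vectors.
    return weights @ V

# Tiny 2x2 example: identity queries, keys, and values.
out = scaled_dot_product_attention(np.eye(2), np.eye(2), np.eye(2))
```

Each token’s output is a context-dependent mixture of all tokens’ value vectors, which is what lets BERT disambiguate words using their surroundings.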
This model was trained on the CoNLL-2003 training dataset. Transfer learning was used to fine-tune the BERT model for the specific task of NER on the training dataset with a batch size of 32 for 4 epochs, using stochastic gradient descent with a learning rate of 0.00005. The model has 24 layers with a hidden size of 1024 and roughly 340 million trainable parameters; training took more than a day on a single 16GB Nvidia Tesla GPU in a DGX-1 system.
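The fine-tuning hyperparameters stated above, collected into a single configuration for reference. Anything beyond the stated values (optimizer schedule, warmup, etc.) is deliberately omitted rather than assumed.

```python
# Fine-tuning configuration as described in this model card.
# Only the values stated above are included.
FINE_TUNE_CONFIG = {
    "base_model": "BERT-large (24 layers, hidden size 1024, ~340M parameters)",
    "task": "token classification (NER, CoNLL-2003)",
    "batch_size": 32,
    "epochs": 4,
    "learning_rate": 5e-5,
    "framework": "TensorFlow",
}
```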
The performance of the model was tested on the CoNLL-2003 validation dataset, which contains 3.6K sentences.
The input(s) to this model must adhere to the following specifications:
This model will output the following: