Mention and explain some of the commonly used evaluation metrics proposed by NER forums.

Asked by NakagawaHarada in Data Science, asked on Nov 4, 2019
Answered by Nakagawa Harada

Many evaluation schemes have been developed to measure and compare the performance of NER approaches. Some of those developed and put forward by NER forums are:

a) CoNLL- Conference on Computational Natural Language Learning

b) ACE- Automatic Content Extraction

c) MUC- Message Understanding Conference

d) SemEval- Semantic Evaluation

Conference on Computational Natural Language Learning (CoNLL)

This evaluation reports all three metrics (precision, recall and F1-score), computed from the following scenarios.

a) The system's entity matches the gold annotation in both entity type and surface string

b) The system hypothesized an entity that is not in the gold annotation

c) The system missed an entity that is in the gold annotation
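The three scenarios above map onto true positives, false positives and false negatives. A minimal sketch of that computation, assuming each entity is represented as a hypothetical (start, end, type) tuple and that only exact matches count as correct:

```python
def conll_scores(gold, predicted):
    """Exact-match precision, recall and F1 over (start, end, type) entities."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)  # scenario a: same span and same type
    # Scenario b (hypothesized entities) lowers precision,
    # scenario c (missed entities) lowers recall.
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative data only: one exact match, one wrong type, one missed entity,
# and one entity the system invented.
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")]
pred = [(0, 2, "PER"), (5, 6, "ORG"), (12, 13, "LOC")]
precision, recall, f1 = conll_scores(gold, pred)
```

Here an entity with the right span but the wrong type counts as both a false positive and a false negative, which is why exact-match scoring is considered strict.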

Message Understanding Conference

The Message Understanding Conference (MUC) introduced an evaluation scheme that compares the response of a system against the gold annotation. Each response is assigned to one of the following categories.

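a) Correct (COR) where the system response and the gold annotation are the same

b) Incorrect (INC) where the system response and the gold annotation do not match

c) Partial (PAR) where the system response and the gold annotation are somewhat similar

d) Missing (MIS) where a gold annotation does not appear in the system response

e) Spurious (SPU) where a system response does not appear in the gold annotation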
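These category counts are usually combined into precision and recall. The sketch below assumes the common MUC-style convention in which a partial match earns half credit. It also needs a count of spurious responses (SPU, system output with no matching gold annotation) to form the precision denominator. The function name and the example counts are illustrative.

```python
def muc_scores(cor, inc, par, mis, spu):
    """Precision and recall from MUC-style category counts."""
    possible = cor + inc + par + mis  # entities in the gold annotation
    actual = cor + inc + par + spu    # entities produced by the system
    credit = cor + 0.5 * par          # assumption: partial matches earn half credit
    precision = credit / actual if actual else 0.0
    recall = credit / possible if possible else 0.0
    return precision, recall


# Made-up counts for illustration
muc_precision, muc_recall = muc_scores(cor=6, inc=1, par=2, mis=1, spu=2)
```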

Semantic Evaluation (SemEval)

SemEval is a series of evaluation workshops on semantic analysis, which explores the meaning of words in a sentence or a document. For NER it introduced four different ways of evaluating a system. They are

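Strict Evaluation- exact match of both surface string boundaries and entity type

Exact Evaluation- exact match of surface string boundaries, regardless of entity type

Partial Evaluation- partial match of surface string boundaries, regardless of entity type

Type Evaluation- some overlap between the system entity and the gold annotation, with the same entity type.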
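To make the difference between the four schemes concrete, here is a rough sketch that scores a single gold/system pair under each of them. It assumes entities are hypothetical (start, end, type) tuples with inclusive token indices, and that the pair has already been aligned. Entities with no counterpart at all would be counted as missing or spurious elsewhere.

```python
def overlaps(a, b):
    """True if two (start, end, ...) spans share at least one token."""
    return a[0] <= b[1] and b[0] <= a[1]


def semeval_outcomes(gold, pred):
    """Outcome of one aligned gold/system pair under the four schemes."""
    same_span = gold[:2] == pred[:2]
    same_type = gold[2] == pred[2]
    overlap = overlaps(gold, pred)
    return {
        "strict": "COR" if same_span and same_type else "INC",
        "exact": "COR" if same_span else "INC",
        "partial": "COR" if same_span else ("PAR" if overlap else "INC"),
        "type": "COR" if overlap and same_type else "INC",
    }


# The system found the right type but cut off the first token of the entity.
outcomes = semeval_outcomes((3, 5, "DRUG"), (4, 5, "DRUG"))
```

In this example the boundary error is penalised under strict and exact evaluation, earns partial credit under partial evaluation, and is fully accepted under type evaluation.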

[Figure: representation of how the SemEval evaluation works]

