Evaluation results

Mean explicit agreement between pairs of annotators was measured in several ways that differ in how they treat missing data. In 29-45% of the annotator-word assignment tasks, the annotator made no assignment.
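
The effect of the missing-data treatment on a mean pairwise agreement score can be illustrated with a minimal sketch. The exact definitions of explicit and implicit agreement are not given here, so the two treatments below (counting an unassigned word as a disagreement versus skipping it) are assumptions chosen for illustration, as are the function names and the toy labels.

```python
from itertools import combinations
from typing import Optional, Sequence

# None marks a word the annotator left unassigned (hypothetical encoding).
Label = Optional[str]


def pairwise_agreement(a: Sequence[Label], b: Sequence[Label], missing: str) -> float:
    """Fraction of words on which two annotators agree.

    missing="disagree": a word left unassigned by either annotator
                        counts as a disagreement.
    missing="skip":     only words assigned by both annotators are counted.
    """
    if missing not in ("disagree", "skip"):
        raise ValueError("missing must be 'disagree' or 'skip'")
    agree = 0
    total = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            if missing == "disagree":
                total += 1  # counted in the denominator, never as agreement
            continue
        total += 1
        if x == y:
            agree += 1
    return agree / total if total else float("nan")


def mean_pairwise_agreement(annotations: Sequence[Sequence[Label]], missing: str) -> float:
    """Average the pairwise agreement over all pairs of annotators."""
    scores = [
        pairwise_agreement(a, b, missing)
        for a, b in combinations(annotations, 2)
    ]
    return sum(scores) / len(scores)


# Toy data: three annotators, four words, some assignments missing.
annotations = [
    ["A", "B", None, "C"],
    ["A", None, "B", "C"],
    ["A", "B", "B", "D"],
]

strict = mean_pairwise_agreement(annotations, "disagree")
lenient = mean_pairwise_agreement(annotations, "skip")
print(f"missing counted as disagreement: {strict:.3f}")
print(f"missing skipped:                 {lenient:.3f}")
```

On this toy data the score is lower when unassigned words count as disagreements than when they are skipped, which shows how a high rate of missing assignments can produce a wide range of agreement values for the same annotations.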

Mean explicit agreements ranged from 0.392 to 0.745.

Implicit agreements ranged from 0.784 to 0.945.