neuralmonkey.evaluators.wer module

class neuralmonkey.evaluators.wer.WEREvaluator(name: str = None) → None

Bases: neuralmonkey.evaluators.evaluator.Evaluator

Compute WER (word error rate, used in speech recognition).
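As a rough illustration of the metric itself: WER is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between hypothesis and reference, divided by the reference length. The sketch below follows that common definition. The names word_edit_distance and wer are hypothetical, and this is not necessarily how WEREvaluator computes its score.

```python
from typing import List


def word_edit_distance(hypothesis: List[str], reference: List[str]) -> int:
    """Word-level Levenshtein distance (hypothetical helper)."""
    # previous[j]: distance between the reference words consumed so far
    # and the first j hypothesis words
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def wer(hypothesis: List[str], reference: List[str]) -> float:
    """Edit distance normalised by reference length.

    Assumption: the reference is non-empty.
    """
    return word_edit_distance(hypothesis, reference) / len(reference)


# One reference word ("down") is missing from the hypothesis: 1 error / 4 words
print(wer("the cat sat".split(), "the cat sat down".split()))  # 0.25
```

Under this definition a perfect hypothesis scores 0.0, and the value can exceed 1.0 when the hypothesis is much longer than the reference.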

static compare_scores(score1: float, score2: float) → int

Compare scores using this evaluator.

The default implementation regards the bigger score as better. Note that WER is an error rate, so a lower WER indicates a better output.

  • score1 – The first score.
  • score2 – The second score.

An int. When score1 is better, returns 1. When score2 is better, returns -1. When the scores are equal, returns 0.

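The return convention described above can be sketched as a standalone function. This is a minimal illustration of the documented default ("bigger is better"), not the library's code. The second function, compare_scores_lower_is_better, is a hypothetical variant showing the ordering one would expect for an error-rate metric.

```python
def compare_scores(score1: float, score2: float) -> int:
    """Return 1 if score1 is better, -1 if score2 is better, 0 if equal.

    Follows the documented default, where the bigger score is better.
    """
    # bool arithmetic: True - False == 1, False - True == -1, equal gives 0
    return (score1 > score2) - (score1 < score2)


def compare_scores_lower_is_better(score1: float, score2: float) -> int:
    """Hypothetical variant for metrics where the smaller score is better."""
    return compare_scores(score2, score1)


print(compare_scores(0.3, 0.1))                  # 1
print(compare_scores_lower_is_better(0.3, 0.1))  # -1
```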
score_batch(hypotheses: List[List[str]], references: List[List[str]]) → float

Score a batch of hyp/ref pairs.

The default implementation of this method calls score_instance for each instance in the batch and returns the average score.

  • hypotheses – List of model predictions.
  • references – List of golden outputs.

A float.

score_instance(hypothesis: List[str], reference: List[str]) → float

Score a single hyp/ref pair.

The default implementation of this method returns 1.0 when the hypothesis and the reference are equal and 0.0 otherwise.

  • hypothesis – The model prediction.
  • reference – The golden output.

A float.
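The default behaviours described for score_instance and score_batch can be sketched as two plain functions. This is an illustration of the documented defaults only, written as free functions rather than methods. The handling of mismatched or empty batches is an assumption, since the text above does not specify it.

```python
from typing import List


def score_instance(hypothesis: List[str], reference: List[str]) -> float:
    """Exact match: 1.0 if the token sequences are identical, else 0.0."""
    return 1.0 if hypothesis == reference else 0.0


def score_batch(hypotheses: List[List[str]],
                references: List[List[str]]) -> float:
    """Average the per-instance scores over the batch."""
    if len(hypotheses) != len(references):
        # Assumption: mismatched batch sizes are treated as an error
        raise ValueError("hypotheses and references must have the same length")
    if not hypotheses:
        # Assumption: an empty batch scores 0.0
        return 0.0
    scores = [score_instance(h, r) for h, r in zip(hypotheses, references)]
    return sum(scores) / len(scores)


hyps = [["a", "b"], ["c"]]
refs = [["a", "b"], ["d"]]
print(score_batch(hyps, refs))  # 0.5 (one exact match out of two)
```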