neuralmonkey.evaluators.evaluator module
-
class neuralmonkey.evaluators.evaluator.Evaluator(name: str = None) → None
Bases: typing.Generic
Base class for evaluators in Neural Monkey.
Each evaluator has a __call__ method which returns a score for a batch of model predictions given the references. This class provides default implementations of the score_batch and score_instance methods.
-
__init__(name: str = None) → None
Initialize self. See help(type(self)) for accurate signature.
-
static compare_scores(score1: float, score2: float) → int
Compare scores using this evaluator.
The default implementation regards the bigger score as better.
Parameters: - score1 – The first score.
- score2 – The second score.
Returns: An int. When score1 is better, returns 1. When score2 is better, returns -1. When the scores are equal, returns 0.
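The three-way comparison described above can be sketched as a standalone function. This is a minimal illustration of the documented contract, not the library's source; `compare_scores_lower_is_better` is a hypothetical variant showing how an evaluator for an error-style metric might flip the ordering.

```python
def compare_scores(score1: float, score2: float) -> int:
    """Return 1 if score1 is better, -1 if score2 is better, 0 if equal."""
    # Default ordering from the documentation: the bigger score is better.
    if score1 > score2:
        return 1
    if score1 < score2:
        return -1
    return 0


def compare_scores_lower_is_better(score1: float, score2: float) -> int:
    """Hypothetical variant for metrics where a smaller value is better."""
    # Swapping the arguments reverses the ordering.
    return compare_scores(score2, score1)
```

A training loop could use the returned sign to decide whether a new validation score beats the best one seen so far, without knowing which direction the metric runs in.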
-
name
-
score_batch(hypotheses: List[EvalType], references: List[EvalType]) → float
Score a batch of hyp/ref pairs.
The default implementation of this method calls score_instance for each instance in the batch and returns the average score.
Parameters: - hypotheses – List of model predictions.
- references – List of golden outputs.
Returns: A float.
-
score_instance(hypothesis: EvalType, reference: EvalType) → float
Score a single hyp/ref pair.
The default implementation of this method returns 1.0 when the hypothesis and the reference are equal and 0.0 otherwise.
Parameters: - hypothesis – The model prediction.
- reference – The golden output.
Returns: A float.
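To make the two defaults concrete, here is a minimal self-contained sketch that mimics the documented behavior of score_instance (exact match) and score_batch (mean of instance scores). `SketchEvaluator` is a hypothetical stand-in and does not import Neural Monkey; the length and emptiness checks are assumptions of this sketch, not documented behavior.

```python
from typing import Generic, List, TypeVar

EvalType = TypeVar("EvalType")


class SketchEvaluator(Generic[EvalType]):
    """Hypothetical stand-in that mimics the documented defaults."""

    def score_instance(self, hypothesis: EvalType,
                       reference: EvalType) -> float:
        # Exact match scores 1.0, anything else scores 0.0.
        return 1.0 if hypothesis == reference else 0.0

    def score_batch(self, hypotheses: List[EvalType],
                    references: List[EvalType]) -> float:
        # Assumption of this sketch: reject mismatched or empty batches.
        if len(hypotheses) != len(references):
            raise ValueError("hypotheses and references differ in length")
        if not hypotheses:
            raise ValueError("cannot score an empty batch")
        # Score every hyp/ref pair, then average over the batch.
        scores = [self.score_instance(hyp, ref)
                  for hyp, ref in zip(hypotheses, references)]
        return sum(scores) / len(scores)


evaluator = SketchEvaluator()
# Three of the four predictions match their references exactly.
batch_score = evaluator.score_batch(["a b", "c d", "e f", "g h"],
                                    ["a b", "c x", "e f", "g h"])
```

With these defaults the batch score is plain accuracy, so a subclass that only overrides score_instance would get batch averaging without further code.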
-
-
class neuralmonkey.evaluators.evaluator.SequenceEvaluator(name: str = None) → None
Bases: neuralmonkey.evaluators.evaluator.Evaluator
Base class for token-level evaluators that work with sequences.
-
score_batch(hypotheses: List[Sequence[EvalType]], references: List[Sequence[EvalType]]) → float
Score a batch of sequences.
The default implementation assumes equal sequence lengths and operates on the token level: token-level scores from the whole batch are averaged, in contrast to averaging within each sequence first.
Parameters: - hypotheses – List of model predictions.
- references – List of golden outputs.
Returns: A float.
-
score_token(hyp_token: EvalType, ref_token: EvalType) → float
Score a single hyp/ref pair of tokens.
The default implementation returns 1.0 if the tokens are equal, 0.0 otherwise.
Parameters: - hyp_token – A prediction token.
- ref_token – A golden token.
Returns: A score for the token hyp/ref pair.
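The difference between averaging over all tokens in the batch and averaging each sequence first is easiest to see on a small example. The sketch below is self-contained and does not use Neural Monkey; the function names are hypothetical, and it assumes that "equal sequence lengths" means each hypothesis is as long as its reference and that the batch is non-empty.

```python
from typing import List, Sequence, TypeVar

EvalType = TypeVar("EvalType")


def score_token(hyp_token: EvalType, ref_token: EvalType) -> float:
    # Documented default: 1.0 for equal tokens, 0.0 otherwise.
    return 1.0 if hyp_token == ref_token else 0.0


def token_level_average(hypotheses: List[Sequence[EvalType]],
                        references: List[Sequence[EvalType]]) -> float:
    # Pool the token scores of the whole batch, then take one mean.
    token_scores = [score_token(h, r)
                    for hyp, ref in zip(hypotheses, references)
                    for h, r in zip(hyp, ref)]
    return sum(token_scores) / len(token_scores)


def sequence_level_average(hypotheses: List[Sequence[EvalType]],
                           references: List[Sequence[EvalType]]) -> float:
    # Average within each sequence first, then average those means.
    per_sequence = [
        sum(score_token(h, r) for h, r in zip(hyp, ref)) / len(hyp)
        for hyp, ref in zip(hypotheses, references)
    ]
    return sum(per_sequence) / len(per_sequence)


hyps = [["a", "b", "c", "d"], ["x"]]
refs = [["a", "b", "c", "z"], ["y"]]

pooled = token_level_average(hyps, refs)          # 3 correct of 5 tokens
per_sequence = sequence_level_average(hyps, refs)  # mean of 0.75 and 0.0
```

Here the pooled token-level score is 0.6 while the per-sequence average is 0.375: pooling weights long sequences more heavily, which is the behavior the score_batch description refers to.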
-
-
neuralmonkey.evaluators.evaluator.check_lengths(scorer)