neuralmonkey.encoders.attentive module

class neuralmonkey.encoders.attentive.AttentiveEncoder(name: str, input_sequence: Union[neuralmonkey.model.stateful.TemporalStateful, neuralmonkey.model.stateful.SpatialStateful], hidden_size: int, num_heads: int, output_size: int = None, state_proj_size: int = None, dropout_keep_prob: float = 1.0, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Bases: neuralmonkey.model.model_part.ModelPart, neuralmonkey.model.stateful.TemporalStatefulWithOutput

An encoder with attention over the input and a fixed-dimension output.

Based on “A Structured Self-attentive Sentence Embedding”, https://arxiv.org/abs/1703.03130.

The encoder combines a sequence of vectors into a fixed-size matrix where each row of the matrix is computed using a different attention head. This matrix is exposed as the temporal_states property (the time dimension corresponds to the different attention heads). The output property provides a flattened and, optionally, projected representation of this matrix.
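The multi-head attention described above can be sketched in plain NumPy. This is only an illustration of the computation from the cited paper, not the Neural Monkey implementation: the function name `structured_self_attention` and the weight names `w_hidden` and `w_heads` are hypothetical, and details such as dropout and the optional state projection are left out.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def structured_self_attention(states, mask, w_hidden, w_heads):
    """Hypothetical helper: one attention distribution per head.

    states:   (batch, time, dim) input vectors
    mask:     (batch, time) ones for valid positions, zeros for padding
    w_hidden: (dim, hidden_size)
    w_heads:  (hidden_size, num_heads)
    """
    hidden = np.tanh(states @ w_hidden)            # (batch, time, hidden_size)
    logits = hidden @ w_heads                      # (batch, time, num_heads)
    # Padded positions get a very low score so they receive zero weight.
    logits = np.where(mask[:, :, None] > 0, logits, -1e9)
    weights = softmax(logits, axis=1)              # normalize over time
    weights = np.transpose(weights, (0, 2, 1))     # (batch, num_heads, time)
    matrix = weights @ states                      # (batch, num_heads, dim)
    return weights, matrix


rng = np.random.default_rng(0)
batch, time, dim, hidden_size, num_heads = 2, 5, 8, 16, 3
states = rng.normal(size=(batch, time, dim))
mask = np.array([[1, 1, 1, 1, 1], [1, 1, 1, 0, 0]], dtype=np.float32)
w_hidden = rng.normal(size=(dim, hidden_size))
w_heads = rng.normal(size=(hidden_size, num_heads))

weights, matrix = structured_self_attention(states, mask, w_hidden, w_heads)
print(weights.shape, matrix.shape)
```

In this sketch, `matrix` plays the role of the fixed-size matrix with one row per attention head, and `weights` holds the per-head attention distributions over the input positions.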

__init__(name: str, input_sequence: Union[neuralmonkey.model.stateful.TemporalStateful, neuralmonkey.model.stateful.SpatialStateful], hidden_size: int, num_heads: int, output_size: int = None, state_proj_size: int = None, dropout_keep_prob: float = 1.0, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Initialize an instance of the encoder.

attention_weights

output

Return the object output.

A 2D Tensor of shape (batch, state_size) which contains the resulting state of the object.
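For this encoder, the class description says the output is a flattened and, optionally, projected version of the per-head matrix. A minimal NumPy sketch of that shape transformation follows; the helper `flatten_heads` and its `projection` argument are hypothetical names used for illustration, and the actual projection in the library may differ (for example, it may include a bias or a nonlinearity).

```python
import numpy as np


def flatten_heads(matrix, projection=None):
    """Hypothetical helper: (batch, num_heads, dim) -> (batch, state_size)."""
    batch = matrix.shape[0]
    # Concatenate the rows of each example's matrix into one vector.
    flat = matrix.reshape(batch, -1)               # (batch, num_heads * dim)
    if projection is not None:
        # Assumed linear projection to the requested output size.
        flat = flat @ projection                   # (batch, output_size)
    return flat


matrix = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
flat = flatten_heads(matrix)
projected = flatten_heads(matrix, projection=np.ones((12, 5), dtype=np.float32))
print(flat.shape, projected.shape)
```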

temporal_mask

Return mask for the temporal_states.

A 2D Tensor of shape (batch, time) of type float32 which masks the temporal states so each sequence can have a different length. It should only contain ones or zeros.
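As an illustration of the mask format described above, the following NumPy sketch builds a (batch, time) float32 mask from a list of sequence lengths. The helper name `lengths_to_mask` is hypothetical and is not part of Neural Monkey.

```python
import numpy as np


def lengths_to_mask(lengths, max_time):
    """Hypothetical helper: ones for valid time steps, zeros for padding."""
    positions = np.arange(max_time)[None, :]       # (1, time)
    lengths = np.asarray(lengths)[:, None]         # (batch, 1)
    return (positions < lengths).astype(np.float32)


mask = lengths_to_mask([3, 5, 1], max_time=5)
print(mask)
```

Each row contains as many leading ones as the corresponding sequence has valid steps, followed by zeros for the padded positions.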

temporal_states

Return object states in time.

A 3D Tensor of shape (batch, time, state_size) which contains the states of the object in time (e.g. hidden states of a recurrent encoder).