neuralmonkey.decoders.decoder module

class neuralmonkey.decoders.decoder.Decoder(encoders: List[neuralmonkey.model.stateful.Stateful], vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, name: str, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = False, label_smoothing: float = None, rnn_size: int = None, output_projection: Union[Tuple[Callable[[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, List[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], int], Callable[[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, List[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor]] = None, encoder_projection: Callable[[tensorflow.python.framework.ops.Tensor, int, List[neuralmonkey.model.stateful.Stateful]], tensorflow.python.framework.ops.Tensor] = None, attentions: List[neuralmonkey.attention.base_attention.BaseAttention] = None, attention_on_input: bool = False, rnn_cell: str = 'GRU', conditional_gru: bool = False, supress_unk: bool = False, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Bases: neuralmonkey.decoders.autoregressive.AutoregressiveDecoder

A class managing parts of the computation graph used during decoding.

__init__(encoders: List[neuralmonkey.model.stateful.Stateful], vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, name: str, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = False, label_smoothing: float = None, rnn_size: int = None, output_projection: Union[Tuple[Callable[[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, List[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], int], Callable[[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, List[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor]] = None, encoder_projection: Callable[[tensorflow.python.framework.ops.Tensor, int, List[neuralmonkey.model.stateful.Stateful]], tensorflow.python.framework.ops.Tensor] = None, attentions: List[neuralmonkey.attention.base_attention.BaseAttention] = None, attention_on_input: bool = False, rnn_cell: str = 'GRU', conditional_gru: bool = False, supress_unk: bool = False, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Create a new instance of the RNN decoder.

Parameters:
  • encoders – Input encoders of the decoder.
  • vocabulary – Target vocabulary.
  • data_id – Target data series.
  • name – Name of the decoder. Should be unique across all Neural Monkey objects.
  • max_output_len – Maximum length of an output sequence.
  • dropout_keep_prob – Probability of keeping a value during dropout.
  • embedding_size – Size of embedding vectors for target words.
  • embeddings_source – Embedded sequence to take embeddings from.
  • tie_embeddings – Use decoder.embedding_matrix also in place of the output decoding matrix.
  • rnn_size – Size of the decoder hidden state. If None, the size is inferred from the encoders.
  • output_projection – How to generate distribution over vocabulary from decoder_outputs.
  • encoder_projection – How to construct initial state from encoders.
  • attentions – The attention mechanisms to use. Optional.
  • rnn_cell – RNN Cell used by the decoder (GRU or LSTM).
  • conditional_gru – Flag whether to use the Conditional GRU architecture.
  • attention_on_input – Flag whether attention from previous decoding step should be combined with the input in the next step.
  • supress_unk – If true, the decoder will not produce symbols for unknown tokens.
  • reuse – Reuse the model variables from the given model part.
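
In a Neural Monkey experiment, the decoder is typically instantiated from an INI configuration file rather than constructed directly in Python. The following is an illustrative sketch only: the section names and the objects referenced in angle brackets (such as <encoder>, <attention>, <target_vocabulary>) are hypothetical and would be defined elsewhere in the configuration.

```ini
[decoder]
class=decoders.decoder.Decoder
name="decoder"
encoders=[<encoder>]
vocabulary=<target_vocabulary>
data_id="target"
max_output_len=20
rnn_size=256
embedding_size=300
attentions=[<attention>]
dropout_keep_prob=0.5
```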
embed_input_symbol(*args) → tensorflow.python.framework.ops.Tensor
encoder_projection
finalize_loop(final_loop_state: neuralmonkey.decoders.autoregressive.LoopState, train_mode: bool) → None

Execute the operations that run after the while loop finishes.

Parameters:
  • final_loop_state – Decoder loop state at the end of the decoding loop.
  • train_mode – Boolean flag, telling whether this is a training run.
get_body(train_mode: bool, sample: bool = False, temperature: float = 1) → Callable

Return the while loop body function.
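
The decoding loop follows a three-part protocol: `get_initial_loop_state` produces the starting state, the function returned by `get_body` is applied repeatedly inside a `tf.while_loop`, and `finalize_loop` runs post-loop bookkeeping. The following is a plain-Python analogue of that control flow, not the decoder's actual TensorFlow implementation; the names mirror the API, but the body is a toy stand-in.

```python
from typing import Callable, List, NamedTuple

class LoopState(NamedTuple):
    step: int
    outputs: List[int]

def get_initial_loop_state() -> LoopState:
    # The real method builds tensors for feedables and histories.
    return LoopState(step=0, outputs=[])

def get_body(max_output_len: int) -> Callable[[LoopState], LoopState]:
    def body(state: LoopState) -> LoopState:
        # A real body embeds the previous symbol, runs the RNN cell,
        # applies attention, and projects to vocabulary logits; here we
        # just emit the step index as a stand-in "symbol".
        return LoopState(step=state.step + 1,
                         outputs=state.outputs + [state.step])
    return body

def run_decoding(max_output_len: int) -> LoopState:
    state = get_initial_loop_state()
    body = get_body(max_output_len)
    while state.step < max_output_len:  # tf.while_loop condition analogue
        state = body(state)
    return state  # finalize_loop would post-process this state
```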

get_initial_loop_state() → neuralmonkey.decoders.autoregressive.LoopState
initial_state

Compute initial decoder state.

The part of the computation graph that computes the initial state of the decoder.

input_plus_attention(*args) → tensorflow.python.framework.ops.Tensor

Merge input and previous attentions.

The input and the previous attention context vectors are merged into a single vector with the size of the embedding.
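
Conceptually, the merge concatenates the embedded input with the context vectors and projects the result back to the embedding size. A minimal numeric sketch under that assumption, using plain lists instead of tensors and an illustrative (untrained) weight matrix:

```python
from typing import List

def linear(vector: List[float], weights: List[List[float]]) -> List[float]:
    # weights: one row per output dimension, each of length len(vector)
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]

def input_plus_attention(embedded_input: List[float],
                         prev_contexts: List[List[float]],
                         weights: List[List[float]]) -> List[float]:
    # Concatenate the input with every attention context ...
    merged = list(embedded_input)
    for context in prev_contexts:
        merged.extend(context)
    # ... and project back down to the embedding size.
    return linear(merged, weights)
```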

output_dimension
output_projection
output_projection_spec
rnn_size
class neuralmonkey.decoders.decoder.RNNFeedables

Bases: neuralmonkey.decoders.decoder.RNNFeedables

The feedables used by an RNN-based decoder.

Shares attributes with the DecoderFeedables class. The special attributes are listed below.

prev_rnn_state

The recurrent state from the previous step. A tensor of shape (batch, rnn_size).

prev_rnn_output

The output of the recurrent network from the previous step. A tensor of shape (batch, output_size).

prev_contexts

A list of context vectors returned from attention mechanisms. Tensors of shape (batch, encoder_state_size) for each attended encoder.

class neuralmonkey.decoders.decoder.RNNHistories

Bases: neuralmonkey.decoders.decoder.RNNHistories

The loop state histories for RNN-based decoders.

Shares attributes with the DecoderHistories class. The special attributes are listed below.

attention_histories

A list of AttentionLoopState objects (or similar) populated by values from the attention mechanisms used in the decoder.
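
The attributes listed for the two classes above can be sketched structurally as plain NamedTuples, with lists of floats standing in for the tensors (the real classes extend DecoderFeedables and DecoderHistories and hold TensorFlow tensors; the shapes noted in comments come from the descriptions above):

```python
from typing import List, NamedTuple

class RNNFeedables(NamedTuple):
    prev_rnn_state: List[float]       # (batch, rnn_size)
    prev_rnn_output: List[float]      # (batch, output_size)
    prev_contexts: List[List[float]]  # one (batch, encoder_state_size) per attention

class RNNHistories(NamedTuple):
    attention_histories: List[object]  # AttentionLoopState-like, one per attention
```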