neuralmonkey.decoders.autoregressive module

Abstract class for autoregressive decoding.

It is used by both the recurrent decoder and the transformer decoder.

The autoregressive decoder uses a tf.while_loop to produce its outputs. Descendants only need to specify the initial state and the body of the while loop.
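For orientation, a minimal hypothetical descendant might look roughly like the following sketch; only the two hooks mentioned above are overridden, and the class name and method bodies are illustrative, not part of the library:

    from typing import Callable

    from neuralmonkey.decoders.autoregressive import (
        AutoregressiveDecoder, LoopState)

    class MyDecoder(AutoregressiveDecoder):
        """Hypothetical descendant of the abstract autoregressive decoder."""

        def get_initial_loop_state(self) -> LoopState:
            # Build the histories, constants and feedables for step 0 here.
            raise NotImplementedError("illustrative sketch only")

        def get_body(self, train_mode: bool, sample: bool = False,
                     temperature: float = 1) -> Callable:
            # Return a function suitable as the body of tf.while_loop,
            # taking and returning a LoopState.
            raise NotImplementedError("illustrative sketch only")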

class neuralmonkey.decoders.autoregressive.AutoregressiveDecoder(name: str, vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = False, label_smoothing: float = None, supress_unk: bool = False, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Bases: neuralmonkey.model.model_part.ModelPart

__init__(name: str, vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = False, label_smoothing: float = None, supress_unk: bool = False, reuse: neuralmonkey.model.model_part.ModelPart = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Initialize parameters common for all autoregressive decoders.

Parameters:
  • name – Name of the decoder. Should be unique across all Neural Monkey objects.
  • vocabulary – Target vocabulary.
  • data_id – Target data series.
  • max_output_len – Maximum length of an output sequence.
  • reuse – Reuse the variables from the model part.
  • dropout_keep_prob – Probability of keeping a value during dropout.
  • embedding_size – Size of embedding vectors for target words.
  • embeddings_source – Embedded sequence to take embeddings from.
  • tie_embeddings – Use decoder.embedding_matrix also in place of the output decoding matrix.
  • label_smoothing – Label smoothing parameter.
  • supress_unk – If true, the decoder will not produce symbols for unknown tokens.
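For illustration, a concrete descendant (the hypothetical MyDecoder sketched above) could be constructed with these arguments; the variable names and values below are assumptions, not defaults:

    decoder = MyDecoder(
        name="decoder",
        vocabulary=target_vocabulary,   # a neuralmonkey.vocabulary.Vocabulary
        data_id="target",
        max_output_len=50,
        dropout_keep_prob=0.8,
        embedding_size=300,
        tie_embeddings=True,            # reuse embedding_matrix as the output projection
        label_smoothing=0.1)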
cost
decoded
decoding_b
decoding_loop(train_mode: bool, sample: bool = False, temperature: float = 1) → Tuple[tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor, tensorflow.python.framework.ops.Tensor]

Run the decoding while loop.

Calls get_initial_loop_state and constructs a tf.while_loop with the continuation criterion returned by loop_continue_criterion and the body function returned by get_body.

After finishing the tf.while_loop, it calls finalize_loop to further postprocess the final decoder loop state (usually by stacking Tensors containing decoding histories).

Parameters:
  • train_mode – Boolean flag, telling whether this is a training run.
  • sample – Boolean flag, telling whether we should sample the output symbols from the output distribution instead of using argmax or gold data.
  • temperature – Float value specifying the softmax temperature.
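The control flow described above can be outlined as follows (an illustrative sketch, not the actual source; decoder stands for an instantiated descendant):

    initial_loop_state = decoder.get_initial_loop_state()

    final_loop_state = tf.while_loop(
        decoder.loop_continue_criterion,       # when to keep decoding
        decoder.get_body(train_mode=False),    # one decoding step
        initial_loop_state)

    # Post-process the accumulated histories, e.g. stack the per-step tensors.
    decoder.finalize_loop(final_loop_state, train_mode=False)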
decoding_w
embedding_matrix

Variables and operations for embedding of input words.

If we are reusing word embeddings, the embedding matrix is taken from the first encoder.

embedding_size
feed_dict(dataset: neuralmonkey.dataset.Dataset, train: bool = False) → Dict[tensorflow.python.framework.ops.Tensor, Any]

Populate the feed dictionary for the decoder object.

Parameters:
  • dataset – The dataset to use for the decoder.
  • train – Boolean flag, telling whether this is a training run.
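A usage sketch, assuming a TF 1.x session and hypothetical encoder and decoder objects; in practice the feed dictionaries of all model parts are merged before the graph is run:

    feed = {}
    for model_part in (encoder, decoder):
        feed.update(model_part.feed_dict(dataset, train=True))

    loss_value = session.run(decoder.train_loss, feed_dict=feed)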
finalize_loop(final_loop_state: neuralmonkey.decoders.autoregressive.LoopState, train_mode: bool) → None

Execute operations that take place after the while loop has finished.

Parameters:
  • final_loop_state – Decoder loop state at the end of the decoding loop.
  • train_mode – Boolean flag, telling whether this is a training run.
get_body(train_mode: bool, sample: bool = False, temperature: float = 1) → Callable

Return the while loop body function.

get_initial_loop_state() → neuralmonkey.decoders.autoregressive.LoopState
get_logits(state: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor

Project the decoder’s output layer to logits over the vocabulary.
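Conceptually this is an affine projection using the decoding_w and decoding_b attributes listed on this page; a rough, hedged equivalent (the real method may additionally apply dropout and the supress_unk masking):

    logits = tf.matmul(state, decoder.decoding_w) + decoder.decoding_b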

go_symbols
input_shapes
input_types
loop_continue_criterion(*args) → tensorflow.python.framework.ops.Tensor

Decide whether to break out of the while loop.

Parameters:
  • loop_state – LoopState instance (see the docs for this module). Represents the current decoder loop state.
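A hedged sketch of a typical criterion for such a loop: continue while the step counter is below max_output_len and at least one sentence in the batch is still unfinished. This is a standalone illustration (max_output_len stands for the decoder's attribute); the exact expression in the source may differ:

    def loop_continue_criterion(*args) -> tf.Tensor:
        loop_state = LoopState(*args)
        before_limit = tf.less(loop_state.feedables.step, max_output_len)
        some_unfinished = tf.logical_not(
            tf.reduce_all(loop_state.feedables.finished))
        return tf.logical_and(before_limit, some_unfinished)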
output_dimension
runtime_logits
runtime_logprobs
runtime_loop_result
runtime_loss
runtime_mask
runtime_output_states
runtime_xents
train_inputs
train_logits
train_logprobs
train_loop_result
train_loss
train_mask
train_output_states
train_xents
class neuralmonkey.decoders.autoregressive.DecoderConstants

Bases: neuralmonkey.decoders.autoregressive.DecoderConstants

The constants used by an autoregressive decoder.

train_inputs

During training, this is populated by the target token ids.

class neuralmonkey.decoders.autoregressive.DecoderFeedables

Bases: neuralmonkey.decoders.autoregressive.DecoderFeedables

The input of a single step of an autoregressive decoder.

step

A scalar int tensor, stores the number of the current time step.

finished

A boolean tensor of shape (batch), which says whether the decoding of a sentence in the batch is finished or not. (E.g. whether the end token has already been generated.)

input_symbol

A batch-sized int tensor with the inputs to the decoder. During inference, this contains the previously generated tokens. During training, this contains the reference tokens.

prev_logits

A tensor of shape (batch, vocabulary). Contains the logits from the previous decoding step.
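As an illustration, the feedables for step 0 could be assembled like this (batch_size, go_symbol_index and vocabulary_size are assumed placeholders; the real decoder builds these in get_initial_loop_state):

    initial_feedables = DecoderFeedables(
        step=tf.constant(0),
        finished=tf.zeros([batch_size], dtype=tf.bool),
        input_symbol=tf.fill([batch_size], go_symbol_index),
        prev_logits=tf.zeros([batch_size, vocabulary_size]))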

class neuralmonkey.decoders.autoregressive.DecoderHistories

Bases: neuralmonkey.decoders.autoregressive.DecoderHistories

The values collected during the run of an autoregressive decoder.

logits

A tensor of shape (time, batch, vocabulary) which contains the unnormalized output scores of words in a vocabulary.

decoder_outputs

A tensor of shape (time, batch, state_size). The states of the decoder before the final output (logit) projection.

outputs

An int tensor of shape (time, batch). Stores the generated symbols. (During inference, the argmax of the logits; during training, the target token.)

mask

A float tensor of zeros and ones of shape (time, batch). Keeps track of valid positions in the decoded data.

class neuralmonkey.decoders.autoregressive.LoopState

Bases: neuralmonkey.decoders.autoregressive.LoopState

The loop state object.

The LoopState is a structure that works with the tf.while_loop function. The decoder loop state stores all the information that is not invariant during the decoder run.

histories

A set of tensors that grow in time as the decoder proceeds.

constants

A set of independent tensors that do not change during the entire decoder run.

feedables

A set of tensors used as the input of a single decoder step.
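To make the three-way split concrete, here is a small self-contained toy example in the same spirit (TF 1.x graph style, as used by Neural Monkey); the ToyLoopState class and all values are hypothetical, not part of the library:

    from typing import NamedTuple

    import tensorflow as tf

    class ToyLoopState(NamedTuple):
        step: tf.Tensor      # feedable: input of the next step
        outputs: tf.Tensor   # history: grows with every iteration
        limit: tf.Tensor     # constant: never changes during the run

    def cond(*args) -> tf.Tensor:
        state = ToyLoopState(*args)
        return tf.less(state.step, state.limit)

    def body(*args) -> ToyLoopState:
        state = ToyLoopState(*args)
        new_outputs = tf.concat(
            [state.outputs, tf.expand_dims(state.step, 0)], axis=0)
        return ToyLoopState(state.step + 1, new_outputs, state.limit)

    initial = ToyLoopState(
        step=tf.constant(0),
        outputs=tf.zeros([0], dtype=tf.int32),
        limit=tf.constant(5))

    final = tf.while_loop(
        cond, body, initial,
        shape_invariants=ToyLoopState(
            tf.TensorShape([]),
            tf.TensorShape([None]),  # the history axis grows over time
            tf.TensorShape([])))

    with tf.Session() as sess:
        print(sess.run(final.outputs))  # [0 1 2 3 4]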