neuralmonkey.decoders.transformer module¶
Implementation of the decoder of the Transformer model.
Described in Vaswani et al. (2017), arxiv.org/abs/1706.03762
class neuralmonkey.decoders.transformer.TransformerDecoder(name: str, encoder: Union[neuralmonkey.model.stateful.TemporalStateful, neuralmonkey.model.stateful.SpatialStateful], vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, ff_hidden_size: int, n_heads_self: int, n_heads_enc: int, depth: int, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = True, label_smoothing: float = None, attention_dropout_keep_prob: float = 1.0, use_att_transform_bias: bool = False, supress_unk: bool = False, save_checkpoint: str = None, load_checkpoint: str = None) → None

Bases: neuralmonkey.decoders.autoregressive.AutoregressiveDecoder
__init__(name: str, encoder: Union[neuralmonkey.model.stateful.TemporalStateful, neuralmonkey.model.stateful.SpatialStateful], vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, ff_hidden_size: int, n_heads_self: int, n_heads_enc: int, depth: int, max_output_len: int, dropout_keep_prob: float = 1.0, embedding_size: int = None, embeddings_source: neuralmonkey.model.sequence.EmbeddedSequence = None, tie_embeddings: bool = True, label_smoothing: float = None, attention_dropout_keep_prob: float = 1.0, use_att_transform_bias: bool = False, supress_unk: bool = False, save_checkpoint: str = None, load_checkpoint: str = None) → None

Create a decoder of the Transformer model.
Described in Vaswani et al. (2017), arxiv.org/abs/1706.03762
Parameters:
- encoder – Input encoder of the decoder.
- vocabulary – Target vocabulary.
- data_id – Target data series.
- name – Name of the decoder. Should be unique across all Neural Monkey objects.
- max_output_len – Maximum length of an output sequence.
- dropout_keep_prob – Probability of keeping a value during dropout.
- embedding_size – Size of embedding vectors for target words.
- embeddings_source – Embedded sequence to take embeddings from.
- tie_embeddings – Use decoder.embedding_matrix also in place of the output decoding matrix.
Keyword Arguments:
- ff_hidden_size – Size of the feedforward sublayers.
- n_heads_self – Number of the self-attention heads.
- n_heads_enc – Number of the attention heads over the encoder.
- depth – Number of sublayers.
- label_smoothing – A label smoothing parameter for cross entropy loss computation.
- attention_dropout_keep_prob – Probability of keeping a value during dropout on the attention output.
- supress_unk – If True, the decoder will not produce symbols for unknown tokens.
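To make the label_smoothing parameter concrete: with smoothing value ε, the one-hot target distribution is mixed with the uniform distribution over the vocabulary before the cross-entropy loss is computed, so the true class keeps most of the probability mass. A minimal pure-Python sketch of that mixing (illustrative only, not Neural Monkey code; the function name is hypothetical):

```python
def smoothed_targets(label, vocab_size, smoothing):
    """Mix a one-hot target with a uniform distribution over the vocabulary."""
    uniform = smoothing / vocab_size
    return [uniform + (1.0 - smoothing) * (1.0 if i == label else 0.0)
            for i in range(vocab_size)]

# true class 2 in a 4-word vocabulary, smoothing 0.1
row = smoothed_targets(2, 4, 0.1)
# the row still sums to 1; the true class gets 0.9 + 0.1/4 = 0.925
```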
embed_inputs(inputs: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor
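In the Transformer of Vaswani et al. (2017, section 3.4), input symbols are looked up in an embedding matrix and the resulting vectors are scaled by the square root of the model dimension before positional information is added. Whether embed_inputs performs the scaling here is an implementation detail of this class, so treat the following pure-Python sketch (hypothetical function, toy data) as a description of the paper's scheme rather than of this method:

```python
import math

def embed_inputs_sketch(ids, embedding_matrix, d_model):
    """Look up each symbol id and scale by sqrt(d_model),
    as described in Vaswani et al. (2017), section 3.4."""
    scale = math.sqrt(d_model)
    return [[x * scale for x in embedding_matrix[i]] for i in ids]

emb = [[0.0, 1.0], [0.5, 0.5]]   # toy 2-word vocabulary, d_model = 2
vecs = embed_inputs_sketch([1, 0], emb, d_model=2)
```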
embedded_train_inputs
encoder_attention_sublayer(queries: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor
Create the encoder-decoder attention sublayer.
feedforward_sublayer(layer_input: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor
Create the feed-forward network sublayer.
get_body(train_mode: bool, sample: bool = False, temperature: float = 1.0) → Callable
Return the while loop body function.
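The returned callable is the body of the tf.while_loop that drives autoregressive decoding: each iteration feeds the symbols decoded so far back into the decoder and appends the next one, stopping at the end symbol or at max_output_len. The TensorFlow loop itself is graph code, but its control flow can be sketched in plain Python (step_fn and greedy argmax selection are illustrative assumptions, not this method's actual signature):

```python
def greedy_decode(step_fn, start_symbol, end_symbol, max_output_len):
    """step_fn maps the sequence decoded so far to scores over the next symbol."""
    decoded = [start_symbol]
    for _ in range(max_output_len):
        logits = step_fn(decoded)
        # greedy choice; sampling with a temperature would replace this argmax
        next_symbol = max(range(len(logits)), key=logits.__getitem__)
        decoded.append(next_symbol)
        if next_symbol == end_symbol:
            break
    return decoded[1:]   # drop the start symbol

# toy step function: predicts symbol 3 twice, then the end symbol 0
out = greedy_decode(
    lambda seq: [1.0, 0.0, 0.0, 2.0] if len(seq) < 3 else [9.0, 0.0, 0.0, 0.0],
    start_symbol=1, end_symbol=0, max_output_len=10)
# → [3, 3, 0]
```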
get_initial_loop_state() → neuralmonkey.decoders.autoregressive.LoopState
layer(level: int, inputs: tensorflow.python.framework.ops.Tensor, mask: tensorflow.python.framework.ops.Tensor) → neuralmonkey.encoders.transformer.TransformerLayer
output_dimension
self_attention_sublayer(prev_layer: neuralmonkey.encoders.transformer.TransformerLayer) → tensorflow.python.framework.ops.Tensor
Create the decoder self-attention sublayer with output mask.
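The "output mask" is the causal mask of the Transformer decoder: during training, position i may attend only to positions j ≤ i, so the model cannot look at symbols it has not yet produced. A single-head pure-Python sketch of masked scaled dot-product self-attention (queries, keys and values are all the input, with no learned projections or multiple heads, so this is a simplification of the real sublayer):

```python
import math

def causal_self_attention(x):
    """Scaled dot-product self-attention over rows of x with a causal mask."""
    n, d = len(x), len(x[0])
    weights, values = [], []
    for i in range(n):
        # position i attends only to positions j <= i (the output mask)
        scores = [sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        w = [e / z for e in exps] + [0.0] * (n - i - 1)
        weights.append(w)
        values.append([sum(w[j] * x[j][k] for j in range(n)) for k in range(d)])
    return weights, values

weights, values = causal_self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# the first position can only attend to itself
```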
train_logits
class neuralmonkey.decoders.transformer.TransformerHistories

Bases: neuralmonkey.decoders.transformer.TransformerHistories
The loop state histories for the transformer decoder.
Shares attributes with the DecoderHistories class. The special attributes are listed below.
decoded_symbols
A tensor which stores the decoded symbols.
input_mask
A float tensor with zeros and ones which marks the valid positions on the input.
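Such a mask is typically derived from the lengths of the sequences in a padded batch: every position before a sequence's length is marked 1.0, every padding position 0.0. A minimal pure-Python illustration (hypothetical helper, not part of this class):

```python
def input_mask(lengths, max_len):
    """Float mask over a padded batch: 1.0 at valid positions, 0.0 at padding."""
    return [[1.0 if t < n else 0.0 for t in range(max_len)] for n in lengths]

# batch of two sequences of lengths 3 and 1, padded to length 4
mask = input_mask([3, 1], max_len=4)
# → [[1.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 0.0]]
```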