neuralmonkey.model.sequence module¶
Module which implements the Sequence class and a few of its subclasses.
-
class
neuralmonkey.model.sequence.
EmbeddedFactorSequence
(name: str, vocabularies: List[neuralmonkey.vocabulary.Vocabulary], data_ids: List[str], embedding_sizes: List[int], max_length: int = None, add_start_symbol: bool = False, add_end_symbol: bool = False, scale_embeddings_by_depth: bool = False, embeddings_source: Union[neuralmonkey.model.sequence.EmbeddedFactorSequence, NoneType] = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Bases:
neuralmonkey.model.sequence.Sequence
A sequence that stores one or more embedded inputs (factors).
-
__init__
(name: str, vocabularies: List[neuralmonkey.vocabulary.Vocabulary], data_ids: List[str], embedding_sizes: List[int], max_length: int = None, add_start_symbol: bool = False, add_end_symbol: bool = False, scale_embeddings_by_depth: bool = False, embeddings_source: Union[neuralmonkey.model.sequence.EmbeddedFactorSequence, NoneType] = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Construct a new instance of EmbeddedFactorSequence.
Takes three lists (vocabularies, data series IDs, and embedding sizes) and constructs a Sequence object. The supplied lists must be equal in length, and their indices must correspond to each other.
Parameters: - name – The name for the ModelPart object
- vocabularies – A list of Vocabulary objects used for each factor
- data_ids – A list of strings identifying the data series used for each factor
- embedding_sizes – A list of integers specifying the size of the embedding vector for each factor
- max_length – The maximum length of the sequences
- add_start_symbol – Includes <s> in the sequence
- add_end_symbol – Includes </s> in the sequence
- scale_embeddings_by_depth – Set to True for T2T import compatibility
- embeddings_source – An EmbeddedFactorSequence from which the embeddings will be reused.
- save_checkpoint – The save_checkpoint parameter for ModelPart
- load_checkpoint – The load_checkpoint parameter for ModelPart
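The requirement that the three lists are equal in length and aligned by index can be illustrated with a small check. This is only a sketch: `check_factor_lists` is a hypothetical helper written for this example and is not part of Neural Monkey, and the positive-size check is an added assumption.

```python
from typing import Any, List


def check_factor_lists(vocabularies: List[Any],
                       data_ids: List[str],
                       embedding_sizes: List[int]) -> int:
    """Return the number of factors, or raise if the lists disagree.

    Hypothetical helper: index i of every list is assumed to describe
    the same factor (its vocabulary, its data series, its embedding size).
    """
    lengths = {len(vocabularies), len(data_ids), len(embedding_sizes)}
    if len(lengths) != 1:
        raise ValueError(
            "vocabularies, data_ids and embedding_sizes must have "
            "the same length")
    # Assumption of this sketch: an embedding size must be positive.
    if any(size <= 0 for size in embedding_sizes):
        raise ValueError("embedding sizes must be positive")
    return len(data_ids)


# Two factors: word forms and part-of-speech tags (placeholder names).
num_factors = check_factor_lists(["word_vocab", "tag_vocab"],
                                 ["source", "source_tags"],
                                 [300, 20])
```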
-
embedding_matrices
¶ Return a list of embedding matrices for each factor.
-
feed_dict
(dataset: neuralmonkey.dataset.dataset.Dataset, train: bool = False) → Dict[tensorflow.python.framework.ops.Tensor, Any]¶ Feed the placeholders with the data.
Parameters: - dataset – The dataset.
- train – A flag whether the train mode is enabled.
Returns: The constructed feed dictionary that contains the factor data and the mask.
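To give an idea of the kind of factor data such a feed dictionary carries, the following sketch pads a batch of token sequences to a common width. It is not the library's implementation: `pad_batch`, the `<pad>` token, and the order in which truncation and the optional `<s>` and `</s>` symbols are applied are all assumptions made for this example.

```python
from typing import List, Optional

PAD, START, END = "<pad>", "<s>", "</s>"  # PAD is an assumed padding token


def pad_batch(sentences: List[List[str]],
              max_length: Optional[int] = None,
              add_start_symbol: bool = False,
              add_end_symbol: bool = False) -> List[List[str]]:
    """Pad token sequences to the width of the longest one (sketch)."""
    rows = []
    for tokens in sentences:
        row = list(tokens)
        if max_length is not None:
            row = row[:max_length]  # assumption: truncate before adding symbols
        if add_start_symbol:
            row = [START] + row
        if add_end_symbol:
            row = row + [END]
        rows.append(row)
    width = max(len(row) for row in rows)
    # Shorter rows are filled with the padding token on the right.
    return [row + [PAD] * (width - len(row)) for row in rows]


batch = pad_batch([["a", "b"], ["c"]], add_end_symbol=True)
```

Each factor would get one such padded batch, with tokens mapped to vocabulary indices before being fed.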
-
tb_embedding_visualization
(logdir: str, prj: tensorflow.contrib.tensorboard.plugins.projector)¶ Link embeddings with vocabulary wordlist.
Used for TensorBoard visualization.
Parameters: - logdir – Directory where the model is stored.
- prj – TensorBoard projector for storing linking info.
-
temporal_mask
¶ Return mask for the temporal_states.
A 2D Tensor of shape (batch, time) of type float32 which masks the temporal states so each sequence can have a different length. It should only contain ones or zeros.
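As an illustration of the mask layout, the sketch below builds a (batch, time) mask from per-sequence lengths using plain Python lists instead of tensors. The helper `build_mask` is hypothetical and only mirrors the shape and the ones-and-zeros content described above.

```python
from typing import List, Optional


def build_mask(lengths: List[int],
               max_time: Optional[int] = None) -> List[List[float]]:
    """Return a (batch, time) mask: 1.0 for real steps, 0.0 for padding."""
    if max_time is None:
        max_time = max(lengths)
    return [[1.0 if step < length else 0.0 for step in range(max_time)]
            for length in lengths]


# A batch of two sequences with lengths 3 and 1.
mask = build_mask([3, 1])
```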
-
temporal_states
¶ Return the embedded factors.
A 3D Tensor of shape (batch, time, dimension), where dimension is the sum of the embedding sizes supplied to the constructor.
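The statement that dimension is the sum of the embedding sizes corresponds to joining the factor embeddings along the last axis. A minimal sketch with nested lists standing in for tensors follows; `concat_factors` is a hypothetical helper, and concatenation is assumed here from the shape description rather than taken from the implementation.

```python
from typing import List

Tensor3D = List[List[List[float]]]  # indexed as [batch][time][dimension]


def concat_factors(factors: List[Tensor3D]) -> Tensor3D:
    """Join per-factor embeddings along the last axis (sketch)."""
    first = factors[0]
    result = []
    for b in range(len(first)):
        steps = []
        for t in range(len(first[b])):
            vector = []
            for factor in factors:
                vector.extend(factor[b][t])
            steps.append(vector)
        result.append(steps)
    return result


# One sentence, two time steps; embedding sizes 3 and 1.
words = [[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]]
tags = [[[0.1], [0.2]]]
states = concat_factors([words, tags])
```

Here every vector in `states` has 3 + 1 = 4 entries.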
-
-
class
neuralmonkey.model.sequence.
EmbeddedSequence
(name: str, vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, embedding_size: int, max_length: int = None, add_start_symbol: bool = False, add_end_symbol: bool = False, scale_embeddings_by_depth: bool = False, embeddings_source: Union[neuralmonkey.model.sequence.EmbeddedSequence, NoneType] = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Bases:
neuralmonkey.model.sequence.EmbeddedFactorSequence
A sequence of embedded inputs (for a single factor).
-
__init__
(name: str, vocabulary: neuralmonkey.vocabulary.Vocabulary, data_id: str, embedding_size: int, max_length: int = None, add_start_symbol: bool = False, add_end_symbol: bool = False, scale_embeddings_by_depth: bool = False, embeddings_source: Union[neuralmonkey.model.sequence.EmbeddedSequence, NoneType] = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Construct a new instance of EmbeddedSequence.
Parameters: - name – The name for the ModelPart object
- vocabulary – A Vocabulary object used for the sequence data
- data_id – A string that identifies the data series used for the sequence data
- embedding_size – An integer that specifies the size of the embedding vector for the sequence data
- max_length – The maximum length of the sequences
- add_start_symbol – Includes <s> in the sequence
- add_end_symbol – Includes </s> in the sequence
- scale_embeddings_by_depth – Set to True for T2T import compatibility
- embeddings_source – EmbeddedSequence from which the embeddings will be reused.
- save_checkpoint – The save_checkpoint parameter for ModelPart
- load_checkpoint – The load_checkpoint parameter for ModelPart
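The scale_embeddings_by_depth flag refers to Tensor2Tensor (T2T) compatibility. Transformer-style models commonly multiply embeddings by the square root of the embedding depth; assuming that is what the flag enables, the effect on a single embedding vector can be sketched as follows. The function `scale_by_depth` is hypothetical and not part of the library.

```python
import math
from typing import List


def scale_by_depth(vector: List[float]) -> List[float]:
    """Multiply an embedding vector by sqrt(depth) (assumed T2T scaling)."""
    depth = len(vector)  # the embedding size
    factor = math.sqrt(depth)
    return [factor * value for value in vector]


# With embedding size 4, every component is multiplied by 2.
scaled = scale_by_depth([1.0, 2.0, 3.0, 4.0])
```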
-
data_id
¶ Return the input data series identifier.
-
embedding_matrix
¶ Return the embedding matrix for the sequence.
-
inputs
¶ Return a 2D placeholder for the sequence inputs.
-
vocabulary
¶ Return the input vocabulary.
-
-
class
neuralmonkey.model.sequence.
Sequence
(name: str, max_length: int = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Bases:
neuralmonkey.model.model_part.ModelPart, neuralmonkey.model.stateful.TemporalStateful
Base class for a data sequence.
This abstract class represents a batch of sequences of Tensors of possibly different lengths.
Sequence is essentially a temporal stateful object whose states and mask are either fed directly or computed from fed values. It is also a ModelPart and can therefore store variables such as embedding matrices.
-
__init__
(name: str, max_length: int = None, save_checkpoint: str = None, load_checkpoint: str = None, initializers: List[Tuple[str, Callable]] = None) → None¶ Construct a new Sequence object.
Parameters: - name – The name for the ModelPart object
- max_length – Maximum length of sequences in the object (not checked)
- save_checkpoint – The save_checkpoint parameter for ModelPart
- load_checkpoint – The load_checkpoint parameter for ModelPart
-