neuralmonkey.nn package


neuralmonkey.nn.highway module

Module implementing highway networks.

neuralmonkey.nn.highway.highway(inputs, activation=<function relu>, scope='HighwayNetwork')

Create a single highway layer.

y = H(x, Wh) * T(x, Wt) + x * C(x, Wc)

where the carry gate is C(x, Wc) = 1 - T(x, Wt).

  • inputs – A tensor or a list of tensors. Each must be a 2D tensor, and all must have the same length in the first dimension (batch size).
  • activation – Activation function of the linear part of the formula H(x, Wh).
  • scope – The name of the scope used for the variables.

Returns a tensor of shape tf.shape(inputs).
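The formula above can be illustrated with a minimal plain-Python sketch for a single input vector. It is not the TensorFlow implementation: the helper names (affine, highway_layer), the bias terms, and the use of a sigmoid for the transform gate T are assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    return max(0.0, v)


def affine(x, weights, bias):
    """Compute W x + b for one input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t, activation=relu):
    """y = H(x) * T(x) + x * (1 - T(x)), applied element-wise."""
    h = [activation(v) for v in affine(x, w_h, b_h)]  # H(x, Wh)
    t = [sigmoid(v) for v in affine(x, w_t, b_t)]     # T(x, Wt), assumed sigmoid gate
    return [hi * ti + xi * (1.0 - ti) for hi, ti, xi in zip(h, t, x)]


# Hypothetical toy parameters for a 3-dimensional input.
rng = random.Random(0)
dim = 3
w_h = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
w_t = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
b_h = [0.0] * dim
b_t = [-20.0] * dim  # strongly negative gate bias, so T is close to 0

x = [0.5, -1.0, 2.0]
y = highway_layer(x, w_h, b_h, w_t, b_t)
```

With the gate almost closed, the layer passes its input through nearly unchanged, which is why the output must have the same shape as the input.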

neuralmonkey.nn.mlp module

class neuralmonkey.nn.mlp.MultilayerPerceptron(mlp_input: tensorflow.python.framework.ops.Tensor, layer_configuration: typing.List[int], dropout_keep_prob: float, output_size: int, train_mode: tensorflow.python.framework.ops.Tensor, activation_fn: typing.Callable[[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor] = <function relu>, name: str = 'multilayer_perceptron') → None

Bases: object

General implementation of the multilayer perceptron.


neuralmonkey.nn.noisy_gru_cell module

class neuralmonkey.nn.noisy_gru_cell.NoisyGRUCell(num_units: int, training) → None

Bases: tensorflow.python.ops.rnn_cell_impl.RNNCell

Gated Recurrent Unit (GRU) cell with noisy activation functions. Theano code for the noisy activations is also available.

It is based on the TensorFlow implementation of GRU; only the activation functions are replaced by their noisy counterparts.

neuralmonkey.nn.noisy_gru_cell.noisy_activation(x, generic, linearized, training, alpha: float = 1.1, c: float = 0.5)

Apply the noisy activation.

Implements the noisy activation with Half-Normal Noise for Hard-Saturation functions.

See Algorithm 1 in the paper.

  • x – Tensor which is an input to the activation function
  • generic – The generic formulation of the activation function. (denoted as h in the paper)
  • linearized – Linearization of the activation based on the first-order Taylor expansion around zero. (denoted as u in the paper)
  • training – A boolean tensor telling whether we are in the training stage (and the noise is sampled) or at runtime, when the expectation is used instead.
  • alpha – Mixing hyper-parameter. The leakage rate from the linearized function to the nonlinear one.
  • c – Standard deviation of the sampled noise.
neuralmonkey.nn.noisy_gru_cell.noisy_sigmoid(x, training)
neuralmonkey.nn.noisy_gru_cell.noisy_tanh(x, training)
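A scalar sketch of how the parameters of noisy_activation could fit together is given below. It assumes the commonly described formulation of the algorithm: the noise magnitude grows with the gap between h and u, a half-normal sample is drawn during training, and its expectation sqrt(2/pi) is used otherwise. The function name noisy_activation_sketch, the fixed parameter p (learned in practice), and the hard tanh example are illustrative assumptions, not the library code.

```python
import math
import random


def sign(v):
    return (v > 0) - (v < 0)


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def noisy_activation_sketch(x, generic, linearized, training,
                            alpha=1.1, c=0.5, p=1.0, rng=random):
    """Scalar sketch of a noisy activation with half-normal noise.

    p stands in for a learned parameter and is fixed here (assumption).
    """
    delta = generic(x) - linearized(x)           # h(x) - u(x)
    direction = -sign(x) * sign(1.0 - alpha)     # sign of the noise term
    scale = c * (sigmoid(p * delta) - 0.5) ** 2  # zero wherever h(x) == u(x)
    if training:
        noise = abs(rng.gauss(0.0, 1.0))         # half-normal sample
    else:
        noise = math.sqrt(2.0 / math.pi)         # expectation of |N(0, 1)|
    return (alpha * generic(x) + (1.0 - alpha) * linearized(x)
            + direction * scale * noise)


def hard_tanh(v):
    return max(-1.0, min(1.0, v))


def identity(v):
    return v


# In the linear region h and u coincide, so no noise is added.
unsaturated = noisy_activation_sketch(0.5, hard_tanh, identity, training=False)
# In the saturated region the noise term is non-zero.
saturated = noisy_activation_sketch(3.0, hard_tanh, identity, training=False)
```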

neuralmonkey.nn.ortho_gru_cell module

class neuralmonkey.nn.ortho_gru_cell.NematusGRUCell(rnn_size, use_state_bias=False, use_input_bias=True)

Bases: tensorflow.python.ops.rnn_cell_impl.GRUCell

Nematus implementation of gated recurrent unit cell.

The main difference is the order in which the gating functions and linear projections are applied to the hidden state.

The math is equivalent; in practice, the results differ slightly because of floating-point precision errors.

call(inputs, state)

Gated recurrent unit (GRU) with nunits cells.

class neuralmonkey.nn.ortho_gru_cell.OrthoGRUCell(num_units, activation=None, reuse=None, bias_initializer=None)

Bases: tensorflow.python.ops.rnn_cell_impl.GRUCell

Classic GRU cell but initialized using random orthogonal matrices.
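One common way to obtain a random orthogonal matrix is to orthonormalize random Gaussian vectors. The sketch below does this with Gram-Schmidt in plain Python. It only illustrates the idea and is not necessarily how OrthoGRUCell initializes its weights; the helper name random_orthogonal is hypothetical.

```python
import math
import random


def random_orthogonal(n, rng):
    """Build an n x n matrix with orthonormal rows via Gram-Schmidt."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows accepted so far.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # resample in the unlikely degenerate case
            rows.append([a / norm for a in v])
    return rows


q = random_orthogonal(4, random.Random(0))
# Q Q^T should be (numerically) the identity matrix.
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in q] for r1 in q]
```

Orthogonal recurrent matrices preserve the norm of the vectors they multiply, which is the usual motivation for this kind of initialization.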

neuralmonkey.nn.pervasive_dropout_wrapper module

class neuralmonkey.nn.pervasive_dropout_wrapper.PervasiveDropoutWrapper(cell, mask, scale) → None

Bases: tensorflow.python.ops.rnn_cell_impl.RNNCell


neuralmonkey.nn.projection module

Module which implements various types of projections.

neuralmonkey.nn.projection.glu(input_: tensorflow.python.framework.ops.Tensor, gating_fn: typing.Callable[[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor] = <function sigmoid>) → tensorflow.python.framework.ops.Tensor

Apply a Gated Linear Unit.

Gated Linear Unit - Dauphin et al. (2016).
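As a rough illustration, a gated linear unit multiplies a linear part element-wise by a gated part. The sketch below assumes the input vector is split into two equal halves along its last dimension, with the first half acting as the linear part and the second as the gate input. The name glu_sketch and the split order are assumptions.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def glu_sketch(vector, gating_fn=sigmoid):
    """Split the vector into two halves and gate the first with the second."""
    if len(vector) % 2 != 0:
        raise ValueError("GLU input size must be even")
    half = len(vector) // 2
    linear, gate = vector[:half], vector[half:]
    return [a * gating_fn(b) for a, b in zip(linear, gate)]


# Gate inputs 0.0 and 100.0 give gate values of 0.5 and (almost exactly) 1.0.
out = glu_sketch([1.0, -2.0, 0.0, 100.0])
```

Note that the output has half the size of the input under this assumption.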

neuralmonkey.nn.projection.maxout(inputs: tensorflow.python.framework.ops.Tensor, size: int, scope: str = 'MaxoutProjection') → tensorflow.python.framework.ops.Tensor

Apply a maxout operation.

Implementation of Maxout layer (Goodfellow et al., 2013).

z = Wx + b
y_i = max(z_{2i-1}, z_{2i})

  • inputs – A tensor or a list of tensors. Each must be a 2D tensor, and all must have the same length in the first dimension (batch size).
  • size – The size of dimension 1 of the output tensor.
  • scope – The name of the scope used for the variables.

Returns a tensor of shape batch x size.
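The maxout formula can be sketched for a single input vector as follows. This is a plain-Python illustration rather than the TensorFlow implementation; the function name maxout_sketch and the toy parameters are hypothetical. For an output of size n, the projection W must have 2n rows.

```python
def maxout_sketch(x, weights, bias):
    """z = W x + b, then y_i = max over the i-th consecutive pair of z."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    if len(z) % 2 != 0:
        raise ValueError("projection size must be even")
    # 0-based pairs (z[0], z[1]), (z[2], z[3]), ...
    return [max(z[i], z[i + 1]) for i in range(0, len(z), 2)]


# Hypothetical parameters: input size 2, output size 2, so W has 4 rows.
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.5, 0.0]
y = maxout_sketch([2.0, 3.0], weights, bias)
```

Here z is [2.0, 3.0, -1.5, 5.0], so the two pairwise maxima are 3.0 and 5.0.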

neuralmonkey.nn.projection.multilayer_projection(input_: tensorflow.python.framework.ops.Tensor, layer_sizes: typing.List[int], train_mode: tensorflow.python.framework.ops.Tensor, activation: typing.Callable[[tensorflow.python.framework.ops.Tensor], tensorflow.python.framework.ops.Tensor] = <function relu>, dropout_keep_prob: float = 1.0, scope: str = 'mlp') → tensorflow.python.framework.ops.Tensor

neuralmonkey.nn.utils module

Module which provides utility functions used across the package.

neuralmonkey.nn.utils.dropout(variable: tensorflow.python.framework.ops.Tensor, keep_prob: float, train_mode: tensorflow.python.framework.ops.Tensor) → tensorflow.python.framework.ops.Tensor

Perform dropout on a variable, depending on mode.

  • variable – The variable to be dropped out
  • keep_prob – The probability of keeping a value in the variable
  • train_mode – A bool Tensor specifying whether dropout is applied
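The mode-dependent behavior can be sketched in plain Python as below. The sketch assumes inverted dropout, where kept values are scaled by 1 / keep_prob during training so that their expected value is unchanged, and the input is returned untouched otherwise. The name dropout_sketch is hypothetical, and the library operates on tensors rather than lists.

```python
import random


def dropout_sketch(values, keep_prob, train_mode, rng=random):
    """Inverted dropout on a list of floats; identity outside training."""
    if not train_mode:
        return list(values)
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    # Kept values are scaled by 1 / keep_prob; dropped values become 0.
    return [v / keep_prob if rng.random() < keep_prob else 0.0
            for v in values]


values = [1.0] * 1000
train_out = dropout_sketch(values, keep_prob=0.8, train_mode=True,
                           rng=random.Random(0))
eval_out = dropout_sketch(values, keep_prob=0.8, train_mode=False)
```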

Module contents