neuralmonkey.encoders.imagenet_encoder module

Pre-trained ImageNet networks.

class neuralmonkey.encoders.imagenet_encoder.ImageNet(name: str, data_id: str, network_type: str, slim_models_path: str, load_checkpoint: str = None, spatial_layer: str = None, encoded_layer: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Bases: neuralmonkey.model.model_part.ModelPart, neuralmonkey.model.stateful.SpatialStatefulWithOutput

Pre-trained ImageNet network.

We use the ImageNet networks as they are provided in the tensorflow/models repository (https://github.com/tensorflow/models). In order to use them, you need to clone the repository and configure the ImageNet object with the full path to “research/slim” inside the clone. Visit https://github.com/tensorflow/models/tree/master/research/slim for information about checkpoints of the pre-trained models.
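In an experiment, the encoder is normally set up from the INI configuration file; the following is a minimal Python sketch of a direct instantiation, where the paths, the network type and the endpoint name are illustrative assumptions only:

    from neuralmonkey.encoders.imagenet_encoder import ImageNet

    # All paths and identifiers below are placeholders for illustration.
    imagenet = ImageNet(
        name="imagenet_vgg",
        data_id="images",
        network_type="vgg_16",
        slim_models_path="/path/to/tensorflow-models/research/slim",
        load_checkpoint="/path/to/vgg_16.ckpt",
        spatial_layer="vgg_16/conv5/conv5_3")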

__init__(name: str, data_id: str, network_type: str, slim_models_path: str, load_checkpoint: str = None, spatial_layer: str = None, encoded_layer: str = None, initializers: List[Tuple[str, Callable]] = None) → None

Initialize pre-trained ImageNet network.

Parameters:
  • name – Name of the model part (the ImageNet network will be in its own scope, independently of name).
  • data_id – ID of the data series with images (a list of 3D numpy arrays).
  • network_type – Identifier of the ImageNet network from TF-Slim.
  • spatial_layer – String identifier of the convolutional map (the model’s endpoint). Check the TF-Slim documentation for the endpoint specifications.
  • encoded_layer – String identifier of the network layer that will be used as input to a decoder. None means the convolutional maps are averaged.
  • slim_models_path – Path to the Slim models in the tensorflow/models repository.
  • load_checkpoint – Checkpoint file from which the pre-trained network is loaded.
end_points
feed_dict(dataset: neuralmonkey.dataset.Dataset, train: bool = False) → Dict[tensorflow.python.framework.ops.Tensor, Any]

Return a feed dictionary for the given feedable object.

Parameters:
  • dataset – A dataset instance from which to get the data.
  • train – Boolean indicating whether the model runs in training mode.
Returns:

A FeedDict dictionary object.
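A hedged sketch of feeding the encoder outside of the standard runners (the dataset construction and session handling are simplified assumptions; normally the experiment runners take care of this):

    import tensorflow as tf

    # Assuming `imagenet` is an ImageNet instance and `dataset` is a
    # neuralmonkey.dataset.Dataset containing the configured image series.
    fd = imagenet.feed_dict(dataset, train=False)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        states, mask = sess.run(
            [imagenet.spatial_states, imagenet.spatial_mask], feed_dict=fd)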

input_image
input_shapes
input_types
output

Return the object output.

A 2D Tensor of shape (batch, state_size) which contains the resulting state of the object.

spatial_mask

Return mask for the spatial_states.

A 3D Tensor of shape (batch, width, height) of type float32 which masks the spatial states so that they can be of different shapes. The mask should only contain ones or zeros.

spatial_states

Return object states in space.

A 4D Tensor of shape (batch, width, height, state_size) which contains the states of the object in space (e.g., the final layer of a convolutional network processing an image).
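When encoded_layer is None, the output property corresponds to an average of the spatial states over the width and height dimensions. A minimal sketch of that reduction (not the encoder’s actual code):

    import tensorflow as tf

    # spatial_states: (batch, width, height, state_size)
    spatial_states = tf.placeholder(tf.float32, [None, 7, 7, 512])
    # Averaging over the two spatial dimensions yields the 2D
    # (batch, state_size) output described above.
    output = tf.reduce_mean(spatial_states, axis=[1, 2])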

class neuralmonkey.encoders.imagenet_encoder.ImageNetSpec

Bases: neuralmonkey.encoders.imagenet_encoder.ImageNetSpec

Specification of the Imagenet encoder.

Do not use this object directly; instead, use one of the get_* functions in this module.

scope

The variable scope of the network to use.

image_size

A tuple of two integers giving the image width and height in pixels.

apply_net

The function that receives an image and applies the network.
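A hedged sketch of how a caller might read the specification fields; the single-argument call to apply_net and the handling of its return value are assumptions, since only the field descriptions above are documented:

    import tensorflow as tf
    from neuralmonkey.encoders.imagenet_encoder import get_alexnet

    spec = get_alexnet()

    # image_size is documented as (width, height) in pixels.
    width, height = spec.image_size
    images = tf.placeholder(tf.float32, [None, height, width, 3])

    # apply_net is documented only as "receives an image and applies the
    # network"; the single-argument call and the returned value are
    # assumptions here.  spec.scope identifies the variable scope in which
    # the network variables live.
    features = spec.apply_net(images)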

neuralmonkey.encoders.imagenet_encoder.get_alexnet() → neuralmonkey.encoders.imagenet_encoder.ImageNetSpec
neuralmonkey.encoders.imagenet_encoder.get_resnet_by_type(resnet_type: str) → Callable[[], neuralmonkey.encoders.imagenet_encoder.ImageNetSpec]
neuralmonkey.encoders.imagenet_encoder.get_vgg_by_type(vgg_type: str) → Callable[[], neuralmonkey.encoders.imagenet_encoder.ImageNetSpec]
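Note the asymmetry in the signatures above: get_alexnet returns a specification directly, while get_resnet_by_type and get_vgg_by_type return zero-argument callables that build the specification. A short hedged sketch (the concrete type identifiers follow the TF-Slim naming and are assumptions here):

    from neuralmonkey.encoders.imagenet_encoder import (
        get_alexnet, get_resnet_by_type, get_vgg_by_type)

    alexnet_spec = get_alexnet()
    resnet_spec = get_resnet_by_type("resnet_v2_50")()  # identifier assumed
    vgg_spec = get_vgg_by_type("vgg_16")()              # identifier assumed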