neuralmonkey.encoders.imagenet_encoder module
Pre-trained ImageNet networks.
-
class neuralmonkey.encoders.imagenet_encoder.ImageNet(name: str, data_id: str, network_type: str, slim_models_path: str, load_checkpoint: str = None, spatial_layer: str = None, encoded_layer: str = None, initializers: List[Tuple[str, Callable]] = None) → None
Bases: neuralmonkey.model.model_part.ModelPart, neuralmonkey.model.stateful.SpatialStatefulWithOutput
Pre-trained ImageNet network.
We use the ImageNet networks as they are in the tensorflow/models repository (https://github.com/tensorflow/models). In order to use them, you need to clone the repository and configure the ImageNet object with the full path to “research/slim” in the repository. Visit https://github.com/tensorflow/models/tree/master/research/slim for information about checkpoints of the pre-trained models.
-
__init__(name: str, data_id: str, network_type: str, slim_models_path: str, load_checkpoint: str = None, spatial_layer: str = None, encoded_layer: str = None, initializers: List[Tuple[str, Callable]] = None) → None
Initialize pre-trained ImageNet network.
Parameters: - name – Name of the model part. The ImageNet network itself lives in its own variable scope, independent of this name.
- data_id – Id of series with images (list of 3D numpy arrays)
- network_type – Identifier of ImageNet network from TFSlim.
- spatial_layer – String identifier of the convolutional map (model’s endpoint). Check TFSlim documentation for end point specifications.
- encoded_layer – String id of the network layer that will be used as input of a decoder. None means averaging the convolutional maps.
- slim_models_path – Path to the “research/slim” directory in a clone of the tensorflow/models repository.
- load_checkpoint – Checkpoint file from which the pre-trained network is loaded.
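The slim_models_path argument has to point at the “research/slim” directory inside a local clone of tensorflow/models. A minimal sketch of resolving and checking that path before configuring the encoder is given here. The helper name `resolve_slim_path` is hypothetical and is not part of Neural Monkey.

```python
from pathlib import Path


def resolve_slim_path(models_repo: str) -> str:
    """Return the absolute path of research/slim inside a tensorflow/models clone.

    Hypothetical helper for illustration; not part of Neural Monkey.
    """
    slim_dir = Path(models_repo).expanduser().resolve() / "research" / "slim"
    if not slim_dir.is_dir():
        # Fail early with a clear message instead of a late import error.
        raise FileNotFoundError(
            "No research/slim directory found under {}".format(models_repo))
    return str(slim_dir)
```

The returned string could then be supplied as slim_models_path.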
-
end_points
-
feed_dict(dataset: neuralmonkey.dataset.Dataset, train: bool = False) → Dict[tensorflow.python.framework.ops.Tensor, Any]
Return a feed dictionary for the given feedable object.
Parameters: - dataset – A dataset instance from which to get the data.
- train – Boolean indicating whether the model runs in training mode.
Returns: A FeedDict dictionary object.
-
input_image
-
input_shapes
-
input_types
-
output
Return the object output.
A 2D Tensor of shape (batch, state_size) which contains the resulting state of the object.
-
spatial_mask
Return mask for the spatial_states.
A 3D Tensor of shape (batch, width, height) of type float32 that masks the spatial states so that they can have different shapes. The mask should contain only ones and zeros.
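To make the mask format concrete, here is a generic sketch of a (batch, width, height) mask for feature maps of different sizes padded to a common shape. It only illustrates the layout described above and is not how this class computes its mask; the function name `build_spatial_mask` and the example sizes are assumptions.

```python
from typing import List, Tuple


def build_spatial_mask(sizes: List[Tuple[int, int]],
                       max_width: int,
                       max_height: int) -> List[List[List[float]]]:
    """Build a (batch, width, height) mask of 1.0 and 0.0 values.

    A position is 1.0 when it lies inside the valid (width, height) region
    of the given batch item and 0.0 when it is padding.
    """
    return [
        [[1.0 if w < width and h < height else 0.0
          for h in range(max_height)]
         for w in range(max_width)]
        for width, height in sizes
    ]


# Two items with valid regions of 2x3 and 3x1, padded to 3x3.
mask = build_spatial_mask([(2, 3), (3, 1)], max_width=3, max_height=3)
```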
-
spatial_states
Return object states in space.
A 4D Tensor of shape (batch, width, height, state_size) which contains the states of the object in space (e.g. the final layer of a convolutional network processing an image).
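When encoded_layer is None, the output is described as an average of the convolutional maps. The following NumPy sketch shows what such averaging does to the shapes: a (batch, width, height, state_size) array is reduced to (batch, state_size). It is an illustration under that assumption and not the library's TensorFlow code; `average_spatial_states` is a hypothetical name.

```python
import numpy as np


def average_spatial_states(spatial_states: np.ndarray) -> np.ndarray:
    """Average a (batch, width, height, state_size) array over width and height."""
    if spatial_states.ndim != 4:
        raise ValueError("Expected a 4D array (batch, width, height, state_size)")
    # Axes 1 and 2 are the spatial dimensions; batch and state_size are kept.
    return spatial_states.mean(axis=(1, 2))


# Toy input: batch of 2, a 2x2 spatial grid, state_size of 3.
states = np.arange(2 * 2 * 2 * 3, dtype=np.float32).reshape(2, 2, 2, 3)
pooled = average_spatial_states(states)
print(pooled.shape)  # (2, 3)
```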
-
-
class neuralmonkey.encoders.imagenet_encoder.ImageNetSpec
Bases: neuralmonkey.encoders.imagenet_encoder.ImageNetSpec
Specification of the ImageNet encoder.
Do not use this object directly; instead, use one of the ``get_*`` functions in this module.
-
scope
The variable scope of the network to use.
-
image_size
A tuple of two integers giving the image width and height in pixels.
-
apply_net
The function that receives an image and applies the network.
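The three attributes above can be pictured as a small named tuple. The sketch below uses a hypothetical stand-in class, `ExampleSpec`, with the same field names. The field types other than image_size, the placeholder network, and the 224×224 size are assumptions chosen for illustration only.

```python
from typing import Any, Callable, NamedTuple, Tuple


class ExampleSpec(NamedTuple):
    """Hypothetical stand-in with the same three fields as ImageNetSpec."""
    scope: Any                       # variable scope of the network (type assumed)
    image_size: Tuple[int, int]      # (width, height) in pixels
    apply_net: Callable[[Any], Any]  # receives an image, applies the network


def identity_net(image: Any) -> Any:
    """Placeholder network that returns its input unchanged."""
    return image


# All values here are illustrative, not taken from the library.
spec = ExampleSpec(scope="example_scope",
                   image_size=(224, 224),
                   apply_net=identity_net)
width, height = spec.image_size
```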
-
-
neuralmonkey.encoders.imagenet_encoder.get_alexnet() → neuralmonkey.encoders.imagenet_encoder.ImageNetSpec
-
neuralmonkey.encoders.imagenet_encoder.get_resnet_by_type(resnet_type: str) → Callable[[], neuralmonkey.encoders.imagenet_encoder.ImageNetSpec]
-
neuralmonkey.encoders.imagenet_encoder.get_vgg_by_type(vgg_type: str) → Callable[[], neuralmonkey.encoders.imagenet_encoder.ImageNetSpec]
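According to the signatures, get_resnet_by_type and get_vgg_by_type return a zero-argument callable rather than an ImageNetSpec itself, so the specification is only produced when that callable is invoked. The generic sketch below shows this factory pattern with hypothetical names. The registry keys and sizes are placeholders and are not valid type strings of this module.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry; keys and sizes are placeholders, not library values.
_EXAMPLE_SIZES: Dict[str, Tuple[int, int]] = {
    "small": (64, 64),
    "large": (128, 128),
}


def get_example_by_type(net_type: str) -> Callable[[], Tuple[int, int]]:
    """Return a zero-argument function that produces the spec for net_type."""
    if net_type not in _EXAMPLE_SIZES:
        raise ValueError("Unknown network type: {}".format(net_type))

    def get_spec() -> Tuple[int, int]:
        # The spec is looked up only when the returned function is called.
        return _EXAMPLE_SIZES[net_type]

    return get_spec


get_small = get_example_by_type("small")
small_spec = get_small()
```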