neuralmonkey.processors.speech module

neuralmonkey.processors.speech.SpeechFeaturesPreprocessor(feature_type: str = 'mfcc', delta_order: int = 0, delta_window: int = 2, **kwargs) → Callable

Calculate speech features.

First, features of the given type (e.g. MFCC) are computed using a window of length winlen and step winstep; these and the other keyword arguments (specific to each feature type) are documented at http://python-speech-features.readthedocs.io/. Then, delta features of each order up to delta_order are appended.

By default, 13 MFCCs per frame are computed. Setting delta_order=2 appends delta and delta-delta features, giving 3 × 13 = 39 coefficients per frame.
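The delta stacking described above can be sketched with NumPy alone. This is an illustrative sketch, not the module's implementation: the helper names delta and stack_deltas are hypothetical, the regression-style delta formula and the edge padding are assumptions, and random numbers stand in for real MFCCs.

```python
import numpy as np


def delta(features: np.ndarray, window: int) -> np.ndarray:
    """Regression-style delta over `window` frames on each side.

    Assumed formula for illustration; edge frames are repeated as padding.
    """
    denom = 2 * sum(n * n for n in range(1, window + 1))
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    weights = np.arange(-window, window + 1)
    out = np.empty(features.shape, dtype=float)
    for t in range(features.shape[0]):
        # Weighted sum of the 2 * window + 1 frames centred on frame t.
        out[t] = weights @ padded[t:t + 2 * window + 1] / denom
    return out


def stack_deltas(base: np.ndarray, delta_order: int = 0,
                 delta_window: int = 2) -> np.ndarray:
    """Append deltas of order 1..delta_order along the feature axis."""
    features = [base]
    for _ in range(delta_order):
        # Each order is the delta of the previous order.
        features.append(delta(features[-1], delta_window))
    return np.concatenate(features, axis=1)


# Placeholder input: 50 frames of 13 coefficients (not real MFCCs).
rng = np.random.default_rng(0)
base = rng.standard_normal((50, 13))

stacked = stack_deltas(base, delta_order=2)
print(stacked.shape)  # (50, 39)
```

With delta_order=0 the input is returned unchanged in shape, so the number of features per frame is the base feature count times (1 + delta_order).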

Parameters:
  • feature_type – type of features to compute: mfcc, fbank, logfbank or ssc (default is mfcc)
  • delta_order – maximum order of the delta features (default is 0)
  • delta_window – window size for delta features, in frames on each side of the current frame (default is 2)
  • **kwargs – keyword arguments for the appropriate function from python_speech_features
Returns:

A callable that computes the features for an audio input and returns a numpy array of shape [num_frames, num_features].