Old: Theano Attention Parameters
These settings are for the Theano backend and are not compatible with the TensorFlow backend. They apply to the rec-layer.
Mandatory parameters
- base: The layer that the attention mechanism uses as its base. If you don't specify this parameter, the encoder is taken as the base.
- recurrent_transform: If this parameter is set to "attention_list", attention is enabled for the layer. A minimal config sketch follows below.
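As a minimal sketch, this is how the two mandatory parameters might appear in a network config dict (RETURNN configs are Python). The layer names ("encoder", "output") and all other layer settings here are illustrative assumptions, not prescribed by this page:

```python
# Sketch of a rec-layer with attention enabled (Theano backend).
# Only "base" and "recurrent_transform" are the parameters described above;
# everything else (layer names, unit type, sizes) is an assumed example.
network = {
    "encoder": {"class": "rec", "unit": "lstm", "n_out": 512},
    "output": {
        "class": "rec",
        "unit": "lstm",
        "n_out": 512,
        "base": ["encoder"],                      # layer to attend over
        "recurrent_transform": "attention_list",  # enables attention
    },
}
```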
Additional parameters
- attention_template: Size of the template vector for attention. Default: 128
- attention_distance: Distance function used to create the energy vector from the previous decoder state and the encoder states (see the sketch after this list for how it interacts with the other parameters). Possible values:
  - "l2": Euclidean distance
  - "sqr": squared distance
  - "dot": dot product
  - "l1": L1 norm
  - "cos": cosine similarity
  - "rnn": exponential linear units [https://arxiv.org/pdf/1511.07289v1.pdf]

  Default: "l2"
- attention_norm: Normalization applied to the energies to obtain the alpha weights for attention. Possible values:
  - "exp": exponential normalization
  - "sigmoid": sigmoid normalization
  - "lstm": normalization with an LSTM

  Default: "exp"
- attention_sharpening: Factor by which the attention weights are sharpened/scaled. Default: 1.0
- attention_nbest: Keeps only the n highest alpha weights, so the layer attends to the corresponding states alone instead of the entire sequence.
- attention_glimpse: Number of glimpses into previous decoder states. Default: 1
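To make the interaction of attention_distance, attention_sharpening, attention_norm, and attention_nbest concrete, here is a small NumPy sketch of how the alpha weights could be derived from a decoder state and the encoder states. This is an illustration of the options named above, not the actual Theano implementation; the function and variable names are made up:

```python
import numpy as np

def alphas(dec_state, enc_states, distance="l2", norm="exp",
           sharpening=1.0, nbest=None):
    """Illustrative computation of attention weights (alphas).

    dec_state:  shape (n,),   previous decoder state (template vector)
    enc_states: shape (T, n), encoder states, one row per input position
    """
    # 1. attention_distance: one energy per encoder position.
    #    Distances are negated so that closer states get higher energy.
    if distance == "l2":
        energies = -np.linalg.norm(enc_states - dec_state, axis=1)
    elif distance == "sqr":
        energies = -np.sum((enc_states - dec_state) ** 2, axis=1)
    elif distance == "dot":
        energies = enc_states @ dec_state
    elif distance == "l1":
        energies = -np.sum(np.abs(enc_states - dec_state), axis=1)
    elif distance == "cos":
        energies = (enc_states @ dec_state) / (
            np.linalg.norm(enc_states, axis=1) * np.linalg.norm(dec_state))
    else:
        raise ValueError("the 'rnn' distance needs a learned network")

    # 2. attention_sharpening: scale the energies before normalizing.
    energies = sharpening * energies

    # 3. attention_norm: turn energies into (unnormalized) weights.
    if norm == "exp":  # softmax-style exponential normalization
        w = np.exp(energies - energies.max())
    elif norm == "sigmoid":
        w = 1.0 / (1.0 + np.exp(-energies))
    else:
        raise ValueError("the 'lstm' norm needs a learned network")

    # 4. attention_nbest: keep only the n largest weights.
    if nbest is not None:
        keep = np.argsort(w)[-nbest:]
        mask = np.zeros_like(w)
        mask[keep] = 1.0
        w = w * mask

    return w / w.sum()
```

With a sharpening factor greater than 1.0 the resulting distribution becomes more peaked, which is the effect attention_sharpening is meant to control; attention_nbest additionally zeroes out all but the n strongest positions before renormalizing.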