Language Modeling¶
model_word_ada.adaptive module¶
class model_word_ada.adaptive.AdaptiveSoftmax(input_size, cutoff)[source]¶
The adaptive softmax layer. Modified from: https://github.com/rosinality/adaptive-softmax-pytorch/blob/master/adasoft.py

Parameters:
- input_size (int, required.) – The input dimension.
- cutoff (list, required.) – The list of cutoff values.
forward(w_in, target)[source]¶
Calculate the log-likelihood without computing the full distribution.

Parameters:
- w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
- target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).

Returns: loss – The NLL loss.
Return type: torch.FloatTensor.
log_prob(w_in, device)[source]¶
Calculate the log-probability for the whole dictionary.

Parameters:
- w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
- device (torch.device, required.) – the target device for calculation.

Returns: prob – The full log-probability.
Return type: torch.FloatTensor.
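A minimal usage sketch (not from the source). It assumes AdaptiveSoftmax is a torch.nn.Module, so forward runs through the module call; the sizes and cutoff values are illustrative.

```python
import torch
from model_word_ada.adaptive import AdaptiveSoftmax

# Illustrative sizes: hidden dimension 256, 100k-word vocabulary
# partitioned into frequency clusters by the cutoff list.
adasoft = AdaptiveSoftmax(input_size=256, cutoff=[2000, 10000, 100000])

w_in = torch.randn(700, 256)                # (word_num, input_dim)
target = torch.randint(0, 100000, (700,))   # (word_num,)

loss = adasoft(w_in, target)                # NLL loss, no full softmax computed

# Evaluation: materialize log-probabilities over the whole dictionary.
log_probs = adasoft.log_prob(w_in, torch.device('cpu'))
```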
model_word_ada.basic module¶
class model_word_ada.basic.BasicRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]¶
The multi-layer recurrent network for the vanilla stacked RNNs.

Parameters:
- layer_num (int, required.) – The number of layers.
- unit (str, required.) – The type of rnn unit.
- emb_dim (int, required.) – The input dimension of the first unit.
- hid_dim (int, required.) – The hidden dimension of the units.
- droprate (float, required.) – The dropout ratio.
forward(x)[source]¶
Calculate the output.

Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns: output – The output of RNNs.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
class model_word_ada.basic.BasicUnit(unit, input_dim, hid_dim, droprate)[source]¶
The basic recurrent unit for the vanilla stacked RNNs.

Parameters:
- unit (str, required.) – The type of rnn unit.
- input_dim (int, required.) – The input dimension of the unit.
- hid_dim (int, required.) – The hidden dimension of the unit.
- droprate (float, required.) – The dropout ratio.
forward(x)[source]¶
Calculate the output.

Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns: output – The output of RNNs.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
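A construction sketch (not from the source): following the BasicUnit entry above, unit is passed as a string naming the rnn type; the dimensions are illustrative.

```python
import torch
from model_word_ada.basic import BasicRNN

# Two stacked LSTM layers; 'lstm' as the unit name follows the
# BasicUnit docs above. All dimensions are illustrative.
rnn = BasicRNN(layer_num=2, unit='lstm', emb_dim=300, hid_dim=300, droprate=0.5)
rnn.init_hidden()               # reset hidden states before a new sequence

x = torch.randn(35, 20, 300)    # (seq_len, batch_size, emb_dim)
output = rnn(x)                 # (seq_len, batch_size, hid_dim)
```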
model_word_ada.dataset module¶
class model_word_ada.dataset.EvalDataset(dataset, sequence_length)[source]¶
Dataset for Language Modeling.

Parameters:
- dataset (list, required.) – The encoded dataset (outputs of preprocess scripts).
- sequence_length (int, required.) – Sequence length.
class model_word_ada.dataset.LargeDataset(root, range_idx, batch_size, sequence_length)[source]¶
Lazy Dataset for Language Modeling.

Parameters:
- root (str, required.) – The root folder for dataset files.
- range_idx (int, required.) – The maximum file index for the input files (train_*.pk).
- batch_size (int, required.) – Batch size.
- sequence_length (int, required.) – Sequence length.
get_tqdm(device)[source]¶
Construct the dataset reader and the corresponding tqdm wrapper.

Parameters: device (torch.device, required.) – the target device for the dataset loader.
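A sketch of iterating over the lazy dataset (not from the source). The directory layout and the shape of each batch are assumptions; the docs above only state that get_tqdm builds the reader and its tqdm wrapper.

```python
import torch
from model_word_ada.dataset import LargeDataset

# Assumes encoded files train_0.pk ... train_9.pk live under ./data/.
loader = LargeDataset(root='./data/', range_idx=10,
                      batch_size=20, sequence_length=35)

for batch in loader.get_tqdm(torch.device('cpu')):
    # each batch is presumably an (input, target) pair ready for the LM
    pass
```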
model_word_ada.densenet module¶
class model_word_ada.densenet.BasicUnit(unit, input_dim, increase_rate, droprate)[source]¶
The basic recurrent unit for the densely connected RNNs.

Parameters:
- unit (torch.nn.Module, required.) – The type of rnn unit.
- input_dim (int, required.) – The input dimension of the unit.
- increase_rate (int, required.) – The hidden dimension of the unit, i.e., the growth of the feature dimension contributed by each unit.
- droprate (float, required.) – The dropout ratio.
forward(x)[source]¶
Calculate the output.

Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns: output – The output of RNNs.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
class model_word_ada.densenet.DenseRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]¶
The multi-layer recurrent network for the densely connected RNNs.

Parameters:
- layer_num (int, required.) – The number of layers.
- unit (torch.nn.Module, required.) – The type of rnn unit.
- emb_dim (int, required.) – The input dimension of the first unit.
- hid_dim (int, required.) – The hidden dimension of the units.
- droprate (float, required.) – The dropout ratio.
forward(x)[source]¶
Calculate the output.

Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns: output – The output of RNNs.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
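A sketch for the densely connected stack (not from the source; sizes are illustrative, and passing unit='lstm' as a string follows the basic.BasicUnit convention). Since each layer's output is concatenated with its input, the feature dimension presumably grows by hid_dim per layer.

```python
import torch
from model_word_ada.densenet import DenseRNN

# Five densely connected layers; the output feature dimension is
# presumably emb_dim + layer_num * hid_dim = 300 + 5 * 100 = 800.
rnn = DenseRNN(layer_num=5, unit='lstm', emb_dim=300, hid_dim=100, droprate=0.5)
rnn.init_hidden()

x = torch.randn(35, 20, 300)    # (seq_len, batch_size, emb_dim)
output = rnn(x)
```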
model_word_ada.ldnet module¶
class model_word_ada.ldnet.BasicUnit(unit, input_dim, increase_rate, droprate, layer_drop=0)[source]¶
The basic recurrent unit for the densely connected RNNs with layer-wise dropout.

Parameters:
- unit (torch.nn.Module, required.) – The type of rnn unit.
- input_dim (int, required.) – The input dimension of the unit.
- increase_rate (int, required.) – The hidden dimension of the unit, i.e., the growth of the feature dimension contributed by each unit.
- droprate (float, required.) – The dropout ratio.
- layer_drop (float, optional, default 0.) – The layer-wise dropout ratio.
forward(x, p_out)[source]¶
Calculate the output.

Parameters:
- x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
- p_out (torch.FloatTensor, required.) – the final output tensor for the softmax, of shape (seq_len, batch_size, input_dim).

Returns:
- out (torch.FloatTensor) – The undropped outputs of RNNs, passed to the softmax.
- p_out (torch.FloatTensor) – The dropped outputs of RNNs, passed to the next layer.

init_hidden()[source]¶
Initialize hidden states.
class model_word_ada.ldnet.LDRNN(layer_num, unit, emb_dim, hid_dim, droprate, layer_drop)[source]¶
The multi-layer recurrent network for the densely connected RNNs with layer-wise dropout.

Parameters:
- layer_num (int, required.) – The number of layers.
- unit (torch.nn.Module, required.) – The type of rnn unit.
- emb_dim (int, required.) – The input dimension of the first unit.
- hid_dim (int, required.) – The hidden dimension of the units.
- droprate (float, required.) – The dropout ratio.
- layer_drop (float, required.) – The layer-wise dropout ratio.
forward(x)[source]¶
Calculate the output.

Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns: output – The output of RNNs to the softmax.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
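A sketch for the layer-wise-dropout variant (not from the source; settings are illustrative, and the string unit again follows the basic.BasicUnit convention). Relative to DenseRNN, LDRNN additionally drops whole layers' contributions with probability layer_drop during training.

```python
import torch
from model_word_ada.ldnet import LDRNN

rnn = LDRNN(layer_num=5, unit='lstm', emb_dim=300, hid_dim=100,
            droprate=0.5, layer_drop=0.5)   # drop each layer's output w.p. 0.5
rnn.init_hidden()

x = torch.randn(35, 20, 300)    # (seq_len, batch_size, emb_dim)
output = rnn(x)                 # output fed to the softmax layer
```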
model_word_ada.LM module¶
class model_word_ada.LM.LM(rnn, soft_max, w_num, w_dim, droprate, label_dim=-1, add_relu=False)[source]¶
The language model.

Parameters:
- rnn (torch.nn.Module, required.) – The RNN network.
- soft_max (torch.nn.Module, required.) – The softmax layer.
- w_num (int, required.) – The number of words.
- w_dim (int, required.) – The dimension of word embedding.
- droprate (float, required.) – The dropout ratio.
- label_dim (int, optional, default -1.) – The input dimension of the softmax.
- add_relu (bool, optional, default False.) – Whether to add a ReLU between the RNN output and the softmax.
forward(w_in, target)[source]¶
Calculate the loss.

Parameters:
- w_in (torch.LongTensor, required.) – the input tensor of word indices, of shape (seq_len, batch_size).
- target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).

Returns: loss – The NLL loss.
Return type: torch.FloatTensor.

init_hidden()[source]¶
Initialize hidden states.
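An end-to-end sketch gluing the modules above together (not from the source). Because the LM owns a w_num × w_dim word embedding, w_in is assumed to be a tensor of word indices; all sizes are illustrative.

```python
import torch
from model_word_ada.adaptive import AdaptiveSoftmax
from model_word_ada.basic import BasicRNN
from model_word_ada.LM import LM

w_num, w_dim, hid_dim = 100000, 300, 300
rnn = BasicRNN(layer_num=2, unit='lstm', emb_dim=w_dim,
               hid_dim=hid_dim, droprate=0.5)
soft_max = AdaptiveSoftmax(hid_dim, cutoff=[2000, 10000, w_num])
lm = LM(rnn, soft_max, w_num, w_dim, droprate=0.5, label_dim=hid_dim)

lm.init_hidden()
w_in = torch.randint(0, w_num, (35, 20))      # assumed word indices
target = torch.randint(0, w_num, (35 * 20,))  # next-word indices
loss = lm(w_in, target)                       # NLL loss
loss.backward()
```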
model_word_ada.utils module¶
model_word_ada.utils.adjust_learning_rate(optimizer, lr)[source]¶
Adjust the learning rate to the new value.

Parameters:
- optimizer (torch.optim.Optimizer, required.) – the pytorch optimizer.
- lr (float, required.) – the target learning rate.
model_word_ada.utils.repackage_hidden(h)[source]¶
Wraps hidden states in new Variables, to detach them from their history.

Parameters: h (Tuple or Tensors, required.) – Tuple or Tensors, hidden states.
Returns: hidden – detached hidden states.
Return type: Tuple or Tensors.
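A sketch for the two utilities (not from the source; the linear model below is a stand-in). adjust_learning_rate supports manual schedules between epochs, and the hidden-state detach is the standard truncated-BPTT step.

```python
import torch
from model_word_ada.utils import adjust_learning_rate, repackage_hidden

model = torch.nn.Linear(300, 300)           # stand-in for the LM
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)

# Manual learning-rate decay between epochs.
adjust_learning_rate(optimizer, lr=0.5)

# Truncated BPTT: detach hidden states so gradients do not flow
# across batch boundaries.
h = (torch.zeros(2, 20, 300), torch.zeros(2, 20, 300))  # e.g. LSTM (h, c)
h = repackage_hidden(h)
```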