Language Modeling

model_word_ada.adaptive module

class model_word_ada.adaptive.AdaptiveSoftmax(input_size, cutoff)[source]

The adaptive softmax layer. Modified from: https://github.com/rosinality/adaptive-softmax-pytorch/blob/master/adasoft.py

Parameters:
  • input_size (int, required.) – The input dimension.
  • cutoff (list, required.) – The list of cutoff values.
forward(w_in, target)[source]

Compute the negative log-likelihood without calculating the full distribution.

Parameters:
  • w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
  • target (torch.LongTensor, required.) – the target word indices of the language model, of shape (word_num).
Returns:

loss – The NLL loss.

Return type:

torch.FloatTensor.

log_prob(w_in, device)[source]

Calculate log-probability for the whole dictionary.

Parameters:
  • w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
  • device (torch.device, required.) – the target device for calculation.
Returns:

prob – The full log-probability.

Return type:

torch.FloatTensor.

rand_ini()[source]

Random initialization.
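
Example: a minimal usage sketch. All sizes and the cutoff list are illustrative assumptions (the cutoff list is assumed to be increasing and to end at the vocabulary size), not values prescribed by this module.

    import torch

    from model_word_ada.adaptive import AdaptiveSoftmax

    # Assumed sizes: 300-dim inputs, a 20,000-word vocabulary split by the
    # cutoff list into a head and two tail clusters.
    soft_max = AdaptiveSoftmax(300, [2000, 10000, 20000])
    soft_max.rand_ini()

    w_in = torch.randn(64, 300)              # (word_num, input_dim)
    target = torch.randint(0, 20000, (64,))  # gold word indices, (word_num)
    loss = soft_max(w_in, target)            # NLL loss, no full distribution

    log_p = soft_max.log_prob(w_in, torch.device('cpu'))  # full log-probability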

model_word_ada.basic module

class model_word_ada.basic.BasicRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]

The multi-layer recurrent network for vanilla stacked RNNs.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of the unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
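
Example: a minimal usage sketch, assuming the unit is passed as a string key such as 'lstm' (matching BasicUnit below) and that forward consumes already-embedded inputs; all sizes are illustrative.

    import torch

    from model_word_ada.basic import BasicRNN

    # Assumed setup: a 2-layer LSTM stack, 100-dim embeddings, 300-dim
    # hidden states, and a dropout ratio of 0.5.
    rnn = BasicRNN(2, 'lstm', 100, 300, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # embedded input, (seq_len, batch_size, emb_dim)
    output = rnn(x)               # (seq_len, batch_size, hid_dim)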

class model_word_ada.basic.BasicUnit(unit, input_dim, hid_dim, droprate)[source]

The basic recurrent unit for the vanilla stacked RNNs.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of the unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

model_word_ada.dataset module

class model_word_ada.dataset.EvalDataset(dataset, sequence_length)[source]

Dataset for Language Modeling

Parameters:
  • dataset (list, required.) – The encoded dataset (output of the preprocessing scripts).
  • sequence_length (int, required.) – The sequence length.
construct_index()[source]

Construct the index for the dataset.

get_tqdm(device)[source]

Construct the dataset reader and the corresponding tqdm progress wrapper.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
reader(device)[source]

Construct the dataset reader.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
Returns:reader – A lazy iterable object.
Return type:iterator.
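
Example: a minimal evaluation-loop sketch. The encoded list is a hypothetical stand-in for real preprocessing output, and each step is assumed to yield an (input, target) pair.

    import torch

    from model_word_ada.dataset import EvalDataset

    encoded = list(range(10000))        # hypothetical encoded corpus
    dataset = EvalDataset(encoded, 35)  # sequence length 35

    device = torch.device('cpu')
    for w_in, target in dataset.get_tqdm(device):
        pass  # evaluate one (input, target) batch
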
class model_word_ada.dataset.LargeDataset(root, range_idx, batch_size, sequence_length)[source]

Lazy Dataset for Language Modeling

Parameters:
  • root (str, required.) – The root folder for dataset files.
  • range_idx (int, required.) – The maximum file index for the input files (train_*.pk).
  • batch_size (int, required.) – The batch size.
  • sequence_length (int, required.) – The sequence length.
get_tqdm(device)[source]

Construct the dataset reader and the corresponding tqdm progress wrapper.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
open_next()[source]

Open the next file.

reader(device)[source]

Construct the dataset reader.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
Returns:reader – A lazy iterable object.
Return type:iterator.
shuffle()[source]

Shuffle the dataset.
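
Example: a minimal training-loop sketch. The ./data/ layout (train_0.pk through train_9.pk), the batch size, and the assumption that each step yields an (input, target) pair are illustrative.

    import torch

    from model_word_ada.dataset import LargeDataset

    loader = LargeDataset('./data/', 9, 20, 35)  # batch size 20, seq. length 35
    loader.shuffle()

    device = torch.device('cpu')
    for w_in, target in loader.get_tqdm(device):
        pass  # train on one (input, target) batch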

model_word_ada.densenet module

class model_word_ada.densenet.BasicUnit(unit, input_dim, increase_rate, droprate)[source]

The basic recurrent unit for the densely connected RNNs.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • increase_rate (int, required.) – The hidden dimension of the unit (the number of dimensions the unit adds to its input).
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

class model_word_ada.densenet.DenseRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]

The multi-layer recurrent networks for the densely connected RNNs.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of each unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
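
Example: a minimal usage sketch. It assumes (per the densely connected design) that each layer's output is concatenated with its input, so the final output dimension is emb_dim + layer_num * hid_dim; all sizes and the 'lstm' key are illustrative.

    import torch

    from model_word_ada.densenet import DenseRNN

    # Assumed setup: 3 dense layers, 100-dim embeddings, and 300 hidden
    # dimensions contributed per layer.
    rnn = DenseRNN(3, 'lstm', 100, 300, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # (seq_len, batch_size, emb_dim)
    output = rnn(x)               # assumed last dim: 100 + 3 * 300 = 1000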

model_word_ada.ldnet module

class model_word_ada.ldnet.BasicUnit(unit, input_dim, increase_rate, droprate, layer_drop=0)[source]

The basic recurrent unit for the densely connected RNNs with layer-wise dropout.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • increase_rate (int, required.) – The hidden dimension of the unit (the number of dimensions the unit adds to its input).
  • droprate (float, required.) – The dropout ratio.
  • layer_drop (float, optional, default 0.) – The layer-wise dropout ratio.
forward(x, p_out)[source]

Calculate the output.

Parameters:
  • x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
  • p_out (torch.FloatTensor, required.) – the final output tensor for the softmax, of shape (seq_len, batch_size, input_dim).
Returns:

  • out (torch.FloatTensor.) – The undropped outputs of the RNNs, fed to the softmax.
  • p_out (torch.FloatTensor.) – The dropped outputs of the RNNs, fed to the next layer.

init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

class model_word_ada.ldnet.LDRNN(layer_num, unit, emb_dim, hid_dim, droprate, layer_drop)[source]

The multi-layer recurrent networks for the densely connected RNNs with layer-wise dropout.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of each unit.
  • droprate (float, required.) – The dropout ratio.
  • layer_drop (float, required.) – The layer-wise dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of the RNNs, fed to the softmax.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
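
Example: a minimal usage sketch, mirroring DenseRNN with an extra layer-wise dropout ratio; all sizes and the 'lstm' key are illustrative assumptions.

    import torch

    from model_word_ada.ldnet import LDRNN

    # Assumed setup: 3 dense layers, 0.5 dropout, 0.5 layer-wise dropout.
    rnn = LDRNN(3, 'lstm', 100, 300, 0.5, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # (seq_len, batch_size, emb_dim)
    output = rnn(x)               # fed to the softmax layer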

model_word_ada.LM module

class model_word_ada.LM.LM(rnn, soft_max, w_num, w_dim, droprate, label_dim=-1, add_relu=False)[source]

The language model.

Parameters:
  • rnn (torch.nn.Module, required.) – The RNN network.
  • soft_max (torch.nn.Module, required.) – The softmax layer.
  • w_num (int, required.) – The number of words.
  • w_dim (int, required.) – The dimension of the word embedding.
  • droprate (float, required.) – The dropout ratio.
  • label_dim (int, optional, default -1.) – The input dimension of the softmax.
  • add_relu (bool, optional, default False.) – Whether to add a ReLU activation before the softmax.
forward(w_in, target)[source]

Calculate the loss.

Parameters:
  • w_in (torch.LongTensor, required.) – the input word-index tensor, of shape (seq_len, batch_size).
  • target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).
Returns:

loss – The NLL loss.

Return type:

torch.FloatTensor.

init_hidden()[source]

Initialize hidden states.

load_embed(origin_lm)[source]

Load embedding from another language model.

log_prob(w_in)[source]

Calculate log-probability for the whole dictionary.

Parameters:w_in (torch.LongTensor, required.) – the input word-index tensor, of shape (seq_len, batch_size).
Returns:prob – The full log-probability.
Return type:torch.FloatTensor.
rand_ini()[source]

Random initialization.
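
Example: a minimal end-to-end sketch wiring BasicRNN and AdaptiveSoftmax into LM. The sizes, cutoff list, and the assumption that w_in holds word indices of shape (seq_len, batch_size) are illustrative, not prescribed by this module.

    import torch

    from model_word_ada.LM import LM
    from model_word_ada.basic import BasicRNN
    from model_word_ada.adaptive import AdaptiveSoftmax

    # Assumed sizes: 20,000-word vocabulary, 100-dim embeddings, 300-dim
    # hidden states; label_dim is left at its default.
    rnn = BasicRNN(2, 'lstm', 100, 300, 0.5)
    soft_max = AdaptiveSoftmax(300, [2000, 10000, 20000])
    lm = LM(rnn, soft_max, 20000, 100, 0.5)
    lm.rand_ini()
    lm.init_hidden()

    w_in = torch.randint(0, 20000, (35, 20))      # indices, (seq_len, batch_size)
    target = torch.randint(0, 20000, (35 * 20,))  # flattened gold indices
    loss = lm(w_in, target)                       # NLL loss
    loss.backward()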

model_word_ada.utils module

model_word_ada.utils.adjust_learning_rate(optimizer, lr)[source]

Adjust the learning rate to the new value.

Parameters:
  • optimizer (torch.optim.Optimizer, required.) – The PyTorch optimizer.
  • lr (float, required.) – The target learning rate.
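
Example: a minimal sketch; the stand-in model, SGD optimizer, and rates are illustrative.

    import torch

    from model_word_ada.utils import adjust_learning_rate

    model = torch.nn.Linear(10, 10)  # hypothetical stand-in module
    optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
    adjust_learning_rate(optimizer, 0.5)  # set each param group's lr to 0.5
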
model_word_ada.utils.init_embedding(input_embedding)[source]

Randomly initialize the embedding layer.

model_word_ada.utils.init_linear(input_linear)[source]

Randomly initialize the linear projection.

model_word_ada.utils.init_lstm(input_lstm)[source]

Randomly initialize the LSTM parameters.

model_word_ada.utils.repackage_hidden(h)[source]

Wrap hidden states in new tensors to detach them from their history.

Parameters:h (torch.Tensor or tuple of torch.Tensor, required.) – the hidden states.
Returns:hidden – The detached hidden states.
Return type:torch.Tensor or tuple of torch.Tensor.
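
Example: detaching LSTM-style (h, c) states between truncated-BPTT segments; the shapes are illustrative.

    import torch

    from model_word_ada.utils import repackage_hidden

    h = (torch.zeros(2, 20, 300, requires_grad=True),
         torch.zeros(2, 20, 300, requires_grad=True))  # e.g. an LSTM's (h, c)
    h = repackage_hidden(h)  # same values, detached from the autograd graph
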
model_word_ada.utils.to_scalar(var)[source]

Convert a tensor to a scalar number.
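
Example: a one-element tensor is assumed.

    import torch

    from model_word_ada.utils import to_scalar

    n = to_scalar(torch.tensor([3]))  # assumed to yield the Python int 3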