Language Modeling

model_word_ada.adaptive module

class model_word_ada.adaptive.AdaptiveSoftmax(input_size, cutoff)[source]

The adaptive softmax layer. Modified from: https://github.com/rosinality/adaptive-softmax-pytorch/blob/master/adasoft.py

Parameters:
  • input_size (int, required.) – The input dimension.
  • cutoff (list, required.) – The list of cutoff values.
forward(w_in, target)[source]

Compute the negative log-likelihood without calculating the full distribution.

Parameters:
  • w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
  • target (torch.LongTensor, required.) – the target word indices of the language model, of shape (word_num).
Returns:

loss – The NLL loss.

Return type:

torch.FloatTensor.

log_prob(w_in, device)[source]

Calculate log-probability for the whole dictionary.

Parameters:
  • w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
  • device (torch.device, required.) – the target device for calculation.
Returns:

prob – The full log-probability.

Return type:

torch.FloatTensor.

rand_ini()[source]

Random initialization.
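
Example: a minimal usage sketch. All sizes and the cutoff list are illustrative assumptions (the cutoff list is assumed to be increasing and to end at the vocabulary size), not values prescribed by this module.

    import torch

    from model_word_ada.adaptive import AdaptiveSoftmax

    # Assumed sizes: 300-dim inputs, a 20,000-word vocabulary split by the
    # cutoff list into a head and two tail clusters.
    soft_max = AdaptiveSoftmax(300, [2000, 10000, 20000])
    soft_max.rand_ini()

    w_in = torch.randn(64, 300)              # (word_num, input_dim)
    target = torch.randint(0, 20000, (64,))  # gold word indices, (word_num)
    loss = soft_max(w_in, target)            # NLL loss, no full distribution

    log_p = soft_max.log_prob(w_in, torch.device('cpu'))  # full log-probability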

model_word_ada.basic module

class model_word_ada.basic.BasicRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]

The multi-layer recurrent network for vanilla stacked RNNs.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of the unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
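
Example: a minimal usage sketch, assuming the unit is passed as a string key such as 'lstm' (matching BasicUnit below) and that forward consumes already-embedded inputs; all sizes are illustrative.

    import torch

    from model_word_ada.basic import BasicRNN

    # Assumed setup: a 2-layer LSTM stack, 100-dim embeddings, 300-dim
    # hidden states, and a dropout ratio of 0.5.
    rnn = BasicRNN(2, 'lstm', 100, 300, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # embedded input, (seq_len, batch_size, emb_dim)
    output = rnn(x)               # (seq_len, batch_size, hid_dim)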

class model_word_ada.basic.BasicUnit(unit, input_dim, hid_dim, droprate)[source]

The basic recurrent unit for the vanilla stacked RNNs.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of the unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

model_word_ada.dataset module

class model_word_ada.dataset.EvalDataset(dataset, sequence_length)[source]

Dataset for Language Modeling

Parameters:
  • dataset (list, required.) – The encoded dataset (output of the preprocessing scripts).
  • sequence_length (int, required.) – The sequence length.
construct_index()[source]

Construct the index for the dataset.

get_tqdm(device)[source]

Construct the dataset reader and the corresponding tqdm progress wrapper.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
reader(device)[source]

Construct the dataset reader.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
Returns:reader – A lazy iterable object.
Return type:iterator.
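
Example: a minimal evaluation-loop sketch. The encoded list is a hypothetical stand-in for real preprocessing output, and each step is assumed to yield an (input, target) pair.

    import torch

    from model_word_ada.dataset import EvalDataset

    encoded = list(range(10000))        # hypothetical encoded corpus
    dataset = EvalDataset(encoded, 35)  # sequence length 35

    device = torch.device('cpu')
    for w_in, target in dataset.get_tqdm(device):
        pass  # evaluate one (input, target) batch
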
class model_word_ada.dataset.LargeDataset(root, range_idx, batch_size, sequence_length)[source]

Lazy Dataset for Language Modeling

Parameters:
  • root (str, required.) – The root folder for dataset files.
  • range_idx (int, required.) – The maximum file index for the input files (train_*.pk).
  • batch_size (int, required.) – The batch size.
  • sequence_length (int, required.) – The sequence length.
get_tqdm(device)[source]

Construct the dataset reader and the corresponding tqdm progress wrapper.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
open_next()[source]

Open the next file.

reader(device)[source]

Construct the dataset reader.

Parameters:device (torch.device, required.) – the target device for the dataset loader.
Returns:reader – A lazy iterable object.
Return type:iterator.
shuffle()[source]

Shuffle the dataset.
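
Example: a minimal training-loop sketch. The ./data/ layout (train_0.pk through train_9.pk), the batch size, and the assumption that each step yields an (input, target) pair are illustrative.

    import torch

    from model_word_ada.dataset import LargeDataset

    loader = LargeDataset('./data/', 9, 20, 35)  # batch size 20, seq. length 35
    loader.shuffle()

    device = torch.device('cpu')
    for w_in, target in loader.get_tqdm(device):
        pass  # train on one (input, target) batch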

model_word_ada.densenet module

class model_word_ada.densenet.BasicUnit(unit, input_dim, increase_rate, droprate)[source]

The basic recurrent unit for the densely connected RNNs.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • increase_rate (int, required.) – The hidden dimension of the unit (the number of dimensions the unit adds to its input).
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

class model_word_ada.densenet.DenseRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]

The multi-layer recurrent networks for the densely connected RNNs.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of each unit.
  • droprate (float, required.) – The dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of RNNs.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
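
Example: a minimal usage sketch. It assumes (per the densely connected design) that each layer's output is concatenated with its input, so the final output dimension is emb_dim + layer_num * hid_dim; all sizes and the 'lstm' key are illustrative.

    import torch

    from model_word_ada.densenet import DenseRNN

    # Assumed setup: 3 dense layers, 100-dim embeddings, and 300 hidden
    # dimensions contributed per layer.
    rnn = DenseRNN(3, 'lstm', 100, 300, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # (seq_len, batch_size, emb_dim)
    output = rnn(x)               # assumed last dim: 100 + 3 * 300 = 1000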

model_word_ada.ldnet module

class model_word_ada.ldnet.BasicUnit(unit, input_dim, increase_rate, droprate, layer_drop=0)[source]

The basic recurrent unit for the densely connected RNNs with layer-wise dropout.

Parameters:
  • unit (str, required.) – The type of RNN unit.
  • input_dim (int, required.) – The input dimension of the unit.
  • increase_rate (int, required.) – The hidden dimension of the unit (the number of dimensions the unit adds to its input).
  • droprate (float, required.) – The dropout ratio.
  • layer_drop (float, optional, default 0.) – The layer-wise dropout ratio.
forward(x, p_out)[source]

Calculate the output.

Parameters:
  • x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
  • p_out (torch.FloatTensor, required.) – the final output tensor for the softmax, of shape (seq_len, batch_size, input_dim).
Returns:

  • out (torch.FloatTensor.) – The undropped outputs of the RNNs, fed to the softmax.
  • p_out (torch.FloatTensor.) – The dropped outputs of the RNNs, fed to the next layer.

init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

class model_word_ada.ldnet.LDRNN(layer_num, unit, emb_dim, hid_dim, droprate, layer_drop)[source]

The multi-layer recurrent networks for the densely connected RNNs with layer-wise dropout.

Parameters:
  • layer_num (int, required.) – The number of layers.
  • unit (str, required.) – The type of RNN unit.
  • emb_dim (int, required.) – The input (embedding) dimension of the unit.
  • hid_dim (int, required.) – The hidden dimension of each unit.
  • droprate (float, required.) – The dropout ratio.
  • layer_drop (float, required.) – The layer-wise dropout ratio.
forward(x)[source]

Calculate the output.

Parameters:x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, emb_dim).
Returns:output – The output of the RNNs, fed to the softmax.
Return type:torch.FloatTensor.
init_hidden()[source]

Initialize hidden states.

rand_ini()[source]

Random initialization.

to_params()[source]

Export the module configuration as a parameter dictionary.
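
Example: a minimal usage sketch, mirroring DenseRNN with an extra layer-wise dropout ratio; all sizes and the 'lstm' key are illustrative assumptions.

    import torch

    from model_word_ada.ldnet import LDRNN

    # Assumed setup: 3 dense layers, 0.5 dropout, 0.5 layer-wise dropout.
    rnn = LDRNN(3, 'lstm', 100, 300, 0.5, 0.5)
    rnn.rand_ini()
    rnn.init_hidden()

    x = torch.randn(35, 20, 100)  # (seq_len, batch_size, emb_dim)
    output = rnn(x)               # fed to the softmax layer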

model_word_ada.LM module

class model_word_ada.LM.LM(rnn, soft_max, w_num, w_dim, droprate, label_dim=-1, add_relu=False)[source]

The language model.

Parameters:
  • rnn (torch.nn.Module, required.) – The RNN network.
  • soft_max (torch.nn.Module, required.) – The softmax layer.
  • w_num (int, required.) – The number of words.
  • w_dim (int, required.) – The dimension of the word embedding.
  • droprate (float, required.) – The dropout ratio.
  • label_dim (int, optional, default -1.) – The input dimension of the softmax.
  • add_relu (bool, optional, default False.) – Whether to add a ReLU activation before the softmax.
forward(w_in, target)[source]

Calculate the loss.

Parameters:
  • w_in (torch.LongTensor, required.) – the input word-index tensor, of shape (seq_len, batch_size).
  • target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).
Returns:

loss – The NLL loss.

Return type:

torch.FloatTensor.

init_hidden()[source]

Initialize hidden states.

load_embed(origin_lm)[source]

Load embedding from another language model.

log_prob(w_in)[source]

Calculate log-probability for the whole dictionary.

Parameters:w_in (torch.LongTensor, required.) – the input word-index tensor, of shape (seq_len, batch_size).
Returns:prob – The full log-probability.
Return type:torch.FloatTensor.
rand_ini()[source]

Random initialization.
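
Example: a minimal end-to-end sketch wiring BasicRNN and AdaptiveSoftmax into LM. The sizes, cutoff list, and the assumption that w_in holds word indices of shape (seq_len, batch_size) are illustrative, not prescribed by this module.

    import torch

    from model_word_ada.LM import LM
    from model_word_ada.basic import BasicRNN
    from model_word_ada.adaptive import AdaptiveSoftmax

    # Assumed sizes: 20,000-word vocabulary, 100-dim embeddings, 300-dim
    # hidden states; label_dim is left at its default.
    rnn = BasicRNN(2, 'lstm', 100, 300, 0.5)
    soft_max = AdaptiveSoftmax(300, [2000, 10000, 20000])
    lm = LM(rnn, soft_max, 20000, 100, 0.5)
    lm.rand_ini()
    lm.init_hidden()

    w_in = torch.randint(0, 20000, (35, 20))      # indices, (seq_len, batch_size)
    target = torch.randint(0, 20000, (35 * 20,))  # flattened gold indices
    loss = lm(w_in, target)                       # NLL loss
    loss.backward()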

model_word_ada.utils module

model_word_ada.utils.adjust_learning_rate(optimizer, lr)[source]

Adjust the learning rate to the new value.

Parameters:
  • optimizer (torch.optim.Optimizer, required.) – The PyTorch optimizer.
  • lr (float, required.) – The target learning rate.
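
Example: a minimal sketch; the stand-in model, SGD optimizer, and rates are illustrative.

    import torch

    from model_word_ada.utils import adjust_learning_rate

    model = torch.nn.Linear(10, 10)  # hypothetical stand-in module
    optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
    adjust_learning_rate(optimizer, 0.5)  # set each param group's lr to 0.5
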
model_word_ada.utils.init_embedding(input_embedding)[source]

Randomly initialize the embedding layer.

model_word_ada.utils.init_linear(input_linear)[source]

Randomly initialize the linear projection.

model_word_ada.utils.init_lstm(input_lstm)[source]

Randomly initialize the LSTM parameters.

model_word_ada.utils.repackage_hidden(h)[source]

Wrap hidden states in new tensors to detach them from their history.

Parameters:h (torch.Tensor or tuple of torch.Tensor, required.) – the hidden states.
Returns:hidden – The detached hidden states.
Return type:torch.Tensor or tuple of torch.Tensor.
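
Example: detaching LSTM-style (h, c) states between truncated-BPTT segments; the shapes are illustrative.

    import torch

    from model_word_ada.utils import repackage_hidden

    h = (torch.zeros(2, 20, 300, requires_grad=True),
         torch.zeros(2, 20, 300, requires_grad=True))  # e.g. an LSTM's (h, c)
    h = repackage_hidden(h)  # same values, detached from the autograd graph
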
model_word_ada.utils.to_scalar(var)[source]

Convert a tensor to a scalar number.
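
Example: a one-element tensor is assumed.

    import torch

    from model_word_ada.utils import to_scalar

    n = to_scalar(torch.tensor([3]))  # assumed to yield the Python int 3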