LD-Net documentation
Check out our new NER toolkits 🚀🚀🚀
- Inference:
  - LightNER: efficient inference with models pre-trained / trained with any of the following tools.
- Training:
  - LD-Net: train NER models with efficient contextualized representations.
  - VanillaNER: train vanilla NER models with pre-trained embeddings.
- Distant Training:
  - AutoNER: train NER models without line-by-line annotations and get competitive performance.
This project provides a high-performance word-level language model and sequence labeling with contextualized representations. The key feature of this project is its support for language model pruning without retraining.
Details about LD-Net can be accessed at: https://arxiv.org/abs/1804.07827.
Language Modeling
model_word_ada.adaptive module

class model_word_ada.adaptive.AdaptiveSoftmax(input_size, cutoff)[source]
    The adaptive softmax layer. Modified from: https://github.com/rosinality/adaptive-softmax-pytorch/blob/master/adasoft.py

    Parameters:
    - input_size (int, required.) – The input dimension.
    - cutoff (list, required.) – The list of cutoff values.

    forward(w_in, target)[source]
        Calculate the log-likelihood without computing the full distribution.

        Parameters:
        - w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
        - target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).

        Returns: loss (torch.FloatTensor) – The NLL loss.

    log_prob(w_in, device)[source]
        Calculate the log-probability for the whole dictionary.

        Parameters:
        - w_in (torch.FloatTensor, required.) – the input tensor, of shape (word_num, input_dim).
        - device (torch.device, required.) – the target device for the calculation.

        Returns: prob (torch.FloatTensor) – The full log-probability.
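A minimal usage sketch, assuming hypothetical sizes: 256-dimensional inputs and a 100k-word vocabulary, with the cutoff list splitting the vocabulary into a frequent head and two rare-word clusters. The shapes follow the signatures documented above.

```python
import torch
from model_word_ada.adaptive import AdaptiveSoftmax

softmax = AdaptiveSoftmax(input_size=256, cutoff=[4000, 20000, 100000])

w_in = torch.randn(32, 256)               # (word_num, input_dim) hidden states
target = torch.randint(0, 100000, (32,))  # (word_num,) gold word indices

loss = softmax(w_in, target)              # NLL loss; skips the full distribution
log_p = softmax.log_prob(w_in, torch.device('cpu'))  # full log-probabilities
```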
model_word_ada.basic module

class model_word_ada.basic.BasicRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]
    The multi-layer recurrent network for the vanilla stacked RNNs.

    Parameters:
    - layer_num (int, required.) – The number of layers.
    - unit (str, required.) – The type of rnn unit.
    - emb_dim (int, required.) – The input (embedding) dimension of the first unit.
    - hid_dim (int, required.) – The hidden dimension of the units.
    - droprate (float, required.) – The dropout ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The output of the RNNs.

    init_hidden()[source]
        Initialize hidden states.

class model_word_ada.basic.BasicUnit(unit, input_dim, hid_dim, droprate)[source]
    The basic recurrent unit for the vanilla stacked RNNs.

    Parameters:
    - unit (str, required.) – The type of rnn unit.
    - input_dim (int, required.) – The input dimension of the unit.
    - hid_dim (int, required.) – The hidden dimension of the unit.
    - droprate (float, required.) – The dropout ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The output of the RNN.

    init_hidden()[source]
        Initialize hidden states.
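A sketch of building and running the stacked RNN, under assumed sizes; 'lstm' is one plausible value for the unit string (the accepted strings are not listed here), and the input is a tensor of embeddings rather than word ids:

```python
import torch
from model_word_ada.basic import BasicRNN

rnn = BasicRNN(layer_num=2, unit='lstm', emb_dim=100, hid_dim=300, droprate=0.5)
rnn.init_hidden()

x = torch.randn(35, 20, 100)  # (seq_len, batch_size, emb_dim) embeddings
output = rnn(x)               # (seq_len, batch_size, hid_dim)
```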
model_word_ada.dataset module

class model_word_ada.dataset.EvalDataset(dataset, sequence_length)[source]
    Dataset for language modeling.

    Parameters:
    - dataset (list, required.) – The encoded dataset (output of the preprocessing scripts).
    - sequence_length (int, required.) – Sequence length.

class model_word_ada.dataset.LargeDataset(root, range_idx, batch_size, sequence_length)[source]
    Lazy dataset for language modeling.

    Parameters:
    - root (str, required.) – The root folder for the dataset files.
    - range_idx (int, required.) – The maximum file index for the input files (train_*.pk).
    - batch_size (int, required.) – Batch size.
    - sequence_length (int, required.) – Sequence length.

    get_tqdm(device)[source]
        Construct the dataset reader and the corresponding tqdm (progress bar).

        Parameters: device (torch.device, required.) – the target device for the dataset loader.
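A sketch of the consumption loop, assuming the preprocessing scripts have produced ./data/train_0.pk through ./data/train_9.pk; that each batch unpacks into an input and a target tensor is an assumption about the reader's output:

```python
import torch
from model_word_ada.dataset import LargeDataset

loader = LargeDataset(root='./data', range_idx=10, batch_size=64, sequence_length=35)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

for w_in, target in loader.get_tqdm(device):  # assumed (input, target) batches
    pass  # feed the batch to the language model
```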
model_word_ada.densenet module

class model_word_ada.densenet.BasicUnit(unit, input_dim, increase_rate, droprate)[source]
    The basic recurrent unit for the densely connected RNNs.

    Parameters:
    - unit (str, required.) – The type of rnn unit.
    - input_dim (int, required.) – The input dimension of the unit.
    - increase_rate (int, required.) – The hidden dimension of the unit (each unit widens the features by this amount).
    - droprate (float, required.) – The dropout ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The output of the RNN.

    init_hidden()[source]
        Initialize hidden states.

class model_word_ada.densenet.DenseRNN(layer_num, unit, emb_dim, hid_dim, droprate)[source]
    The multi-layer recurrent network for the densely connected RNNs.

    Parameters:
    - layer_num (int, required.) – The number of layers.
    - unit (str, required.) – The type of rnn unit.
    - emb_dim (int, required.) – The input (embedding) dimension.
    - hid_dim (int, required.) – The hidden dimension of the units.
    - droprate (float, required.) – The dropout ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The output of the RNNs.

    init_hidden()[source]
        Initialize hidden states.
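In a densely connected stack, each unit reads the concatenation of the original embeddings and every earlier unit's output, so the feature width grows by increase_rate per layer. A toy sketch of that wiring in plain PyTorch (an illustration of the idea, not the library's implementation):

```python
import torch
import torch.nn as nn

emb_dim, increase_rate, layer_num = 100, 50, 3
layers = nn.ModuleList(
    [nn.LSTM(emb_dim + i * increase_rate, increase_rate) for i in range(layer_num)]
)

x = torch.randn(35, 20, emb_dim)  # (seq_len, batch_size, emb_dim)
out = x
for layer in layers:
    new_out, _ = layer(out)
    out = torch.cat([out, new_out], dim=2)  # dense connection: widen by increase_rate
# out: (35, 20, emb_dim + layer_num * increase_rate)
```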
model_word_ada.ldnet module

class model_word_ada.ldnet.BasicUnit(unit, input_dim, increase_rate, droprate, layer_drop=0)[source]
    The basic recurrent unit for the densely connected RNNs with layer-wise dropout.

    Parameters:
    - unit (str, required.) – The type of rnn unit.
    - input_dim (int, required.) – The input dimension of the unit.
    - increase_rate (int, required.) – The hidden dimension of the unit.
    - droprate (float, required.) – The dropout ratio.
    - layer_drop (float, optional, (default=0).) – The layer-wise dropout ratio.

    forward(x, p_out)[source]
        Calculate the output.

        Parameters:
        - x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        - p_out (torch.FloatTensor, required.) – the final output tensor for the softmax, of shape (seq_len, batch_size, input_dim).

        Returns:
        - out (torch.FloatTensor) – The undropped outputs of the RNN, for the softmax.
        - p_out (torch.FloatTensor) – The dropped outputs of the RNN, for the next layer.

    init_hidden()[source]
        Initialize hidden states.

class model_word_ada.ldnet.LDRNN(layer_num, unit, emb_dim, hid_dim, droprate, layer_drop)[source]
    The multi-layer recurrent network for the densely connected RNNs with layer-wise dropout.

    Parameters:
    - layer_num (int, required.) – The number of layers.
    - unit (str, required.) – The type of rnn unit.
    - emb_dim (int, required.) – The input (embedding) dimension.
    - hid_dim (int, required.) – The hidden dimension of the units.
    - droprate (float, required.) – The dropout ratio.
    - layer_drop (float, required.) – The layer-wise dropout ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The output of the RNNs, for the softmax.

    init_hidden()[source]
        Initialize hidden states.
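Layer-wise dropout is what makes pruning without retraining possible: layers that are frequently skipped during training can later be removed outright. A toy sketch of the skipping mechanism (an illustration only, not the library code):

```python
import torch

def layerwise_drop(layers, x, layer_drop=0.5, training=True):
    """Apply a stack of callables, randomly skipping whole layers while training."""
    out = x
    for layer in layers:
        if training and torch.rand(()).item() < layer_drop:
            continue  # this layer is dropped for the whole batch
        out = layer(out)
    return out
```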
model_word_ada.LM module

class model_word_ada.LM.LM(rnn, soft_max, w_num, w_dim, droprate, label_dim=-1, add_relu=False)[source]
    The language model.

    Parameters:
    - rnn (torch.nn.Module, required.) – The RNN network.
    - soft_max (torch.nn.Module, required.) – The softmax layer.
    - w_num (int, required.) – The number of words.
    - w_dim (int, required.) – The dimension of the word embeddings.
    - droprate (float, required.) – The dropout ratio.
    - label_dim (int, optional, (default=-1).) – The input dimension of the softmax.

    forward(w_in, target)[source]
        Calculate the loss.

        Parameters:
        - w_in (torch.LongTensor, required.) – the input word tensor, of shape (seq_len, batch_size).
        - target (torch.LongTensor, required.) – the target of the language model, of shape (word_num).

        Returns: loss (torch.FloatTensor) – The NLL loss.

    init_hidden()[source]
        Initialize hidden states.
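A sketch of composing the classes above into one model, with hypothetical sizes and cutoffs; LM owns the word embedding, the stacked RNN produces contextual states, and the adaptive softmax turns them into an NLL loss:

```python
import torch
from model_word_ada.basic import BasicRNN
from model_word_ada.adaptive import AdaptiveSoftmax
from model_word_ada.LM import LM

w_num, w_dim, hid_dim = 100000, 100, 300
rnn = BasicRNN(layer_num=2, unit='lstm', emb_dim=w_dim, hid_dim=hid_dim, droprate=0.5)
soft_max = AdaptiveSoftmax(hid_dim, cutoff=[4000, 20000, w_num])

lm = LM(rnn, soft_max, w_num, w_dim, droprate=0.5)
lm.init_hidden()

w_in = torch.randint(0, w_num, (35, 20))      # (seq_len, batch_size) word ids
target = torch.randint(0, w_num, (35 * 20,))  # flattened gold next words (assumed)
loss = lm(w_in, target)
```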
model_word_ada.utils module

model_word_ada.utils.adjust_learning_rate(optimizer, lr)[source]
    Adjust the learning rate to the new value.

    Parameters:
    - optimizer (required.) – the pytorch optimizer.
    - lr (float, required.) – the target learning rate.

model_word_ada.utils.repackage_hidden(h)[source]
    Wraps hidden states in new Variables, to detach them from their history.

    Parameters: h (Tuple or Tensors, required.) – tuple or tensors, the hidden states.
    Returns: hidden (Tuple or Tensors) – the detached hidden states.
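Truncated back-propagation through time carries the hidden state across batches but must cut the autograd graph between them, which is what the detaching helper above is for. A generic sketch of the pattern (the LSTM and sizes here are stand-ins, not LD-Net API):

```python
import torch
import torch.nn as nn
from model_word_ada.utils import repackage_hidden

lstm = nn.LSTM(100, 300)
hidden = (torch.zeros(1, 20, 300), torch.zeros(1, 20, 300))

for step in range(5):
    x = torch.randn(35, 20, 100)
    hidden = repackage_hidden(hidden)  # detach from the previous step's graph
    out, hidden = lstm(x, hidden)
    out.sum().backward()               # gradients stop at the detached state
```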
Sequence Labeling
model_seq.crf module

class model_seq.crf.CRF(hidden_dim: int, tagset_size: int, if_bias: bool = True)[source]
    Conditional Random Field module.

    Parameters:
    - hidden_dim (int, required.) – the dimension of the input features.
    - tagset_size (int, required.) – the size of the target label set.
    - if_bias (bool, optional, (default=True).) – whether the linear transformation has a bias term.

    forward(feats)[source]
        Calculate the potential scores for the conditional random field.

        Parameters: feats (torch.FloatTensor, required.) – the input features for the conditional random field, of shape (*, hidden_dim).
        Returns: output (torch.FloatTensor) – a float tensor of shape (ins_num, from_tag_size, to_tag_size).

class model_seq.crf.CRFDecode(y_map: dict)[source]
    The decoder for the Conditional Random Field module.

    Parameters: y_map (dict, required.) – a dict mapping tag strings to tag indices.

    decode(scores, mask)[source]
        Find the best path from the potential scores by the Viterbi decoding algorithm.

        Parameters:
        - scores (torch.FloatTensor, required.) – the potential scores for the conditional random field, of shape (seq_len, batch_size, from_tag_size, to_tag_size).
        - mask (torch.ByteTensor, required.) – the mask for the unpadded sentence parts, of shape (seq_len, batch_size).

        Returns: output (torch.LongTensor) – a LongTensor of shape (seq_len - 1, batch_size).

class model_seq.crf.CRFLoss(y_map: dict, average_batch: bool = True)[source]
    The negative log-likelihood loss for the Conditional Random Field module.

    Parameters:
    - y_map (dict, required.) – a dict mapping tag strings to tag indices.
    - average_batch (bool, optional, (default=True).) – whether the returned score is averaged over the batch.

    forward(scores, target, mask)[source]
        Calculate the negative log-likelihood for the conditional random field.

        Parameters:
        - scores (torch.FloatTensor, required.) – the potential scores for the conditional random field, of shape (seq_len, batch_size, from_tag_size, to_tag_size).
        - target (torch.LongTensor, required.) – the gold path for the conditional random field, of shape (seq_len, batch_size).
        - mask (torch.ByteTensor, required.) – the mask for the unpadded sentence parts, of shape (seq_len, batch_size).

        Returns: loss (torch.FloatTensor) – The NLL loss.
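A sketch of how the three classes might fit together, under a hypothetical 5-tag map; the reshape assumes ins_num = seq_len * batch_size, and the random tensors are stand-ins for real features and gold paths:

```python
import torch
from model_seq.crf import CRF, CRFLoss, CRFDecode

y_map = {'O': 0, 'B-PER': 1, 'I-PER': 2, '<s>': 3, '<eof>': 4}  # hypothetical
seq_len, batch_size, hidden_dim, y_size = 10, 4, 200, len(y_map)

crf = CRF(hidden_dim, y_size)
crf_loss = CRFLoss(y_map)
decoder = CRFDecode(y_map)

feats = torch.randn(seq_len, batch_size, hidden_dim)
scores = crf(feats).view(seq_len, batch_size, y_size, y_size)

target = torch.randint(0, y_size, (seq_len, batch_size))  # stand-in gold path
mask = torch.ones(seq_len, batch_size, dtype=torch.uint8)

loss = crf_loss(scores, target, mask)      # training objective
best_paths = decoder.decode(scores, mask)  # Viterbi paths, (seq_len - 1, batch_size)
```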
model_seq.dataset module

class model_seq.dataset.SeqDataset(dataset: list, flm_pad: int, blm_pad: int, w_pad: int, c_con: int, c_pad: int, y_start: int, y_pad: int, y_size: int, batch_size: int)[source]
    Dataset for sequence labeling.

    Parameters:
    - dataset (list, required.) – The encoded dataset (output of the preprocessing scripts).
    - flm_pad (int, required.) – The pad index for the forward language model.
    - blm_pad (int, required.) – The pad index for the backward language model.
    - w_pad (int, required.) – The pad index for the word-level inputs.
    - c_con (int, required.) – The index of the connecting character token for the character-level inputs.
    - c_pad (int, required.) – The pad index for the character-level inputs.
    - y_start (int, required.) – The index of the start label token.
    - y_pad (int, required.) – The index of the pad label token.
    - y_size (int, required.) – The size of the tag set.
    - batch_size (int, required.) – Batch size.

    batchify(batch, device)[source]
        Batchify a batch of data and move it to a device.

        Parameters:
        - batch (list, required.) – a sample from the encoded dataset (output of the preprocessing scripts).
        - device (torch.device, required.) – the target device for the dataset loader.

    construct_index(dataset)[source]
        Construct an index for the dataset.

        Parameters: dataset (list, required.) – the encoded dataset (output of the preprocessing scripts).

    get_tqdm(device)[source]
        Construct the dataset reader and the corresponding tqdm (progress bar).

        Parameters: device (torch.device, required.) – the target device for the dataset loader.
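Training code typically iterates over get_tqdm, as with the language-model datasets above; since the exact batch layout is produced by batchify, the unpacking below is an assumption:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

for batch in dataset.get_tqdm(device):   # dataset: a SeqDataset instance
    *model_inputs, target, mask = batch  # assumed split into inputs, tags, mask
    pass  # feed the batch to the sequence labeling model
```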
model_seq.elmo module

class model_seq.elmo.EBUnit(ori_unit, droprate, fix_rate)[source]
    The basic recurrent unit for the ELMo RNN wrapper.

    Parameters:
    - ori_unit (torch.nn.Module, required.) – the original rnn unit module.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

class model_seq.elmo.ERNN(ori_drnn, droprate, fix_rate)[source]
    The multi-layer recurrent networks for the ELMo RNN wrapper.

    Parameters:
    - ori_drnn (torch.nn.Module, required.) – the original rnn networks module.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

class model_seq.elmo.ElmoLM(ori_lm, backward, droprate, fix_rate)[source]
    The language model for the ELMo RNN wrapper.

    Parameters:
    - ori_lm (torch.nn.Module, required.) – the original language model module.
    - backward (bool, required.) – whether the language model is a backward one.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

    forward(w_in, ind=None)[source]
        Calculate the output.

        Parameters:
        - w_in (torch.LongTensor, required.) – the input tensor, of shape (seq_len, batch_size).
        - ind (torch.LongTensor, optional, (default=None).) – the index tensor for the backward language model, of shape (seq_len, batch_size).

        Returns: output (torch.FloatTensor) – The ELMo outputs.

    init_hidden()[source]
        Initialize hidden states.
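A sketch of wrapping a pre-trained language model into an ELMo-style feature extractor; the checkpoint path and the loading call are assumptions, with ori_lm standing for a model such as the LM class documented earlier:

```python
import torch
from model_seq.elmo import ElmoLM

ori_lm = torch.load('./checkpoint/forward_lm.th')  # hypothetical checkpoint
elmo = ElmoLM(ori_lm, backward=False, droprate=0.5, fix_rate=False)
elmo.init_hidden()

w_in = torch.randint(0, 100, (10, 4))  # (seq_len, batch_size) word ids
features = elmo(w_in)                  # contextualized representations
```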
model_seq.evaluator module

class model_seq.evaluator.eval_batch(decoder)[source]
    Base class for evaluation; provides methods to calculate the F1 score and the accuracy.

    Parameters: decoder (torch.nn.Module, required.) – the decoder module, which needs to provide the to_span() method.

    calc_acc_batch(decoded_data, target_data)[source]
        Update the statistics for the accuracy score.

        Parameters:
        - decoded_data (torch.LongTensor, required.) – the decoded best label index paths.
        - target_data (torch.LongTensor, required.) – the gold label index paths.

    calc_f1_batch(decoded_data, target_data)[source]
        Update the statistics for the F1 score.

        Parameters:
        - decoded_data (torch.LongTensor, required.) – the decoded best label index paths.
        - target_data (torch.LongTensor, required.) – the gold label index paths.
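A sketch of how the updaters are typically driven during evaluation; apart from the documented methods, every name here (eval_loader, scores, target, mask) is a placeholder:

```python
evaluator = eval_batch(decoder)  # decoder must provide to_span()
for scores, target, mask in eval_loader:
    decoded = decoder.decode(scores, mask)
    evaluator.calc_f1_batch(decoded, target)
```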
model_seq.seqlabel module

class model_seq.seqlabel.SeqLabel(f_lm, b_lm, c_num: int, c_dim: int, c_hidden: int, c_layer: int, w_num: int, w_dim: int, w_hidden: int, w_layer: int, y_num: int, droprate: float, unit: str = 'lstm')[source]
    Sequence labeling model augmented with language models.

    Parameters:
    - f_lm (torch.nn.Module, required.) – The forward language model for contextualized representations.
    - b_lm (torch.nn.Module, required.) – The backward language model for contextualized representations.
    - c_num (int, required.) – The number of characters.
    - c_dim (int, required.) – The dimension of the character embeddings.
    - c_hidden (int, required.) – The dimension of the character-level hidden states.
    - c_layer (int, required.) – The number of character-level lstm layers.
    - w_num (int, required.) – The number of words.
    - w_dim (int, required.) – The dimension of the word embeddings.
    - w_hidden (int, required.) – The dimension of the word-level hidden states.
    - w_layer (int, required.) – The number of word-level lstm layers.
    - y_num (int, required.) – The number of tag types.
    - droprate (float, required.) – The dropout ratio.
    - unit (str, optional, (default='lstm').) – The type of the recurrent unit.

    forward(f_c, f_p, b_c, b_p, flm_w, blm_w, blm_ind, f_w)[source]
        Calculate the output (CRF potentials).

        Parameters:
        - f_c (torch.LongTensor, required.) – Character-level inputs in the forward direction.
        - f_p (torch.LongTensor, required.) – Output positions of the character-level inputs in the forward direction.
        - b_c (torch.LongTensor, required.) – Character-level inputs in the backward direction.
        - b_p (torch.LongTensor, required.) – Output positions of the character-level inputs in the backward direction.
        - flm_w (torch.LongTensor, required.) – Word-level inputs for the forward language model.
        - blm_w (torch.LongTensor, required.) – Word-level inputs for the backward language model.
        - blm_ind (torch.LongTensor, required.) – Output positions of the word-level inputs for the backward language model.
        - f_w (torch.LongTensor, required.) – Word-level inputs for the sequence labeling model.

        Returns: output (torch.FloatTensor) – A float tensor of shape (seq_len, batch_size, from_tag_size, to_tag_size).
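A sketch of a training step tying SeqDataset, SeqLabel, and CRFLoss together; that each batch unpacks into the eight forward arguments plus the gold tags and mask is an assumption about what batchify produces, and seq_label, crf_loss, and optimizer are placeholder instances:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

for batch in seq_dataset.get_tqdm(device):  # seq_dataset: a SeqDataset instance
    f_c, f_p, b_c, b_p, flm_w, blm_w, blm_ind, f_w, target, mask = batch  # assumed
    scores = seq_label(f_c, f_p, b_c, b_p, flm_w, blm_w, blm_ind, f_w)
    loss = crf_loss(scores, target, mask)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```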
class model_seq.seqlabel.Vanilla_SeqLabel(f_lm, b_lm, c_num, c_dim, c_hidden, c_layer, w_num, w_dim, w_hidden, w_layer, y_num, droprate, unit='lstm')[source]
    Sequence labeling model without language model augmentation.

    Parameters:
    - f_lm (torch.nn.Module, required.) – the forward language model for contextualized representations.
    - b_lm (torch.nn.Module, required.) – the backward language model for contextualized representations.
    - c_num (int, required.) – the number of characters.
    - c_dim (int, required.) – the dimension of the character embeddings.
    - c_hidden (int, required.) – the dimension of the character-level hidden states.
    - c_layer (int, required.) – the number of character-level lstm layers.
    - w_num (int, required.) – the number of words.
    - w_dim (int, required.) – the dimension of the word embeddings.
    - w_hidden (int, required.) – the dimension of the word-level hidden states.
    - w_layer (int, required.) – the number of word-level lstm layers.
    - y_num (int, required.) – the number of tag types.
    - droprate (float, required.) – the dropout ratio.
    - unit (str, optional, (default='lstm').) – the type of the recurrent unit.

    forward(f_c, f_p, b_c, b_p, flm_w, blm_w, blm_ind, f_w)[source]
        Calculate the output (CRF potentials).

        Parameters:
        - f_c (torch.LongTensor, required.) – character-level inputs in the forward direction.
        - f_p (torch.LongTensor, required.) – output positions of the character-level inputs in the forward direction.
        - b_c (torch.LongTensor, required.) – character-level inputs in the backward direction.
        - b_p (torch.LongTensor, required.) – output positions of the character-level inputs in the backward direction.
        - flm_w (torch.LongTensor, required.) – word-level inputs for the forward language model.
        - blm_w (torch.LongTensor, required.) – word-level inputs for the backward language model.
        - blm_ind (torch.LongTensor, required.) – output positions of the word-level inputs for the backward language model.
        - f_w (torch.LongTensor, required.) – word-level inputs for the sequence labeling model.

        Returns: output (torch.FloatTensor) – a float tensor of shape (seq_len, batch_size, from_tag_size, to_tag_size).
model_seq.seqlm module

class model_seq.seqlm.BasicSeqLM(ori_lm, backward, droprate, fix_rate)[source]
    The language model wrapper for the dense RNNs.

    Parameters:
    - ori_lm (torch.nn.Module, required.) – the original language model module.
    - backward (bool, required.) – whether the language model is a backward one.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

    forward(w_in, ind=None)[source]
        Calculate the output.

        Parameters:
        - w_in (torch.LongTensor, required.) – the input tensor, of shape (seq_len, batch_size).
        - ind (torch.LongTensor, optional, (default=None).) – the index tensor for the backward language model, of shape (seq_len, batch_size).

        Returns: output (torch.FloatTensor) – The ELMo-style outputs.

    init_hidden()[source]
        Initialize hidden states.
model_seq.sparse_lm module

class model_seq.sparse_lm.SBUnit(ori_unit, droprate, fix_rate)[source]
    The basic recurrent unit for the dense-RNNs wrapper.

    Parameters:
    - ori_unit (torch.nn.Module, required.) – the original rnn unit module.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

class model_seq.sparse_lm.SDRNN(ori_drnn, droprate, fix_rate)[source]
    The multi-layer recurrent networks for the dense-RNNs wrapper.

    Parameters:
    - ori_drnn (torch.nn.Module, required.) – the original rnn networks module.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

    forward(x)[source]
        Calculate the output.

        Parameters: x (torch.FloatTensor, required.) – the input tensor, of shape (seq_len, batch_size, input_dim).
        Returns: output (torch.FloatTensor) – The ELMo-style outputs.

class model_seq.sparse_lm.SparseSeqLM(ori_lm, backward, droprate, fix_rate)[source]
    The language model for the dense RNNs with layer-wise selection.

    Parameters:
    - ori_lm (torch.nn.Module, required.) – the original language model module.
    - backward (bool, required.) – whether the language model is a backward one.
    - droprate (float, required.) – the dropout ratio.
    - fix_rate (bool, required.) – whether to fix the mixing ratio.

    forward(w_in, ind=None)[source]
        Calculate the output.

        Parameters:
        - w_in (torch.LongTensor, required.) – the input tensor, of shape (seq_len, batch_size).
        - ind (torch.LongTensor, optional, (default=None).) – the index tensor for the backward language model, of shape (seq_len, batch_size).

        Returns: output (torch.FloatTensor) – The ELMo-style outputs.

    init_hidden()[source]
        Initialize hidden states.
model_seq.utils module

model_seq.utils.adjust_learning_rate(optimizer, lr)[source]
    Adjust the learning rate to the new value.

    Parameters:
    - optimizer (required.) – the pytorch optimizer.
    - lr (float, required.) – the target learning rate.

model_seq.utils.log_sum_exp(vec)[source]
    The log-sum-exp function.

    Parameters: vec (torch.FloatTensor, required.) – the input vector, of shape (ins_num, from_tag_size, to_tag_size).
    Returns: sum (torch.FloatTensor) – the log-sum-exp results, a tensor of shape (ins_num, to_tag_size).
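For reference, a numerically stable version of this computation subtracts the per-slice maximum before exponentiating and adds it back afterwards; this generic sketch (not necessarily the library's exact code) matches torch.logsumexp(vec, dim=1):

```python
import torch

def log_sum_exp_ref(vec):
    """Log-sum-exp over the from_tag dimension, max-shifted for stability."""
    # vec: (ins_num, from_tag_size, to_tag_size)
    max_score, _ = vec.max(dim=1, keepdim=True)  # (ins_num, 1, to_tag_size)
    return (vec - max_score).exp().sum(dim=1).log() + max_score.squeeze(1)
```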
model_seq.utils.repackage_hidden(h)[source]
    Wraps hidden states in new Variables, to detach them from their history.

    Parameters: h (Tuple or Tensors, required.) – tuple or tensors, the hidden states.
    Returns: hidden (Tuple or Tensors) – the detached hidden states.