How to use pack_padded_sequence in PyTorch with multiple variable-length inputs that share the same label

I have a model that takes three variable-length inputs, all sharing the same label. Is there a way for me to use pack_padded_sequence somehow? If so, how should I sort my sequences?

例如,

a = (([0,1,2], [3,4], [5,6,7,8]), 1) # sequence lengths are 3, 2, 4; the shared label is 1
b = (([0,1], [2], [6,7,8,9,10]), 1)

Both a and b will be fed into three separate LSTMs, and the results will be merged to predict the label.

Best answer: Let's go through this step by step.

Input Data Processing

import numpy as np
import torch
from torch.autograd import Variable

a = (([0,1,2], [3,4], [5,6,7,8]), 1)

# store the length of each sequence in an array
len_a = np.array([len(seq) for seq in a[0]])
# build a zero-padded matrix of shape (num_sequences, max_length)
variable_a = np.zeros((len(len_a), np.amax(len_a)))
for i, seq in enumerate(a[0]):
    variable_a[i, 0:len(seq)] = seq

vocab_size = len(np.unique(variable_a))
variable_a = Variable(torch.from_numpy(variable_a).long())
print(variable_a)

It prints:

Variable containing:
 0  1  2  0
 3  4  0  0
 5  6  7  8
[torch.LongTensor of size 3x4]
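To see what pack_padded_sequence actually does with this padded batch, here is a quick check (a sketch; it assumes a newer PyTorch, >= 1.1, where enforce_sorted=False lets you pass an unsorted batch directly):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# the padded batch and lengths from the example above
padded = torch.tensor([[0, 1, 2, 0],
                       [3, 4, 0, 0],
                       [5, 6, 7, 8]])
lengths = torch.tensor([3, 2, 4])

packed = pack_padded_sequence(padded, lengths,
                              batch_first=True, enforce_sorted=False)
# batch_sizes tells the RNN how many sequences are still "alive" at
# each timestep after sorting by length (4, 3, 2):
print(packed.batch_sizes)  # tensor([3, 3, 2, 1])
```

This is why the sequences must be ordered by descending length inside the pack: the RNN simply shrinks the batch as shorter sequences run out.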

Defining embedding and RNN layer

Now, suppose we have an Embedding and an RNN layer class as follows.

import torch.nn as nn

class EmbeddingLayer(nn.Module):

    def __init__(self, input_size, emsize):
        super(EmbeddingLayer, self).__init__()
        self.embedding = nn.Embedding(input_size, emsize)

    def forward(self, input_variable):
        return self.embedding(input_variable)


class Encoder(nn.Module):

    def __init__(self, input_size, hidden_size, bidirection):
        super(Encoder, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.bidirection = bidirection
        self.rnn = nn.LSTM(self.input_size, self.hidden_size, batch_first=True, 
                                    bidirectional=self.bidirection)

    def forward(self, sent_variable, sent_len):
        # Sort by length (keep idx)
        # .copy() avoids passing a negative-stride view downstream
        sent_len, idx_sort = np.sort(sent_len)[::-1].copy(), np.argsort(-sent_len)
        idx_unsort = np.argsort(idx_sort)

        idx_sort = torch.from_numpy(idx_sort)
        sent_variable = sent_variable.index_select(0, Variable(idx_sort))

        # Handling padding in Recurrent Networks
        sent_packed = nn.utils.rnn.pack_padded_sequence(sent_variable, sent_len, batch_first=True)
        sent_output = self.rnn(sent_packed)[0]
        sent_output = nn.utils.rnn.pad_packed_sequence(sent_output, batch_first=True)[0]

        # Un-sort by length
        idx_unsort = torch.from_numpy(idx_unsort)
        sent_output = sent_output.index_select(0, Variable(idx_unsort))

        return sent_output
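On newer PyTorch versions (>= 1.1) the manual sort/unsort bookkeeping in the forward pass above can be delegated to pack_padded_sequence itself via enforce_sorted=False. A simplified equivalent sketch (SimpleEncoder is a hypothetical name, not part of the original answer):

```python
import torch
import torch.nn as nn

class SimpleEncoder(nn.Module):
    def __init__(self, input_size, hidden_size, bidirectional=False):
        super().__init__()
        self.rnn = nn.LSTM(input_size, hidden_size, batch_first=True,
                           bidirectional=bidirectional)

    def forward(self, sent_variable, sent_len):
        # enforce_sorted=False makes pack_padded_sequence sort the batch
        # internally and restores the original order on unpacking
        packed = nn.utils.rnn.pack_padded_sequence(
            sent_variable, sent_len, batch_first=True, enforce_sorted=False)
        output, _ = self.rnn(packed)
        output, _ = nn.utils.rnn.pad_packed_sequence(output, batch_first=True)
        return output
```

The trade-off: enforce_sorted=True (the default) is required if you want to export the model via ONNX, which is why some codebases still sort manually.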

Embed and encode the processed input data

We can embed and encode the input as follows.

emb = EmbeddingLayer(vocab_size, 50)
enc = Encoder(50, 100, False)

emb_a = emb(variable_a)
enc_a = enc(emb_a, len_a)

If you print the size of enc_a, you will get torch.Size([3, 4, 100]): 3 sequences, padded to a maximum length of 4, with a 100-dimensional hidden state per timestep. I hope you understand the meaning of this shape.
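The question also asks how to merge the three encodings into one prediction. One common approach (my own suggestion, not part of the original answer) is to pool each encoder output over time and concatenate the pooled vectors before a linear classifier:

```python
import torch
import torch.nn as nn

class TripleClassifier(nn.Module):
    """Hypothetical head combining three encoder outputs into one label."""

    def __init__(self, hidden_size, num_classes):
        super().__init__()
        self.fc = nn.Linear(3 * hidden_size, num_classes)

    def forward(self, enc_a, enc_b, enc_c):
        # max-pool each (batch, time, hidden) output over the time axis;
        # note this naive version pools over padded steps too — masking
        # the pad positions first would be a refinement
        pooled = [e.max(dim=1)[0] for e in (enc_a, enc_b, enc_c)]
        return self.fc(torch.cat(pooled, dim=1))
```

Each of the three inputs goes through its own EmbeddingLayer/Encoder pair, and the three (batch, time, hidden) outputs are fed to this head to produce the class scores for the shared label.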

Note that the code above runs on CPU only.
