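I am trying to create an LSTM network in TensorFlow, and I am lost in the terminology and basics. I have n time series examples, so X = [x1, …, xn], where
xi = [[x11, x12, x13], …, [xm1, xm2, xm3]] and each entry xjk is a float. First, I want to train a model that, given the start of a sequence ([x11, x12, x13]), can predict the rest of the sequence. Later I hope to include a classifier that predicts which binary class each xi belongs to.
So my problem is: what do I feed into the start of the model, and what do I pull out of the end? So far I have something that looks like the following: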
import tensorflow as tf


class ETLSTM(object):
    """docstring for ETLSTM"""
    def __init__(self, isTraining, config):
        super(ETLSTM, self).__init__()
        # This needs to be tidied
        self.batchSize = batchSize = config.batchSize
        self.numSteps = numSteps = config.numSteps
        self.numInputs = numInputs = config.numInputs
        self.numLayers = numLayers = config.numLayers
        lstmSize = config.lstm_size
        DORate = config.keep_prob
        self.input_data = tf.placeholder(tf.float32, [batchSize, numSteps,
                                                      numInputs])
        self.targets = tf.placeholder(tf.float32, [batchSize, numSteps,
                                                   numInputs])
        lstmCell = tf.nn.rnn_cell.BasicLSTMCell(lstmSize, forget_bias=0.0)
        if(isTraining and DORate < 1):
            lstmCell = tf.nn.rnn_cell.DropoutWrapper(lstmCell,
                                                     output_keep_prob=DORate)
        cell = tf.nn.rnn_cell.MultiRNNCell([lstmCell]*numLayers)
        self._initial_state = cell.zero_state(batchSize, tf.float32)

        # This won't work with my data, need to find what goes in...
        with tf.device("/cpu:0"):
            embedding = tf.get_variable("embedding", [vocab_size, size])
            inputs = tf.nn.embedding_lookup(embedding, self._input_data)
        if(isTraining and DORate < 1):
            inputs = tf.nn.dropout(inputs, DORate)
Edit:
Specifically, how do I finish the __init__ function so that it is compatible with my data?
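Best answer: An RNN predicts the value at step N+1 given the values seen so far, from 1 to N. (An LSTM is just one way to implement an RNN cell.)
The short answer is:
- Train the model using backpropagation on your complete sequences [[x11, x12, x13], …, [xm1, xm2, xm3]].
- Run the trained model forward on the start of a sequence [x11, x12, x13, …], then sample from the model to predict the rest of the sequence [xm1, xm2, xm3, …].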
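To make the two steps concrete for a single example, here is a minimal NumPy sketch of what goes in and what comes out. The sizes (`m = 5`, three features) and the prefix length `start_len` are made-up values for illustration, not anything taken from the question's data.

```python
import numpy as np

# One hypothetical example x_i: m = 5 time steps, 3 float features per step.
m, num_inputs = 5, 3
sequence = np.arange(m * num_inputs, dtype=np.float32).reshape(m, num_inputs)

# Training: the model is fed steps 0..m-2 and asked to output steps 1..m-1,
# i.e. the targets are the inputs shifted forward by one time step.
train_inputs = sequence[:-1]
train_targets = sequence[1:]

# Prediction: only the start of the sequence is fed in; the trained model
# is then run step by step to generate the remaining time steps.
start_len = 2                    # assumed prefix length
start_x = sequence[:start_len]   # what you feed in
rest_x = sequence[start_len:]    # what the model should reproduce
```

The same shift-by-one relationship is what the data iterator further down produces in batches.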
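The longer answer is:
Your example only shows the initialization of the model. You also need to implement a training function that runs backpropagation, as well as a sample function that predicts the results.
The following code snippets are mix & match and are for illustration only...
For training, just feed the entire sequences (start + rest) through your data iterator.
For example, in the sample code tensorflow/models/rnn/ptb_word_lm.py, the training loop computes a cost function for batches of input_data against targets (which are the input_data shifted forward by one time step).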
# compute a learning rate decay
session.run(tf.assign(self.learning_rate_variable, learning_rate))
logger.info("Epoch: %d Learning rate: %.3f" % (i + 1, session.run(self.learning_rate_variable)))

"""Runs the model on the given data."""
epoch_size = ((len(training_data) // self.batch_size) - 1) // self.num_steps
costs = 0.0
iters = 0
state = self.initial_state.eval()
for step, (x, y) in enumerate(self.data_iterator(training_data, self.batch_size, self.num_steps)):
    # x and y should have shape [batch_size, num_steps]
    cost, state, _ = session.run([self.cost_function, self.final_state, self.train_op],
                                 {self.input_data: x,
                                  self.targets: y,
                                  self.initial_state: state})
    costs += cost
    iters += self.num_steps
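The PTB sample scores integer word ids with a softmax cross-entropy loss. For float targets like the ones in the question, one common alternative is mean squared error. That choice is an assumption on my part and is not something the sample code does. A NumPy sketch of the quantity such a cost function would compute, with the hypothetical helper name `mse_cost`:

```python
import numpy as np

def mse_cost(predictions, targets):
    """Mean squared error over every batch entry, time step and feature."""
    predictions = np.asarray(predictions, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    if predictions.shape != targets.shape:
        raise ValueError("predictions and targets must have the same shape")
    return float(np.mean((predictions - targets) ** 2))

# Toy batch with batch_size=1, num_steps=2, num_inputs=3.
targets = np.array([[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]])
predictions = targets + 0.5  # every prediction is off by exactly 0.5
cost = mse_cost(predictions, targets)  # 0.5 ** 2 = 0.25
```

In the graph itself the same expression would be built from the model outputs and the targets placeholder, so that the training op can minimize it.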
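Note that the data iterator in tensorflow/models/rnn/reader.py returns the input data as "x" and the targets as "y", which are just x shifted forward by one step. (You will need to create a data iterator like this one that packages up your set of training sequences.)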
import numpy as np


def ptb_iterator(raw_data, batch_size, num_steps):
    raw_data = np.array(raw_data, dtype=np.int32)
    data_len = len(raw_data)
    batch_len = data_len // batch_size
    data = np.zeros([batch_size, batch_len], dtype=np.int32)
    for i in range(batch_size):
        data[i] = raw_data[batch_len * i:batch_len * (i + 1)]
    epoch_size = (batch_len - 1) // num_steps
    if epoch_size == 0:
        raise ValueError("epoch_size == 0, decrease batch_size or num_steps")
    for i in range(epoch_size):
        # y is x shifted forward by one time step
        x = data[:, i*num_steps:(i+1)*num_steps]
        y = data[:, i*num_steps+1:(i+1)*num_steps+1]
        yield (x, y)
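ptb_iterator works on one long stream of integer ids. For a set of n float sequences of shape [m, num_inputs], an analogous iterator might look like the sketch below. The name `float_sequence_iterator` and the batching scheme (consecutive windows over the same group of sequences) are my own assumptions, not part of reader.py.

```python
import numpy as np

def float_sequence_iterator(sequences, batch_size, num_steps):
    """Yield (x, y), each of shape [batch_size, num_steps, num_inputs].

    `sequences` has shape [n, m, num_inputs]: n examples of m time steps.
    y is x shifted forward by one time step.
    """
    data = np.asarray(sequences, dtype=np.float32)
    n, m, _ = data.shape
    windows_per_sequence = (m - 1) // num_steps
    if n < batch_size or windows_per_sequence == 0:
        raise ValueError("need n >= batch_size and m > num_steps")
    # Leftover sequences that do not fill a whole batch are dropped.
    for b in range(0, n - batch_size + 1, batch_size):
        batch = data[b:b + batch_size]
        for w in range(windows_per_sequence):
            lo = w * num_steps
            x = batch[:, lo:lo + num_steps]
            y = batch[:, lo + 1:lo + num_steps + 1]
            yield x, y

# Toy data: n=4 sequences, m=7 time steps, 3 features.
toy = np.arange(4 * 7 * 3, dtype=np.float32).reshape(4, 7, 3)
batches = list(float_sequence_iterator(toy, batch_size=2, num_steps=3))
```

The shapes of x and y match the input_data and targets placeholders in the question. Because consecutive windows come from the same sequences, the RNN state can be carried across them, but it should probably be reset whenever a new group of sequences starts.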
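After training, you run the model forward to make predictions for a sequence by feeding in the start of the sequence, start_x = [X1, X2, X3, …]. This snippet assumes binary values representing classes; you will have to adjust the sampling function for float values.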
def sample(self, sess, start_x, num=25):
    # return state tensor with batch size 1 set to zeros, eval
    state = self.rnn_layers.zero_state(1, tf.float32).eval()
    # run model forward through the start of the sequence
    # (all but the last value, which is fed in by the prediction loop below)
    for char in start_x[:-1]:
        # create a [1, 1] array set to zero
        x = np.zeros((1, 1))
        # set to the vocab index
        x[0, 0] = char
        # fetch: final_state
        # input_data = x, initial_state = state
        [state] = sess.run([self.final_state], {self.input_data: x, self.initial_state: state})

    def weighted_pick(weights):
        # an array of the cumulative sum of weights
        t = np.cumsum(weights)
        # scalar sum of the weights
        s = np.sum(weights)
        # randomly selects an index from the probability distribution
        return int(np.searchsorted(t, np.random.rand() * s))

    # PREDICT REST OF SEQUENCE
    rest_x = []
    # get last character in init
    char = start_x[-1]
    # sample next num chars in the sequence after init
    score = 0.0
    for n in range(num):
        # init input to zeros
        x = np.zeros((1, 1))
        # lookup character index
        x[0, 0] = char
        # probs = tf.nn.softmax(self.logits)
        # fetch: probs, final_state
        # input_data = x, initial_state = state
        [probs, state] = sess.run([self.output_layer, self.final_state], {self.input_data: x, self.initial_state: state})
        p = probs[0]
        logger.info("output=%s" % np.shape(p))
        # sample = int(np.random.choice(len(p), p=p))
        # select a random value from the probability distribution
        sample = weighted_pick(p)
        score += p[sample]
        # look up the key with the index
        logger.debug("sample[%d]=%d" % (n, sample))
        pred = self.vocabulary[sample]
        logger.debug("pred=%s" % pred)
        # add the char to the output
        rest_x.append(pred)
        # set the next input character
        char = pred
    return rest_x, score
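For float values there is no probability distribution to sample from, so one plausible adjustment is to feed each predicted vector straight back in as the next input. The sketch below shows that loop in isolation. `step_fn` is a hypothetical stand-in for a sess.run call that fetches the model output and final_state, and `toy_step` is a made-up "model" used only to make the example runnable.

```python
import numpy as np

def rollout(step_fn, initial_state, start_x, num=25):
    """Warm up on start_x, then feed each prediction back in as the next input.

    step_fn(x, state) -> (prediction, new_state) is a hypothetical callable
    wrapping one forward step of the trained model with batch size 1.
    """
    state = initial_state
    prediction = None
    # Run the model forward through the known start of the sequence.
    for x in start_x:
        prediction, state = step_fn(np.asarray(x, dtype=np.float32), state)
    if prediction is None:
        raise ValueError("start_x must contain at least one time step")
    rest_x = []
    for _ in range(num):
        # Keep the current prediction, then use it as the next input.
        rest_x.append(prediction)
        prediction, state = step_fn(prediction, state)
    return np.stack(rest_x)

def toy_step(x, state):
    # Stand-in model: predicts the current input plus 1; state counts steps.
    return x + 1.0, state + 1

start = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
predicted = rollout(toy_step, 0, start, num=3)
```

With a real model, `step_fn` would reshape x to [1, 1, num_inputs] before running the session, and small prediction errors will compound over long rollouts because the model consumes its own outputs.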