Classifying MNIST with an RNN

June 22, 2023. Source: 迅速傅里叶变换

RNN and LSTM

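An RNN adds a memory component to a conventional neural network. It treats a sequence as a series of events that advance one time step at a time. A time step here is not real-world time but a position in the sequence. This structure lets an RNN handle time series whose elements depend on each other, as well as variable-length data. Long-term dependencies are unavoidable in text understanding, but a plain RNN handles them poorly: because the same parameters are shared across all time steps, gradients can vanish or explode as the state is passed along. LSTM was introduced to address the long-term dependency problem. Its main change relative to a plain RNN is the addition of three gates: an input gate, an output gate, and a forget gate.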
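
To make the three gates concrete, here is a minimal NumPy sketch of one LSTM time step. It uses the standard textbook formulation, not TensorFlow's internal code; the names lstm_step, W and b, the gate ordering inside W, and the small sizes are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (illustrative helper, not a library function).

    x_t: [batch, n_input]; h_prev, c_prev: [batch, n_cell]
    W: [n_input + n_cell, 4 * n_cell]; b: [4 * n_cell]
    """
    z = np.concatenate([x_t, h_prev], axis=1) @ W + b
    i, f, o, g = np.split(z, 4, axis=1)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
    g = np.tanh(g)                                # candidate cell values
    c_t = f * c_prev + i * g                      # keep part of the old memory, add new
    h_t = o * np.tanh(c_t)                        # expose part of the memory as output
    return h_t, c_t

rng = np.random.default_rng(0)
n_input, n_cell, batch = 28, 8, 4
W = rng.normal(scale=0.1, size=(n_input + n_cell, 4 * n_cell))
b = np.zeros(4 * n_cell)
h = np.zeros((batch, n_cell))
c = np.zeros((batch, n_cell))
for t in range(28):                               # 28 steps, one per image row
    x_t = rng.normal(size=(batch, n_input))       # random stand-in for a row of pixels
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)  # (4, 8) (4, 8)
```

The same weights W and b are reused at every step, which is the parameter sharing mentioned above.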

For the detailed mathematical derivation of RNN and LSTM, see the relevant technical blogs; it is not covered here.

Classification

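Here the MNIST dataset is used for classification with an RNN. The input to an RNN is usually a 3-D tensor of shape [batch_size, step_time, cell.input_size]. Each row of a 28*28 image is the input at one time step, the 28 rows give step_time = 28, and 128 images form one batch.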
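
As a rough illustration of this layout, the sketch below reshapes a batch of flattened 784-pixel vectors into [batch, step, input]. The array flat holds random numbers standing in for real MNIST images, which is an assumption made so the snippet runs without downloading the dataset.

```python
import numpy as np

n_batch, n_step, n_input = 128, 28, 28

# Stand-in for a batch of flattened MNIST images (random values, not real data).
flat = np.random.default_rng(0).random((n_batch, n_step * n_input)).astype(np.float32)

# Each image becomes a sequence of 28 rows, each row a 28-dimensional input.
images = flat.reshape(n_batch, n_step, n_input)

# Row 3 of image 0 is the input vector at time step 3.
assert np.array_equal(images[0, 3], flat[0, 3 * n_input:4 * n_input])
print(images.shape)  # (128, 28, 28)
```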

First, import the MNIST dataset and set up the hyperparameters and placeholders.

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('MNIST_data/', one_hot=True)

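lr = 0.001            # learning rate for Adam
n_input = 28          # pixels per image row (input size at each time step)
n_step = 28           # rows per image (number of time steps)
n_digit = 10          # number of output classes, digits 0-9
n_cell = 128          # number of units in the LSTM cell
n_batch = 128         # images per batch
train_times = 100000  # number of training iterations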

x = tf.placeholder(tf.float32, [None, n_step, n_input])
y = tf.placeholder(tf.float32, [None, n_digit])

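An LSTM network is used here to classify the images. The core of an LSTM is a hidden layer called the cell, which holds the gate parameters and activation functions. An input layer is needed before the cell and an output layer after it. Because the input is a 3-D tensor, it is first reshaped into a 2-D tensor [batch_size*step_time, cell.input_size] for the weight multiplication, and the result is then reshaped back to 3-D.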

x_new = tf.reshape(x, [-1, n_input])
cell_in = tf.layers.dense(x_new, n_cell)
cell_in = tf.reshape(cell_in, [-1, n_step, n_cell])
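
The reshape, dense, reshape sequence applies one shared linear layer to every time step. The NumPy sketch below checks that equivalence; W and b are hypothetical stand-ins for the kernel and bias that tf.layers.dense would create, and the batch size is reduced to 4 to keep it small.

```python
import numpy as np

rng = np.random.default_rng(0)
n_batch, n_step, n_input, n_cell = 4, 28, 28, 128
x = rng.normal(size=(n_batch, n_step, n_input))
W = rng.normal(size=(n_input, n_cell))   # hypothetical dense kernel
b = rng.normal(size=n_cell)              # hypothetical dense bias

# Route used in the article: flatten, apply the layer once, restore the shape.
cell_in = (x.reshape(-1, n_input) @ W + b).reshape(-1, n_step, n_cell)

# Reference: apply the same layer separately at each time step.
per_step = np.stack([x[:, t, :] @ W + b for t in range(n_step)], axis=1)

print(cell_in.shape)                     # (4, 28, 128)
print(np.allclose(cell_in, per_step))    # True
```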

TensorFlow 1.2 now wraps the construction of a fully connected layer, so tf.layers.dense can be called directly. Its signature is:

dense(
    inputs,
    units,
    activation=None,
    use_bias=True,
    kernel_initializer=None,
    bias_initializer=tf.zeros_initializer(),
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    trainable=True,
    name=None,
    reuse=None
)

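inputs is the input tensor, units is the number of neurons, and activation is the activation function, which defaults to None (no activation). Only a linear combination of the inputs is needed here, so no activation function is used.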

Next, build the LSTM cell.

cell = tf.contrib.rnn.BasicLSTMCell(n_cell)
# Args:

# num_units: int, The number of units in the LSTM cell.
# forget_bias: float, The bias added to forget gates (see above).
# state_is_tuple: If True, accepted and returned states are 2-tuples of the c_state and m_state.
#   If False, they are concatenated along the column axis. The latter behavior will soon be deprecated.
# activation: Activation function of the inner states. Default: tanh.
# reuse: (optional) Python boolean describing whether to reuse variables
# in an existing scope. If not True, and the existing scope already has
# the given variables, an error is raised.

init_state = cell.zero_state(n_batch, dtype=tf.float32)
output, state = tf.nn.dynamic_rnn(
    cell, cell_in, initial_state=init_state, time_major=False)

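tf.contrib.rnn.BasicLSTMCell builds the cell. num_units is the number of units in the cell. forget_bias defaults to 1.0; it is added to the forget gate's pre-activation so that the gate starts out mostly open and the cell tends to keep its previous state early in training. state_is_tuple defaults to True, meaning the returned state is a tuple of two tensors: state[0] is the cell state (c) and state[1] is the output (hidden) state (h). The state must be initialized first, as the cell.zero_state line above does; tf.nn.dynamic_rnn is then called to get the result of the RNN layer. output has shape [batch_size, step_time, cell], and each of state[0] and state[1] has shape [batch_size, cell], where cell is the number of units in the RNN.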
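
A small numeric sketch of what forget_bias does, assuming the usual sigmoid forget gate and a learned pre-activation that is still near zero at the start of training (the value 0.0 below is that assumption, not something read from TensorFlow):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Forget-gate activation when the learned pre-activation is still near zero.
gate_without_bias = sigmoid(0.0)        # half of the old cell state is kept
gate_with_bias = sigmoid(0.0 + 1.0)     # forget_bias=1.0 keeps roughly 73%
print(round(gate_without_bias, 2), round(gate_with_bias, 2))  # 0.5 0.73
```

A gate value closer to 1 means more of the previous cell state survives each step.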

After the cell comes an output layer, built the same way as the input layer.

cell_out = tf.layers.dense(state[1], n_digit)

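state[1] is used here. output could be used instead, but it is a 3-D tensor holding the output at every step, so the last step has to be selected: either take output[:, -1, :], or first transpose output to [step_time, batch_size, cell] and then take output[-1].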
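
The relation between output and state[1] can be checked with a hand-rolled LSTM in NumPy. This is a sketch under the textbook LSTM equations; run_lstm is a hypothetical helper that mimics the (output, state) return convention of a batch-major dynamic_rnn, not the TensorFlow implementation itself.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_lstm(x, W, b, n_cell):
    """Unroll an LSTM over x of shape [batch, step, input].

    Returns output [batch, step, n_cell] and the final (c, h) pair.
    """
    batch, n_step, _ = x.shape
    c = np.zeros((batch, n_cell))
    h = np.zeros((batch, n_cell))
    outputs = []
    for t in range(n_step):
        z = np.concatenate([x[:, t, :], h], axis=1) @ W + b
        i, f, o, g = np.split(z, 4, axis=1)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs, axis=1), (c, h)

rng = np.random.default_rng(0)
n_batch, n_step, n_input, n_cell = 4, 28, 28, 16
x = rng.normal(size=(n_batch, n_step, n_input))
W = rng.normal(scale=0.1, size=(n_input + n_cell, 4 * n_cell))
b = np.zeros(4 * n_cell)

output, state = run_lstm(x, W, b, n_cell)
last_batch_major = output[:, -1, :]                     # last step, batch-major
last_time_major = np.transpose(output, (1, 0, 2))[-1]   # last step after transposing

print(np.array_equal(last_batch_major, state[1]))  # True
print(np.array_equal(last_time_major, state[1]))   # True
print(output[-1].shape)  # (28, 16): the last image in the batch, not the last step
```

The final line shows why indexing a batch-major output with a plain [-1] selects the wrong axis.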

With the layers in place, what remains is the loss function, training, and evaluation.

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=cell_out))
train = tf.train.AdamOptimizer(lr).minimize(loss)

acc = tf.reduce_mean(
    tf.cast(tf.equal(tf.argmax(y, 1), tf.argmax(cell_out, 1)), tf.float32))

with tf.Session() as sess:
    tf.global_variables_initializer().run()
    # print(sess.run(output, {x: mnist.train.images[
    #       :128].reshape([128, n_step, n_input])}).shape)
    # output.shape: [n_batch, n_step, n_cell] because time_major=False

    # print(sess.run(state[1], {x: mnist.train.images[
    #        :128].reshape([128, n_step, n_input])}).shape)
    # state[1].shape:n_batch,n_cell

    for i in range(train_times):
        xs, ys = mnist.train.next_batch(n_batch)
        xs = xs.reshape([n_batch, n_step, n_input])
        sess.run(train, {x: xs, y: ys})
        if i % 200 == 0:
            print(sess.run(acc, {x: xs, y: ys}))
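
For readers who want to see what the loss and accuracy expressions compute, here is a NumPy sketch of mean softmax cross-entropy and argmax accuracy. The function names are illustrative and the logits and labels are made-up numbers, not model outputs.

```python
import numpy as np

def softmax_cross_entropy(labels, logits):
    """Mean cross-entropy between one-hot labels and softmax(logits)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())

def accuracy(labels, logits):
    """Fraction of rows where the predicted class matches the label."""
    return float((labels.argmax(axis=1) == logits.argmax(axis=1)).mean())

# Three hypothetical samples with three classes each (made-up numbers).
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.1, 3.0],
                   [1.5, 1.0, 0.3]])
labels = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=float)

print(softmax_cross_entropy(labels, logits))
print(accuracy(labels, logits))   # two of the three predictions match
```

Note that the training loop above prints accuracy on the same batch it just trained on, so the printed value is a training accuracy rather than a held-out one.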

Note

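In image classification the images are unrelated to one another, so the state is initialized to zero for every batch and never needs to be carried over. In some time-series problems, however, consecutive batches are connected; in that case the state is initialized once for the first batch, and each later batch starts from the final state returned by the previous batch.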
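
The effect of carrying the state between batches can be sketched with a plain tanh RNN in NumPy. This is a simplified stand-in (no gates, random weights, hypothetical helper run_rnn), chosen only to show that feeding the previous final state into the next chunk reproduces the result of processing the whole series at once.

```python
import numpy as np

def run_rnn(x, h0, W_x, W_h):
    """Plain tanh RNN over x of shape [batch, step, input], starting from h0."""
    h = h0
    for t in range(x.shape[1]):
        h = np.tanh(x[:, t, :] @ W_x + h @ W_h)
    return h

rng = np.random.default_rng(0)
batch, n_step, n_input, n_cell = 2, 6, 3, 5
W_x = rng.normal(scale=0.5, size=(n_input, n_cell))
W_h = rng.normal(scale=0.5, size=(n_cell, n_cell))
series = rng.normal(size=(batch, n_step, n_input))
zero = np.zeros((batch, n_cell))

# Whole series in one pass.
full = run_rnn(series, zero, W_x, W_h)

# Two consecutive chunks, feeding the first chunk's final state into the second.
h_mid = run_rnn(series[:, :3], zero, W_x, W_h)
carried = run_rnn(series[:, 3:], h_mid, W_x, W_h)

# Same second chunk, but restarting from zeros: the earlier history is lost.
reset = run_rnn(series[:, 3:], zero, W_x, W_h)

print(np.allclose(full, carried))  # True
print(np.abs(full - reset).max())  # gap caused by discarding the first chunk
```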
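The input to the RNN is [batch_size, step_time, cell.input_size], and each batch yields a final result of shape [batch_size, output]. If the output returned by dynamic_rnn is used for the next computation, it has shape [batch_size, step_time, output]; selecting its last time step, output[:, -1, :] (or output[-1] after transposing to time-major), gives the same result as state[1].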

    Original author: 迅速傅里叶变换
    Original URL: https://www.jianshu.com/p/f1fb8d6dd522