Machine Learning Review Notes: TensorFlow

July 14, 2019 · Source: 唐十六

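TensorFlow is a general-purpose computation framework developed by Google and built around dataflow graphs. It can serve as a platform for developing deep learning models and greatly simplifies the process of building neural networks. The code in these notes uses the TensorFlow 1.x graph-and-session API (tf.Session, tf.placeholder).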

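Terminology:

(1) TensorFlow stores data in objects called tensors rather than directly as integers, floats, or strings.

(2) TensorFlow represents a computation as a graph: its API is built on the concept of a computational graph, which is a way of visualizing a sequence of mathematical operations.

(3) TensorFlow executes a graph inside a context manager called a session (Session). The session is responsible for allocating GPU(s) and/or CPU(s) to the computation, including those on remote machines.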
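The split between building a graph and running it in a session can be illustrated without TensorFlow. The following is a minimal sketch in plain Python, not TensorFlow's implementation; the names Node, constant, placeholder, add, multiply and run are hypothetical and exist only for this illustration.

```python
class Node:
    """A node in a toy computational graph; nothing is computed at construction time."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def placeholder():
    return Node("placeholder")

def add(a, b):
    return Node("add", (a, b))

def multiply(a, b):
    return Node("mul", (a, b))

def run(node, feed_dict=None):
    """Evaluate `node` by recursively evaluating its inputs (the 'session' step)."""
    feed_dict = feed_dict or {}
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed_dict[node]
    left, right = (run(n, feed_dict) for n in node.inputs)
    return left + right if node.op == "add" else left * right

# Declaration phase: describe (x + 2) * 5 without computing anything.
x = placeholder()
y = multiply(add(x, constant(2)), constant(5))

# Execution phase: values are produced only when the graph is run.
result = run(y, feed_dict={x: 3})
print(result)  # 25
```

The same graph object can be run again with different fed values, which is the main reason for separating declaration from execution.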

I. Basic operations

1. Creating objects and feeding in data

(1) Create a constant tensor: tf.constant()

a = tf.constant('Hello World!')

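Computation (running operations) happens in a session. Create a session instance, sess, with tf.Session, then call sess.run() to evaluate the tensor a. The returned result is stored in output and printed: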

with tf.Session() as sess:
    output = sess.run(a)
    print(output)

Input data can have different numbers of dimensions (one level of square brackets per dimension):

# A is a 0-dimensional int32 tensor
A = tf.constant(1234)
# B is a 1-dimensional int32 tensor
B = tf.constant([123,456,789])
# C is a 2-dimensional int32 tensor
C = tf.constant([ [123,456,789], [222,333,444] ])

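- Any operation can be split into two parts: first declare the operation, then execute it in a session. You can also skip the separate declaration and build the operation directly inside the sess.run() call. For example, the code above could be written as: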

sess.run(tf.constant('Hello World'))

(2) Feed data into a predefined tensor: tf.placeholder()

Specify the data type:

x = tf.placeholder(tf.string)
with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Hello World'})

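- This can be thought of as feeding data into the tensor by way of a dict.

- The data passed in feed_dict must match the type declared for the tensor.

Specify the shape:

labels = tf.placeholder(tf.float32, [None, n])

- The shape argument can be a 1-D integer Tensor or a Python array: [], (), shape=() and shape=[] are all accepted forms.

- The shape fixes both the number of dimensions and the size of each dimension. Here the placeholder is two-dimensional, so the data takes the form of an m×n matrix.

- A dimension can be given as None, which acts as a placeholder for a dynamic size: at run time TensorFlow accepts any size for that dimension (for example, any batch size).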
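To make the role of None concrete, here is a small plain-Python sketch of the compatibility rule described above. The function shape_matches is hypothetical (it is not a TensorFlow API); it only mirrors the idea that None matches any size while the other dimensions must agree exactly.

```python
def shape_matches(declared, actual):
    """Return True if `actual` fits `declared`, where None matches any size."""
    if len(declared) != len(actual):
        return False
    return all(d is None or d == a for d, a in zip(declared, actual))

declared = [None, 10]  # e.g. [None, n] with n = 10
print(shape_matches(declared, (128, 10)))  # True
print(shape_matches(declared, (1, 10)))    # True
print(shape_matches(declared, (128, 5)))   # False
print(shape_matches(declared, (10,)))      # False
```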

Specify the type, shape, and name:

tf.placeholder(tf.float32, shape=(None,image_shape[0],image_shape[1],image_shape[2]), name='x')

- The input here is a four-dimensional dataset: the batch size plus the height, width, and depth of the image data.

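(3) Create a variable tensor: tf.Variable

Create a variable whose initial value is a specific number:

x = tf.Variable(5)  # 0-dimensional

Create a variable with n elements, all initialized to 0:

bias = tf.Variable(tf.zeros(n))  # 1-dimensional, n elements

Create a variable initialized with random values drawn from a truncated normal distribution: tf.truncated_normal()

weights = tf.Variable(tf.truncated_normal((m, n)))  # 2-dimensional

You can additionally specify the mean and standard deviation of the distribution:

weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output], mean=0, stddev=0.01))

Whatever the initial values are, the state of the variables must be initialized explicitly before they can be used in a session. An initializer operation sets every variable to its initial value: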

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

2. Math operations

x = tf.add(5, 2)  # 7
x = tf.subtract(10, 4) # 6
y = tf.multiply(2, 5)  # 10

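- There are two division functions. tf.div accepts plain numbers directly, while tf.divide should be given tensor objects or operations. With tf.div, integer operands written without a decimal point produce integer division, whereas tf.divide performs true division and returns a floating-point value. The TensorFlow documentation recommends tf.divide, which behaves like from __future__ import division in Python.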
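The difference between the two division styles can be shown in plain Python. This is only an analogy for the semantics described above, not TensorFlow code; py2_style_div and true_div are hypothetical helper names.

```python
def py2_style_div(x, y):
    """Integer (floor) division when both operands are ints, true division otherwise."""
    if isinstance(x, int) and isinstance(y, int):
        return x // y
    return x / y

def true_div(x, y):
    """Always true division, as with `from __future__ import division`."""
    return x / y

print(py2_style_div(7, 2))   # 3
print(py2_style_div(7., 2))  # 3.5
print(true_div(7, 2))        # 3.5
```

Writing one operand with a decimal point (7. instead of 7) is what switches the first style from integer division to true division.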

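tf.log(): returns the natural logarithm of its input.

tf.matmul(a,b): matrix multiplication (the inner dimensions must match)

tf.reduce_sum([1, 2, 3, 4, 5]): returns the sum of the elements of the input sequence

tf.reduce_mean([1, 2, 3, 4, 5]): returns the mean of the elements of the input sequence

(See the documentation for more math functions.)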

3. Type conversion

Convert a floating-point number to an integer:

tf.cast(tf.constant(2.0), tf.int32)

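II. Saving and loading variables and models

Training a model can take a long time, but once the TensorFlow session is closed, all the trained weights and biases are lost. To reuse the model later you would have to train it again. Fortunately, TensorFlow lets you save your progress with a class called tf.train.Saver, which can write any tf.Variable to the file system.

1. Saving variables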

import tensorflow as tf
# Path of the checkpoint file
save_file = './model.ckpt'
# Two tensor variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))
# Class used to save and restore tensor variables
saver = tf.train.Saver()
with tf.Session() as sess:
    # Initialize all variables
    sess.run(tf.global_variables_initializer())
    # Show the weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))
    # Save the model
    saver.save(sess, save_file)

2. Loading variables

# Remove the previous weights and bias
tf.reset_default_graph()
# Two variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))
# Class used to save and restore tensor variables
saver = tf.train.Saver()
with tf.Session() as sess:
    # Load the weights and bias
    saver.restore(sess, save_file)
    # Show the weights and bias
    print('Weight:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))

- The weights and bias tensors still have to be created in Python. tf.train.Saver.restore() then loads the previously saved data into weights and bias.

- Because tf.train.Saver.restore() sets the values of the TensorFlow variables, there is no need to call tf.global_variables_initializer() here.
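The save-then-restore pattern above can be mimicked without TensorFlow. The sketch below is an analogy only: it uses JSON files instead of TensorFlow checkpoints, and save_variables and restore_variables are hypothetical helpers. It shows why the variables must exist before restoring, and why no separate initialization is needed afterwards.

```python
import json
import os
import tempfile

def save_variables(variables, path):
    """Write a dict of named parameter values to disk."""
    with open(path, "w") as f:
        json.dump(variables, f)

def restore_variables(variables, path):
    """Overwrite the values in `variables` in place, matching entries by name."""
    with open(path) as f:
        saved = json.load(f)
    for name in variables:
        variables[name] = saved[name]

trained = {"weights": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], "bias": [0.0, 0.1, 0.2]}
path = os.path.join(tempfile.mkdtemp(), "model.json")
save_variables(trained, path)

# "New program": the variables are created again with the same names and shapes,
# and the restore step overwrites whatever values they started with.
fresh = {"weights": [[0.0] * 3 for _ in range(2)], "bias": [0.0] * 3}
restore_variables(fresh, path)
print(fresh == trained)  # True
```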

3. Saving a model

(1) Start with a model:

# Remove the previous tensors and operations
tf.reset_default_graph()

from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

learning_rate = 0.001
n_input = 784 # MNIST data input (image size: 28*28)
n_classes = 10 # total number of MNIST classes (digits 0-9)
# Load the MNIST data
mnist = input_data.read_data_sets('.', one_hot=True)

# Features and labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights and bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define the loss function and the optimizer
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Compute the accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
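The accuracy computation at the end of the model (argmax of the logits compared with argmax of the one-hot labels, then averaged) can be traced by hand in plain Python. This is a sketch with made-up numbers; argmax and accuracy here are hypothetical stand-ins for the TensorFlow operations.

```python
def argmax(row):
    """Index of the largest entry in a list."""
    return max(range(len(row)), key=lambda i: row[i])

def accuracy(logits, one_hot_labels):
    """Fraction of rows whose predicted class equals the labelled class."""
    correct = [argmax(l) == argmax(t) for l, t in zip(logits, one_hot_labels)]
    return sum(correct) / len(correct)

# Illustrative values only: four samples, three classes.
logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0], [1.5, 1.6, 0.0], [0.0, -0.3, -0.2]]
labels = [[1, 0, 0], [0, 0, 1], [1, 0, 0], [1, 0, 0]]
print(accuracy(logits, labels))  # 0.75
```

Only the third sample is misclassified (class 1 predicted, class 0 labelled), so three of four predictions are correct.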

(2) Train the model and save the weights:

import math

save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training loop
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)

        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(optimizer,feed_dict={features: batch_features, labels: batch_labels})

        # Print the status every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(accuracy,feed_dict={features: mnist.validation.images,labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(epoch,valid_accuracy))

    # Save the model
    saver.save(sess, save_file)
    print('Trained Model Saved.')

Printed output:

Epoch 0   - Validation Accuracy: 0.06859999895095825
Epoch 10  - Validation Accuracy: 0.20239999890327454
Epoch 20  - Validation Accuracy: 0.36980000138282776
Epoch 30  - Validation Accuracy: 0.48820000886917114
Epoch 40  - Validation Accuracy: 0.5601999759674072
Epoch 50  - Validation Accuracy: 0.6097999811172485
Epoch 60  - Validation Accuracy: 0.6425999999046326
Epoch 70  - Validation Accuracy: 0.6733999848365784
Epoch 80  - Validation Accuracy: 0.6916000247001648
Epoch 90  - Validation Accuracy: 0.7113999724388123
Trained Model Saved.
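The training loop in this section counts batches with math.ceil, while the loop in section III uses int(). A quick sketch of the arithmetic shows the difference. The training-set size of 55000 is an assumption (the size of the MNIST training split when 5000 examples are held out for validation); substitute the real value of mnist.train.num_examples.

```python
import math

num_examples = 55000  # assumed training-set size, see the note above
batch_size = 128

batches_ceil = math.ceil(num_examples / batch_size)   # rounds up
batches_floor = num_examples // batch_size            # rounds down, like int()
leftover = num_examples - batches_floor * batch_size  # examples beyond the last full batch

print(batches_ceil, batches_floor, leftover)  # 430 429 88
```

With rounding up, each epoch takes one extra step so that the examples beyond the last full batch are also covered.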

4. Loading a model

Load the trained model:

saver = tf.train.Saver()

# Load the graph
with tf.Session() as sess:
    saver.restore(sess, save_file)

    test_accuracy = sess.run(accuracy,feed_dict={features: mnist.test.images, labels: mnist.test.labels})

print('Test Accuracy: {}'.format(test_accuracy))

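5. "Fine-tuning" a model that has already been trained and saved

III. Building neural networks with TensorFlow

1. Classic logistic regression and neural networks

(1) Use constants, variables, and placeholders to represent each quantity:

Parameters w and b: tf.Variable()

Hyperparameters such as the learning rate: tf.constant(), or tf.placeholder() if the value needs to be tuned

The dataset's inputs (features) and labels (ground truth): tf.placeholder(). The outputs (predictions) are then computed from these by the graph.

(2) Express the linear model with math operations: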

linear=tf.add(tf.matmul(x,w),b)

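- This computes xW + b. Note that the convention differs from the notation in Andrew Ng's deep learning course: here each sample is a row rather than a column.

(3) Build the ReLU activation for a hidden layer:

hidden_layer = tf.nn.relu(input)

(4) Build the output-layer activation for a multi-class problem: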

softmax = tf.nn.softmax(input)
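For reference, the two activation functions can be written out in a few lines of plain Python. This is a sketch of the standard mathematical definitions, not of TensorFlow's implementation; subtracting the maximum before exponentiating is a common numerical-stability choice and does not change the result.

```python
import math

def relu(values):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Exponentiate and normalize so the outputs are positive and sum to 1."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(relu([-2.0, 0.0, 3.5]))  # [0.0, 0.0, 3.5]
probs = softmax([2.0, 1.0, 0.1])
print(probs)       # largest logit gets the largest probability
print(sum(probs))  # sums to 1 up to floating-point rounding
```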

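(5) Build the objective function and the optimizer

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
# logits here are the raw, unnormalized scores fed into softmax (not the softmax output);
# softmax_cross_entropy_with_logits applies the softmax internally.
optimizer=tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)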
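What the cost expression computes can be spelled out in plain Python: a per-sample cross-entropy between softmax(logits) and the one-hot label, averaged over the batch. The sketch below follows the standard formula and uses the log-sum-exp form for stability; the helper names softmax_cross_entropy and mean_cost are hypothetical, and this is not TensorFlow's actual code.

```python
import math

def softmax_cross_entropy(logits, one_hot_label):
    """Cross-entropy between softmax(logits) and a one-hot label, via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_probs = [z - log_sum_exp for z in logits]
    return -sum(t * lp for t, lp in zip(one_hot_label, log_probs))

def mean_cost(batch_logits, batch_labels):
    """Average the per-sample losses, like tf.reduce_mean over the batch."""
    losses = [softmax_cross_entropy(z, t) for z, t in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

uniform = softmax_cross_entropy([0.0, 0.0, 0.0], [1, 0, 0])
confident = softmax_cross_entropy([10.0, 0.0, 0.0], [1, 0, 0])
print(uniform)    # log(3), about 1.0986
print(confident)  # close to 0
print(mean_cost([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]], [[1, 0, 0], [1, 0, 0]]))
```

A model that spreads probability evenly over three classes pays log(3) per sample, while a confident and correct prediction pays almost nothing.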

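(6) Run the operations in a session

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    # Training loop
    for epoch in range(epochs):
        # Number of batches per epoch
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches (next_batch() hands out the data batch by batch)
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run the optimizer: one backpropagation pass and weight update.
            # To get the loss value as well, fetch cost too: sess.run([optimizer, cost], ...)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})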

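- The MNIST helper in TensorFlow can serve data in batches: calling mnist.train.next_batch() returns a subset of the training data.

(7) Implementing mini-batching

Mini-batching also works well in combination with stochastic gradient descent (SGD). The idea is to shuffle the data randomly at the start of each epoch, split it into mini-batches, and then update the network weights with gradient descent on each mini-batch. Because the batches are random, this amounts to performing stochastic gradient descent on each batch.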

import math
def batches(batch_size, features, labels):
    """
    Create batches of features and labels
    :param batch_size: The batch size
    :param features: List of features
    :param labels: List of labels
    :return: Batches of (Features, Labels)
    """
    assert len(features) == len(labels)
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)

    return output_batches
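The batches function above only slices the data; the per-epoch shuffle described in the text has to happen before it is called. A minimal sketch of that step follows. The shuffled_batches helper and the toy data are hypothetical, and batches is restated in compact form so the example is self-contained.

```python
import random

def batches(batch_size, features, labels):
    """Compact restatement of the slicing function above."""
    assert len(features) == len(labels)
    return [[features[i:i + batch_size], labels[i:i + batch_size]]
            for i in range(0, len(features), batch_size)]

def shuffled_batches(batch_size, features, labels, rng):
    """Shuffle features and labels with the same permutation, then batch them."""
    order = list(range(len(features)))
    rng.shuffle(order)
    return batches(batch_size,
                   [features[i] for i in order],
                   [labels[i] for i in order])

# Toy data: 7 samples, each label derived from its feature so pairing can be checked.
features = [[i, i] for i in range(7)]
labels = [i % 2 for i in range(7)]
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
epoch_batches = shuffled_batches(3, features, labels, rng)
print([len(f) for f, _ in epoch_batches])  # [3, 3, 1]
```

Shuffling an index list rather than the two lists separately keeps every feature row paired with its own label. The last batch is smaller whenever the sample count is not a multiple of the batch size.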

(8) Implementing dropout in TensorFlow

keep_prob = tf.placeholder(tf.float32) # probability to keep units
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

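- This involves setting the hyperparameter keep_prob, the probability that any given unit is kept.

- keep_prob controls how many units are dropped. To compensate for the dropped units, tf.nn.dropout() multiplies every unit that is kept (not dropped) by 1/keep_prob.

- During training, 0.5 is a good starting value for keep_prob. At test time, set keep_prob to 1.0 so that all units are kept and the model's full capacity is used.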
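The scaling by 1/keep_prob can be checked with a small plain-Python sketch. This illustrates the idea described above (often called inverted dropout) and is not TensorFlow's implementation; the dropout function below is a hypothetical stand-in.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each value with probability keep_prob and scale survivors by 1/keep_prob."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, 0.5, rng)

print(sorted(set(dropped)))            # [0.0, 2.0]
print(sum(dropped) / len(dropped))     # close to 1.0, the original mean
print(dropout([1.0, 2.0], 1.0, rng))   # [1.0, 2.0]
```

Because the survivors are scaled up, the expected value of each activation is unchanged, which is why nothing else has to be rescaled when keep_prob is set to 1.0 at test time.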

2. CNN

You can implement this with functions from the TensorFlow Layers or TensorFlow Layers (contrib) packages, or with functions from the core TensorFlow package only.

(1) Build a convolutional layer

# Output depth
k_output = 64

# Image Properties
image_width = 10
image_height = 10
color_channels = 3

# Convolution filter
filter_size_width = 5
filter_size_height = 5

# Input/Image
input = tf.placeholder(tf.float32,shape=[None, image_height, image_width, color_channels])

# Weight and bias
weight = tf.Variable(tf.truncated_normal([filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# Apply Convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# Add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# Apply activation function
conv_layer = tf.nn.relu(conv_layer)
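It helps to work out the output size of this layer by hand. The sketch below applies the output-size rules TensorFlow documents for 'SAME' and 'VALID' padding along one spatial dimension; conv_output_size is a hypothetical helper written for this note.

```python
import math

def conv_output_size(in_size, filter_size, stride, padding):
    """Spatial output size of a convolution or pooling op along one dimension."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

# Values from the layer above: 10x10 input, 5x5 filter, stride 2.
same = conv_output_size(10, 5, 2, "SAME")
valid = conv_output_size(10, 5, 2, "VALID")
print(same, valid)  # 5 3
```

With padding='SAME' and k_output = 64, the layer above therefore maps a [None, 10, 10, 3] input to a [None, 5, 5, 64] output.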

(2) Build a (max) pooling layer (Max Pooling)

...
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
# Apply Max Pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')

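When tf.nn.max_pool() performs max pooling, the ksize argument is the size of the filter and the strides argument is the stride. A 2×2 filter combined with a 2×2 stride is a common setting.

ksize and strides are each four-element lists, with one element per dimension of the input tensor ([batch, height, width, channels]). For both ksize and strides, the batch and channel entries are usually set to 1.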
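The effect of a 2×2 filter with a 2×2 stride is easy to see on a single small feature map. The following plain-Python sketch handles one 2-D channel with even side lengths only; max_pool_2x2 is a hypothetical helper, not the TensorFlow op.

```python
def max_pool_2x2(matrix):
    """2x2 max pooling with stride 2 on a 2-D list whose sides are even."""
    return [[max(matrix[r][c], matrix[r][c + 1],
                 matrix[r + 1][c], matrix[r + 1][c + 1])
             for c in range(0, len(matrix[0]), 2)]
            for r in range(0, len(matrix), 2)]

feature_map = [[1, 0, 2, 3],
               [4, 6, 6, 8],
               [3, 1, 1, 0],
               [1, 2, 2, 4]]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```

Each output value is the maximum of one non-overlapping 2×2 window, so height and width are halved while the number of channels stays the same.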

Wrap the construction steps from (1) and (2) in a function:

def conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides):
    """
    Apply convolution then max pooling to x_tensor
    :param x_tensor: TensorFlow Tensor
    :param conv_num_outputs: Number of outputs for the convolutional layer
    :param conv_ksize: kernel size 2-D Tuple for the convolutional layer
    :param conv_strides: Stride 2-D Tuple for convolution
    :param pool_ksize: kernel size 2-D Tuple for pool
    :param pool_strides: Stride 2-D Tuple for pool
    :return: A tensor that represents convolution and max pooling of x_tensor
    """
    shape = x_tensor.get_shape().as_list()
    weight = tf.Variable(tf.truncated_normal([*conv_ksize,shape[3],conv_num_outputs], mean=0, stddev=0.01))
    bias = tf.Variable(tf.zeros(conv_num_outputs))
    conv_layer = tf.nn.conv2d(x_tensor, weight, [1,*conv_strides,1], padding='SAME')
    conv_layer = tf.nn.bias_add(conv_layer,bias)
    conv_layer = tf.nn.relu(conv_layer)
    pool_layer = tf.nn.max_pool(conv_layer, [1,*pool_ksize,1], [1,*pool_strides,1], padding='SAME')
    return pool_layer

(3) Build a flatten layer (Flatten)

def flatten(x_tensor):
    """
    Flatten x_tensor to (Batch Size, Flattened Image Size)
    : x_tensor: A tensor of size (Batch Size, ...), where ... are the image dimensions.
    : return: A tensor of size (Batch Size, Flattened Image Size).
    """
    shape = x_tensor.get_shape().as_list()
    dim = np.prod(shape[1:])
    return tf.reshape(x_tensor, [-1, dim])

(4) Build a fully connected layer (Fully-Connected)

def fully_conn(x_tensor, num_outputs):
    """
    Apply a fully connected layer to x_tensor using weight and bias
    : x_tensor: A 2-D tensor where the first dimension is batch size.
    : num_outputs: The number of output that the new tensor should be.
    : return: A 2-D tensor where the second dimension is num_outputs.
    """
    return tf.contrib.layers.fully_connected(x_tensor, num_outputs)

(5) Output layer

def output(x_tensor, num_outputs):
    """
    Apply a output layer to x_tensor using weight and bias
    : x_tensor: A 2-D tensor where the first dimension is batch size.
    : num_outputs: The number of output that the new tensor should be.
    : return: A 2-D tensor where the second dimension is num_outputs.
    """
    return tf.contrib.layers.fully_connected(x_tensor, num_outputs, activation_fn=None)

(6) Build the model

def conv_net(x, keep_prob):
    """
    Create a convolutional neural network model
    : x: Placeholder tensor that holds image data.
    : keep_prob: Placeholder tensor that hold dropout keep probability.
    : return: Tensor that represents logits
    """
    # Two convolution + max pooling layers
    cp1 = conv2d_maxpool(x, 16, (5,5), (1,1), (2,2), (2,2))
    cp2 = conv2d_maxpool(cp1, 32, (5,5), (1,1), (2,2), (2,2))

    # Flatten layer
    flat = flatten(cp2)

    # Two fully connected layers, each followed by dropout
    fc1 = fully_conn(flat, 1024)
    fc1 = tf.nn.dropout(fc1, keep_prob)
    fc2 = fully_conn(fc1, 128)
    fc2 = tf.nn.dropout(fc2, keep_prob)

    # Output layer: one unit per class
    logits = output(fc2, 10)

    return logits
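To see how large the flattened vector fed to the first fully connected layer is, the shapes can be traced through conv_net by hand. The sketch below assumes 32×32×3 input images and relies on every layer using padding='SAME', for which the spatial size is ceil(size / stride); same_out and conv2d_maxpool_shape are hypothetical helpers.

```python
import math

def same_out(size, stride):
    """Output size along one dimension for padding='SAME'."""
    return math.ceil(size / stride)

def conv2d_maxpool_shape(shape, num_outputs, conv_stride, pool_stride):
    """Shape (height, width, channels) after one conv2d_maxpool block."""
    h, w, _ = shape
    h, w = same_out(h, conv_stride), same_out(w, conv_stride)  # convolution
    h, w = same_out(h, pool_stride), same_out(w, pool_stride)  # max pooling
    return (h, w, num_outputs)

shape = (32, 32, 3)  # assumed input: 32x32 RGB images
cp1 = conv2d_maxpool_shape(shape, 16, 1, 2)
cp2 = conv2d_maxpool_shape(cp1, 32, 1, 2)
flat = cp2[0] * cp2[1] * cp2[2]
print(cp1, cp2, flat)  # (16, 16, 16) (8, 8, 32) 2048
```

Under these assumptions the first fully connected layer maps 2048 inputs to 1024 outputs, which is where most of the model's weights sit.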

(7) Build the model with the functions above

# Remove previous weights, bias, inputs, etc..
tf.reset_default_graph()

# Inputs
# neural_net_image_input, neural_net_label_input and neural_net_keep_prob_input are
# helpers from the course project (not shown in these notes); presumably they create
# the placeholders that are later looked up by the names 'x', 'y' and 'keep_prob'.
x = neural_net_image_input((32, 32, 3))
y = neural_net_label_input(10)
keep_prob = neural_net_keep_prob_input()

# Model
logits = conv_net(x, keep_prob)

# Name logits Tensor, so that it can be loaded from disk after training
logits = tf.identity(logits, name='logits')

# Loss and Optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.AdamOptimizer().minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32), name='accuracy')

(8) Training

def train_neural_network(session, optimizer, keep_probability, feature_batch, label_batch):
    """  Optimize the session on a batch of images and labels  : session: Current TensorFlow session  : optimizer: TensorFlow optimizer function  : keep_probability: keep probability  : feature_batch: Batch of Numpy image data  : label_batch: Batch of Numpy label data  """
    session.run(optimizer, feed_dict={x:feature_batch, y:label_batch, keep_prob:keep_probability})

(9) Showing the status

Build a print_stats function that prints the loss and the validation accuracy. It uses the global variables valid_features and valid_labels to compute the validation accuracy, and sets the keep probability to 1.0 when computing both the loss and the validation accuracy.

def print_stats(session, feature_batch, label_batch, cost, accuracy):
    """
    Print information about loss and validation accuracy
    : session: Current TensorFlow session
    : feature_batch: Batch of Numpy image data
    : label_batch: Batch of Numpy label data
    : cost: TensorFlow cost function
    : accuracy: TensorFlow accuracy function
    """
    loss = session.run(cost, feed_dict={x:feature_batch, y:label_batch, keep_prob:1.0})
    v_acc = session.run(accuracy, feed_dict={x:valid_features, y:valid_labels, keep_prob:1.0})
    print('Loss={:.5f}'.format(loss), 'Accuracy={:.5f}'.format(v_acc))

(10) Set the hyperparameters

epochs = 10
batch_size = 256
keep_probability = 0.8

(11) Train on a single batch first

print('Checking the Training on a Single Batch...')
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    
    # Training cycle
    for epoch in range(epochs):
        batch_i = 1
        for batch_features, batch_labels in helper.load_preprocess_training_batch(batch_i, batch_size):
            train_neural_network(sess, optimizer, keep_probability, batch_features, batch_labels)
        print('Epoch {:>2}, CIFAR-10 Batch {}: '.format(epoch + 1, batch_i), end='')
        print_stats(sess, batch_features, batch_labels, cost, accuracy)

(12) Train on the full dataset

save_model_path = './image_classification'

print('Training...')
with tf.Session() as sess:
    # Initializing the variables
    sess.run(tf.global_variables_initializer())
    
    # Training cycle
    for epoch in range(epochs):
        # Loop over all batches
        n_batches = 5
        for batch_i in range(1, n_batches + 1):
            for batch_features, batch_labels in helper.load_preprocess_training_batch(batch_i, batch_size):
                train_neural_network(sess, optimizer, keep_probability, batch_features, batch_labels)
            print('Epoch {:>2}, CIFAR-10 Batch {}: '.format(epoch + 1, batch_i), end='')
            print_stats(sess, batch_features, batch_labels, cost, accuracy)
            
    # Save Model
    saver = tf.train.Saver()
    save_path = saver.save(sess, save_model_path)

(13) Evaluate on the test set

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import tensorflow as tf
import pickle
import helper
import random

# Set batch size if not already set
try:
    if batch_size:
        pass
except NameError:
    batch_size = 64

save_model_path = './image_classification'
n_samples = 4
top_n_predictions = 3

def test_model():
    """  Test the saved model against the test dataset  """

    test_features, test_labels = pickle.load(open('preprocess_training.p', mode='rb'))
    loaded_graph = tf.Graph()

    with tf.Session(graph=loaded_graph) as sess:
        # Load model
        loader = tf.train.import_meta_graph(save_model_path + '.meta')
        loader.restore(sess, save_model_path)

        # Get Tensors from loaded model
        loaded_x = loaded_graph.get_tensor_by_name('x:0')
        loaded_y = loaded_graph.get_tensor_by_name('y:0')
        loaded_keep_prob = loaded_graph.get_tensor_by_name('keep_prob:0')
        loaded_logits = loaded_graph.get_tensor_by_name('logits:0')
        loaded_acc = loaded_graph.get_tensor_by_name('accuracy:0')
        
        # Get accuracy in batches for memory limitations
        test_batch_acc_total = 0
        test_batch_count = 0
        
        for train_feature_batch, train_label_batch in helper.batch_features_labels(test_features, test_labels, batch_size):
            test_batch_acc_total += sess.run(
                loaded_acc,
                feed_dict={loaded_x: train_feature_batch, loaded_y: train_label_batch, loaded_keep_prob: 1.0})
            test_batch_count += 1

        print('Testing Accuracy: {}\n'.format(test_batch_acc_total/test_batch_count))

        # Print Random Samples
        random_test_features, random_test_labels = tuple(zip(*random.sample(list(zip(test_features, test_labels)), n_samples)))
        random_test_predictions = sess.run(
            tf.nn.top_k(tf.nn.softmax(loaded_logits), top_n_predictions),
            feed_dict={loaded_x: random_test_features, loaded_y: random_test_labels, loaded_keep_prob: 1.0})
        helper.display_image_predictions(random_test_features, random_test_labels, random_test_predictions)


test_model()
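One detail of the evaluation loop above: it averages the per-batch accuracies with equal weight. When the last batch is smaller than the others, that differs slightly from the accuracy over all test examples. The sketch below compares the two averages on made-up numbers; both helper functions are hypothetical and are not part of the course's helper module.

```python
def mean_of_batch_accuracies(batch_accs):
    """Unweighted mean, as in test_batch_acc_total / test_batch_count."""
    return sum(batch_accs) / len(batch_accs)

def weighted_accuracy(batch_accs, batch_sizes):
    """Accuracy over all examples: weight each batch by its size."""
    correct = sum(a * n for a, n in zip(batch_accs, batch_sizes))
    return correct / sum(batch_sizes)

# Illustrative values only: two full batches and a final batch of 2 examples.
accs = [0.75, 0.75, 0.0]
sizes = [64, 64, 2]
print(mean_of_batch_accuracies(accs))  # 0.5
print(weighted_accuracy(accs, sizes))  # about 0.738
```

The example is deliberately extreme. With many batches of equal size the two values are nearly identical, so the simpler average is usually acceptable.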

References:

Udacity: Machine Learning course

Andrew Ng's Deep Learning course

    Original author: 唐十六
    Original article: https://zhuanlan.zhihu.com/p/34542056