Building Your First Neural Network in TensorFlow (Using a Hand-Sign Image Recognition Dataset as an Example)
1. Dataset Overview and Reshaping Functions
1.1 Loading the Data
# Loading the dataset
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()
Implementation of the load_dataset function:
import h5py
import numpy as np

def load_dataset():
    train_dataset = h5py.File('datasets/train_signs.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels
    test_dataset = h5py.File('datasets/test_signs.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels
    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    # Reshape the labels into row vectors of shape (1, m)
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes
Since the dataset has already been packaged as HDF5 files, it can be loaded directly with load_dataset().
1.2 Previewing the Data
import matplotlib.pyplot as plt
# Example of a picture
index = 0
plt.imshow(X_train_orig[index])
# np.squeeze removes single-dimensional entries from an array's shape, i.e. drops every axis whose size is 1
print ("y = " + str(np.squeeze(Y_train_orig[:, index])))
1.3 Preprocessing the Data
Raw data usually needs some processing before a neural network can train on it; reshaping the inputs and normalizing them are necessary steps.
Example:
# Flatten the training and test images
X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T
X_test_flatten = X_test_orig.reshape(X_test_orig.shape[0], -1).T
# Normalize image vectors
X_train = X_train_flatten/255.
X_test = X_test_flatten/255.
# Convert training and test labels to one hot matrices
Y_train = convert_to_one_hot(Y_train_orig, 6)
Y_test = convert_to_one_hot(Y_test_orig, 6)
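convert_to_one_hot is a helper supplied with the assignment. A minimal sketch of what it does, assuming Y is a (1, m) array of integer class ids and C is the number of classes:
import numpy as np

def convert_to_one_hot(Y, C):
    # np.eye(C) is the C x C identity matrix; indexing its rows by the labels
    # and transposing yields a (C, m) one-hot matrix, one column per example.
    return np.eye(C)[Y.reshape(-1)].T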
1.4 Reading Raw Images
This is mainly used after training, to run the model on your own images.
import scipy
from PIL import Image
from scipy import ndimage
my_image = "thumbs_up.jpg"  # put your own image name here
# We preprocess your image to fit your algorithm.
fname = "images/" + my_image
image = np.array(ndimage.imread(fname, flatten=False))
my_image = scipy.misc.imresize(image, size=(64,64)).reshape((1, 64*64*3)).T
my_image_prediction = predict(my_image, parameters)
plt.imshow(image)
print("Your algorithm predicts: y = " + str(np.squeeze(my_image_prediction)))
As the example above shows, ndimage.imread(fname, flatten=False) relies on imread, which was deprecated in SciPy 1.0.0 and removed in 1.2.0. The official documentation recommends imageio.imread instead, so here are three ways to read an image in Python:
# Using the OpenCV library:
import cv2
cv_im = cv2.imread("img/path.jpg")
# Using the Image module from the PIL library:
from PIL import Image
pil_im = Image.open("img/path.jpg")
# Using the imageio module:
import imageio
io_im = imageio.imread("img/path.jpg")
In other words, whichever method you use to read the image, the result can be converted with np.array() and then processed with the relevant cv2 functions.
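For reference, here is a sketch of the prediction preprocessing from section 1.4 rewritten with non-deprecated APIs (imageio for reading, PIL for resizing); predict and parameters are assumed to come from the trained model as before:
import numpy as np
import imageio
from PIL import Image

fname = "images/thumbs_up.jpg"
image = imageio.imread(fname)                         # H x W x 3 uint8 array
resized = np.array(Image.fromarray(image).resize((64, 64)))
# Flatten into a (12288, 1) column and scale to [0, 1] like the training set
my_image = resized.reshape((1, 64 * 64 * 3)).T / 255.
my_image_prediction = predict(my_image, parameters)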
2. Model Initialization
2.1 Initializing X and Y (Placeholders)
tf.placeholder(dtype, shape=None, name=None)
dtype: the type of elements in the tensor to be fed. shape: the shape of the tensor to be fed. name: a name for the operation (optional).
Example:
x = tf.placeholder(tf.float32, shape=(1024, 1024))
To supply a value for a placeholder, pass it in via the feed_dict argument when running the session.
# Change the value of x in the feed_dict
x = tf.placeholder(tf.int64, name='x')
sess = tf.Session()
print(sess.run(2 * x, feed_dict={x: 3}))
sess.close()
# 6
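For the sign-recognition model itself, placeholder creation is usually wrapped in a small helper. A sketch in the style of the assignment (n_x = 12288 input features, n_y = 6 classes; the None dimension leaves the number of examples flexible, so any minibatch size works):
def create_placeholders(n_x, n_y):
    # Shape (n_x, None): one column per example, batch size left open
    X = tf.placeholder(tf.float32, shape=(n_x, None), name="X")
    Y = tf.placeholder(tf.float32, shape=(n_y, None), name="Y")
    return X, Y

X, Y = create_placeholders(12288, 6)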
2.2 Initializing W and b
tf.Variable(initial_value, name=None)
initial_value is the initializer, which can be tf.random_normal, tf.constant, tf.zeros, and so on; name is the variable's name.
Example:
W = tf.Variable(tf.random_normal([in_size, out_size]))
b = tf.Variable(tf.zeros([1, out_size]) + 0.01)
A function that initializes W and b for several layers at once:
def initialize_parameters():
    tf.set_random_seed(1)
    # Layer sizes: 12288 -> 25 -> 12 -> 6
    W1 = tf.get_variable("W1", [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b1 = tf.get_variable("b1", [25,1], initializer = tf.zeros_initializer())
    W2 = tf.get_variable("W2", [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b2 = tf.get_variable("b2", [12,1], initializer = tf.zeros_initializer())
    W3 = tf.get_variable("W3", [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b3 = tf.get_variable("b3", [6,1], initializer = tf.zeros_initializer())
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2,
                  "W3": W3,
                  "b3": b3}
    return parameters
2.3 Forward Propagation
Commonly used functions:
tf.matmul(a, b): matrix multiplication
tf.add(a, b): matrix addition
np.random.randn(...): random initialization, used to initialize W and b
tf.nn.relu(z1): the ReLU activation function
Example:
Z1 = tf.add(tf.matmul(W1, X), b1) # Z1 = np.dot(W1, X) + b1
A1 = tf.nn.relu(Z1) # A1 = relu(Z1)
Z2 = tf.add(tf.matmul(W2, A1), b2) # Z2 = np.dot(W2, A1) + b2
A2 = tf.nn.relu(Z2) # A2 = relu(Z2)
Z3 = tf.add(tf.matmul(W3, A2), b3) # Z3 = np.dot(W3, A2) + b3
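Wrapped as a function in the assignment's style (a sketch; note that Z3 is returned without a softmax, because the cost function in the next section applies the softmax internally):
def forward_propagation(X, parameters):
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]
    W3, b3 = parameters["W3"], parameters["b3"]
    Z1 = tf.add(tf.matmul(W1, X), b1)   # LINEAR
    A1 = tf.nn.relu(Z1)                 # RELU
    Z2 = tf.add(tf.matmul(W2, A1), b2)  # LINEAR
    A2 = tf.nn.relu(Z2)                 # RELU
    Z3 = tf.add(tf.matmul(W3, A2), b3)  # LINEAR (no activation here)
    return Z3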
2.4 The Cost Function
Example:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))
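In this model the logits are Z3 from forward propagation. One caveat: tf.nn.softmax_cross_entropy_with_logits expects logits and labels of shape (number of examples, number of classes), while Z3 and Y here store examples as columns, so both are transposed first:
logits = tf.transpose(Z3)  # shape (number of examples, 6)
labels = tf.transpose(Y)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))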
2.4.1 Computing the softmax cross-entropy between logits and labels: tf.nn.softmax_cross_entropy_with_logits_v2(_sentinel=None, labels=None, logits=None, dim=-1, name=None)
Example:
entropy = tf.nn.softmax_cross_entropy_with_logits(labels=Y, logits=prediction)
2.4.2 Computing the mean of a tensor's elements across dimensions: tf.reduce_mean(input_tensor, axis=None, keepdims=None, name=None, reduction_indices=None, keep_dims=None)
Example:
x = tf.constant([[1., 1.], [2., 2.]])
tf.reduce_mean(x) # 1.5
tf.reduce_mean(x, 0) # [1.5, 1.5]
tf.reduce_mean(x, 1) # [1., 2.]
# Commonly used to average the per-example losses into a single cost
loss = tf.reduce_mean(entropy)
2.5 Backpropagation and Parameter Updates
2.5.1 The optimizer that implements the gradient descent algorithm: tf.train.GradientDescentOptimizer
Example:
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
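Only this one line changes if you swap optimizers; for example, this kind of model is often trained with Adam instead of plain gradient descent:
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)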
2.5.2 Running the optimizer
_ , c = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
Note: when coding, we often use _ as a "throwaway" variable for values we won't need later. Here, _ receives the evaluated optimizer (which we don't need), while c receives the value of the cost.
3. Building the Model
3.1 An Op that initializes the global variables: init = tf.global_variables_initializer(). Running it is a required step before operating on the graph (the older tf.initialize_all_variables() is deprecated).
# Initialize all the variables
init = tf.global_variables_initializer()
# Start the session to compute the tensorflow graph
with tf.Session() as sess:
    # Run the initialization
    sess.run(init)
3.2 The class used to run TensorFlow operations: tf.Session
Example:
# Build a graph.
a = tf.constant(5.0)
b = tf.constant(6.0)
c = a * b
# Launch the graph in a session.
# Evaluate the tensor `c`.
with tf.Session() as sess:
    print(sess.run(c))  # 30.0
sess.run(fetches, feed_dict=None, options=None, run_metadata=None)
feed_dict supplies the input values; fetches is the tensor (or list of tensors and ops) whose updated values you want back once those inputs are fed in.
preds = sess.run(prediction, feed_dict={X: X_batch, Y: Y_batch})
Training loop example:
# Do the training loop
for epoch in range(num_epochs):
    epoch_cost = 0.  # Defines a cost related to an epoch
    num_minibatches = int(m / minibatch_size)  # number of minibatches of size minibatch_size in the train set
    seed = seed + 1
    minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)
    for minibatch in minibatches:
        # Select a minibatch
        (minibatch_X, minibatch_Y) = minibatch
        # IMPORTANT: The line that runs the graph on a minibatch.
        # Run the session to execute the "optimizer" and the "cost"; the feed_dict should contain a minibatch for (X, Y).
        _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, Y: minibatch_Y})
        epoch_cost += minibatch_cost / num_minibatches
    # Print the cost every 100 epochs, and record it every 5 epochs
    if print_cost == True and epoch % 100 == 0:
        print("Cost after epoch %i: %f" % (epoch, epoch_cost))
    if print_cost == True and epoch % 5 == 0:
        costs.append(epoch_cost)
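random_mini_batches is another helper provided with the assignment. A minimal sketch of what it does (shuffle the example columns with a fixed seed, then slice them into batches of minibatch_size; the last batch may be smaller):
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    np.random.seed(seed)
    m = X.shape[1]  # number of examples (stored as columns)
    permutation = list(np.random.permutation(m))
    shuffled_X, shuffled_Y = X[:, permutation], Y[:, permutation]
    mini_batches = []
    # Step through the shuffled columns in chunks
    for k in range(0, m, mini_batch_size):
        mini_batch_X = shuffled_X[:, k:k + mini_batch_size]
        mini_batch_Y = shuffled_Y[:, k:k + mini_batch_size]
        mini_batches.append((mini_batch_X, mini_batch_Y))
    return mini_batches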
4. Model Evaluation
4.1 Plotting the Cost Curve with matplotlib
import matplotlib.pyplot as plt
# plot the cost
plt.plot(np.squeeze(costs))
plt.ylabel('cost')
plt.xlabel('iterations (per tens)')
plt.title("Learning rate =" + str(learning_rate))
plt.show()
4.2 Model Accuracy
# Calculate the correct predictions
correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))
# Calculate accuracy on the test set
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))
5. Other Commonly Used Functions
1. Creating a constant tensor
tf.constant(value, dtype=None, shape=None, name='Const', verify_shape=False)
It is often used directly for testing.
Example:
tensor = tf.constant(-1.0, shape=[2, 3])
2. Computing the sum of a tensor's elements across dimensions
tf.reduce_sum(input_tensor, axis=None, keepdims=None, name=None, reduction_indices=None) (the docs discourage the old keep_dims argument, which has been removed in favor of keepdims)
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.reduce_sum(x) # 6
tf.reduce_sum(x, 0) # [2, 2, 2]
tf.reduce_sum(x, 1) # [3, 3]
tf.reduce_sum(x, 1, keepdims=True) # [[3], [3]]
tf.reduce_sum(x, [0, 1]) # 6
3. Casting a tensor to a new type
tf.cast(x, dtype, name=None)
x = tf.constant([1.8, 2.2], dtype=tf.float32)
tf.cast(x, tf.int32) # [1, 2], dtype=tf.int32
4. Converting a given value to a Tensor
tf.convert_to_tensor(value, dtype=None, name=None, preferred_dtype=None)
This function is useful when writing new ops in Python. All of the standard Python op constructors apply it to each of their Tensor-valued inputs, which lets those ops accept numpy arrays, Python lists, and scalars in addition to Tensor objects.
import numpy as np

def my_func(arg):
    arg = tf.convert_to_tensor(arg, dtype=tf.float32)
    return tf.matmul(arg, arg) + arg
# The following calls are equivalent.
value_1 = my_func(tf.constant([[1.0, 2.0], [3.0, 4.0]]))
value_2 = my_func([[1.0, 2.0], [3.0, 4.0]])
value_3 = my_func(np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32))