二、PyTorch 入门实战—Variable(转)

目录

一、概念

二、Variable的创建和使用

三、标量求导计算图

四、矩阵求导计算图

五、Variable放到GPU上执行

六、Variable转Numpy与Numpy转Variable

七、Variable总结

 

一、概念

1.Numpy里没有Variable这个概念,如果大家学过TensorFlow就会知道,Variable提供了自动求导的功能。

2.Variable需要放进一个计算图中,然后进行前后向传播和自动求导。

3.Variable的属性有三个:

  • data:Variable里Tensor变量的数值
  • grad:Variable反向传播的梯度
  • grad_fn:得到Variable的操作

 

二、Variable的创建和使用

1.我们首先创建一个空的Variable:

import torch #创建Variable
a = torch.autograd.Variable() print(a)

结果如下:

 《二、PyTorch 入门实战—Variable(转)》                                                                             

可以看到默认的类型为Tensor

2.那么,我们如果需要给Variable变量赋值,那么就一定是Tensor类型,例如:

b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b)

结果为:

  《二、PyTorch 入门实战—Variable(转)》                                                                                 

3.第一章提到了Variable的三个属性,我们依次打印它们:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn)

结果为:

  《二、PyTorch 入门实战—Variable(转)》                                                               

可以看到data就是Tensor的内容,剩下的两个属性为空

 

三、标量求导计算图

1.为了方便起见,我们可以将torch.autograd.Variable简写为Variable:

from torch.autograd import Variable

2.之后,我们先声明一个变量x,这里requires_grad=True意义是否对这个变量求梯度,默认的 Fa!se:

x = Variable(torch.Tensor([2]),requires_grad = True) print(x)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x)

结果为:

  《二、PyTorch 入门实战—Variable(转)》                                                                                                                                      

3.我们再声明两个变量w和b:

w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b)

4.我们再写两个变量y1和y2:

y1 = w * x + b print(y1) y2 = w * x + b * x print(y2)

5.我们来计算各个变量的梯度,首先是y1:

#计算梯度
y1.backward() print(x.grad) print(w.grad) print(b.grad)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度
y1.backward() print(x.grad) print(w.grad) print(b.grad)

结果为:

   《二、PyTorch 入门实战—Variable(转)》                                                                                  

其中:

y1 = 3 * 2 + 4 = 10,

y2 = 3 * 2 + 4 * 2 = 14,

x的梯度是3因为是3 * x,

w的梯度是2因为w * 2,

b的梯度是1因为b * 1(* 1被省略)

6.其次是y2,注销y1部分:

y2.backward(x) print(x.grad) print(w.grad) print(b.grad) 代码为: import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
y2.backward() print(x.grad) print(w.grad) print(b.grad)

结果为:

   《二、PyTorch 入门实战—Variable(转)》                                                              

其中:

x的梯度是7因为是3 * x + 4 * x,

w的梯度是2因为w * 2,

b的梯度是2因为b * 2

7.backward的函数可以填入参数,例如我们填入变量a:

a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad)

结果为:

《二、PyTorch 入门实战—Variable(转)》                                                                 

可以看到x,w,b的梯度乘以了a的值5,说明这个填入参数是梯度的系数。

 

四、矩阵求导计算图

1.例如:

#矩阵求导
c = torch.randn(3) print(c) c = Variable(c,requires_grad = True) print(c) y3 = c * 2
print(y3) y3.backward(torch.FloatTensor([1, 0.1, 0.01])) print(c.grad)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad) #矩阵求导
c = torch.randn(3) print(c) c = Variable(c,requires_grad = True) print(c) y3 = c * 2
print(y3) y3.backward(torch.FloatTensor([1, 0.1, 0.01])) print(c.grad)

结果为:

 《二、PyTorch 入门实战—Variable(转)》                                    

可以看到,c是一个1行3列的矩阵,因为y3 = c * 2,因此如果backward()里的参数为:

torch.FloatTensor([1, 1, 1])

则就是每个分量的梯度,但是传入的是:

torch.FloatTensor([1, 0.1, 0.01])

则每个分量梯度要分别乘以1,0.1和0.01

 

五、Variable放到GPU上执行

1.和Tensor一样的道理,代码如下:

#Variable放在GPU上
if torch.cuda.is_available(): d = c.cuda() print(d)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad) #矩阵求导
c = torch.randn(3) print(c) c = Variable(c,requires_grad = True) print(c) y3 = c * 2
print(y3) y3.backward(torch.FloatTensor([1, 0.1, 0.01])) print(c.grad) #Variable放在GPU上
if torch.cuda.is_available(): d = c.cuda() print(d)

2.生成结果会慢一下,然后可以看到多了一个device=‘cuda:0’和grad_fn=<CopyBackwards>

   《二、PyTorch 入门实战—Variable(转)》                            

 

六、Variable转Numpy与Numpy转Variable

1.值得注意的是,Variable里requires_grad 一般设置为 False,代码中为True则:

#变量转Numpy
e = Variable(torch.Tensor([4]),requires_grad = True) f = e.numpy() print(f)

会报如下错误:

Can't call numpy() on Variable that requires grad. Use var.detach().numpy() instead.

2.解决方法1:requires_grad改为False后,可以看到最后一行的Numpy类型的矩阵[4.]:

  《二、PyTorch 入门实战—Variable(转)》                           

3.解决方法2::将numpy()改为detach().numpy(),可以看到最后一行的Numpy类型的矩阵[4.]

#变量转Numpy
e = Variable(torch.Tensor([4]),requires_grad = True) f = e.detach().numpy() print(f)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad) #矩阵求导
c = torch.randn(3) print(c) c = Variable(c,requires_grad = True) print(c) y3 = c * 2
print(y3) y3.backward(torch.FloatTensor([1, 0.1, 0.01])) print(c.grad) #Variable放在GPU上
if torch.cuda.is_available(): d = c.cuda() print(d) #变量转Numpy
e = Variable(torch.Tensor([4]),requires_grad = True) f = e.detach().numpy() print(f)

结果为: 

《二、PyTorch 入门实战—Variable(转)》                              

4.Numpy转Variable先是转为Tensor再转为Variable:

#转换为Tensor
g = torch.from_numpy(f) print(g) #转换为Variable
g = Variable(g,requires_grad = True) print(g)

代码变为:

import torch #创建Variable
a = torch.autograd.Variable() print(a) b = torch.autograd.Variable(torch.Tensor([[1, 2], [3, 4],[5, 6], [7, 8]])) print(b) print(b.data) print(b.grad) print(b.grad_fn) #建立计算图
from torch.autograd import Variable x = Variable(torch.Tensor([2]),requires_grad = True) print(x) w = Variable(torch.Tensor([3]),requires_grad = True) print(w) b = Variable(torch.Tensor([4]),requires_grad = True) print(b) y1 = w * x + b print(y1) y2 = w * x + b * x print(y2) #计算梯度 #y1.backward() #print(x.grad) #print(w.grad) #print(b.grad)
a = Variable(torch.Tensor([5]),requires_grad = True) y2.backward(a) print(x.grad) print(w.grad) print(b.grad) #矩阵求导
c = torch.randn(3) print(c) c = Variable(c,requires_grad = True) print(c) y3 = c * 2
print(y3) y3.backward(torch.FloatTensor([1, 0.1, 0.01])) print(c.grad) #Variable放在GPU上
if torch.cuda.is_available(): d = c.cuda() print(d) #变量转Numpy
e = Variable(torch.Tensor([4]),requires_grad = True) f = e.detach().numpy() print(f) #转换为Tensor
g = torch.from_numpy(f) print(g) #转换为Variable
g = Variable(g,requires_grad = True) print(g)

结果为: 

《二、PyTorch 入门实战—Variable(转)》                              

 

七、Variable总结

1.Variable和Tensor本质上没有区别,不过Variable会被放入一个计算图中,然后进行前向传播,反向传播,自动求导。

2.Variable有三个属性,可以通过构造函数结构求取梯度得到grad值和grad_fn值

3.Variable,Tensor和Numpy互相转化很方便,类型也比较兼容

    原文作者:pytorch
    原文地址: https://www.cnblogs.com/chen8023miss/p/11201348.html
    本文转自网络文章,转载此文章仅为分享知识,如有侵权,请联系博主进行删除。
点赞