Learning image classification with pytorch by example

Background
I went from getting started with Tensorflow, to being hooked on keras, to finally leaving that comfort zone for pytorch. The trigger was the Tianchi Xuelang AI manufacturing competition: with essentially the same network architecture and hyperparameters, and similar data preprocessing, the gap between the resulting scores was too large for me to accept, so I switched to pytorch. I now only use keras for some NLP projects (after all, I have accumulated a few "heirloom models").

Note: this project uses a traffic-sign dataset as its example; you can download it here: traffic-sign. Full code: pytorch-image-classification

Update: second update on 2018-10-22, version 0.1.1

Changes:

  1. Data augmentation was switched from pytorch's built-in transforms to custom code, which makes later changes for multi-channel models easier and lets us use OpenCV's powerful library for preprocessing (pytorch's built-in data loading uses PIL).

  2. Progress is now printed through a logger and updated dynamically.

  3. The best model is now evaluated and saved once every half epoch.

pytorch 0.4.0

<h5 id="1">0. Project structure for image classification</h5>

After working through courses on machine learning, deep learning, convolutional neural networks, and structuring machine learning projects, actually completing a real project is often the bottleneck; only when you can apply what you learned flexibly can you say you have truly learned it. I took all of those courses in the first semester of my first year of grad school, but it was only after I started an internship in the second semester that I gradually became able to finish projects on my own and even enter some data competitions.

In my use of pytorch I split a project into seven parts: data loading, model definition, metric definition, training loop, validation loop, test loop, and parameter configuration.

The files are organized as follows:

==============================================================

  • checkpoints/
    • bestmodels/
  • dataset/
    • aug.py
    • dataloader.py
  • logs/
  • models/
    • pretrained_models/
    • model.py
  • submit/
  • config.py
  • main.py
  • utils.py

==============================================================

  • checkpoints/ : stores the models saved during training (bestmodels/ keeps the model that performs best on the validation set);
  • models/ : stores custom model definitions; if you do not want to use pytorch's built-in architectures, add yours here (remember to add an __init__.py file);
  • submit/ : holds the prediction files, i.e. the result files a competition asks you to submit, usually in csv format;
  • logs/ : stores the training logs (.txt files);
  • dataset/ : contains aug.py and dataloader.py, which implement data augmentation and data loading;
  • config.py : parameter configuration; everything that needs to be set or changed in advance is defined as attributes of a parameter class, e.g. data paths, learning rate, number of training epochs;
  • model.py : model construction; optional, but I like to keep it in its own file to make fine-tuning easier;
  • utils.py : common evaluation metrics such as mAP, accuracy, and loss;
  • main.py : the main file, containing the training, validation, and test loops.

<h5 id="2">1. Parameter configuration: config.py</h5>

There are many ways to define parameters: some people set them directly in the main file, some use the argparse module, others use a json file. None of these feel concise enough to me, so I prefer to create a separate config.py containing a Python class and define the parameters as class attributes:

class DefaultConfigs(object):
    #1.string parameters
    train_data = "../data/train/"
    test_data = ""
    val_data = "../data/val/"
    model_name = "resnet50"
    weights = "./checkpoints/"
    best_models = weights + "best_model/"
    submit = "./submit/"
    logs = "./logs/"
    gpus = "1"

    #2.numeric parameters
    epochs = 40
    batch_size = 4
    img_height = 224
    img_weight = 224
    num_classes = 62
    seed = 888
    lr = 1e-3
    lr_decay = 1e-4
    weight_decay = 1e-4

config = DefaultConfigs()
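
Every other module then simply imports the shared instance; for example:

from config import config

print(config.model_name)    # "resnet50"
print(config.lr)            # 0.001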

<h5 id="3">2. Data loading: dataloader.py</h5>

pytorch offers two ways to read data. The first is to put the images of each class in its own folder, as in the traffic-sign dataset:

  • train/
    • 00000/
      • 01153_00000.png
      • 01153_00001.png
    • 00001/
      • 00025_00000.png
      • 00025_00001.png

train_data = torchvision.datasets.ImageFolder(
    "/data2/dockspace_zcj/traffic-sign/train/", # root folder of the images
    transform = None # data augmentation / preprocessing to apply
)
data_loader = torch.utils.data.DataLoader(train_data, 
            batch_size=20,
            shuffle=True
)
""" 
During training you only need to iterate over data_loader;
see main.py for how it is used.
"""

Since the augmentations in this project are written with OpenCV, they live in a separate aug.py; the file is too long to show here, see GitHub for details. All the common augmentation operations are implemented there, and you can add new ones by following the existing examples.
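
The classes used in the Dataset below (Compose, Resize, Normalize, RandomHflip, ...) all come from aug.py. The sketch below is only my illustration of the interface they share, not the actual implementation: each transform is a callable that takes and returns a numpy (OpenCV-style, H x W x C) image.

import random
import cv2
import numpy as np

class Compose(object):
    # apply a list of transforms in order
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img

class Resize(object):
    def __init__(self, size):          # size = (width, height), as cv2.resize expects
        self.size = size
    def __call__(self, img):
        return cv2.resize(img, self.size)

class RandomHflip(object):
    def __call__(self, img):
        return cv2.flip(img, 1) if random.random() < 0.5 else img

class Normalize(object):
    def __init__(self, mean, std):
        self.mean, self.std = np.array(mean), np.array(std)
    def __call__(self, img):
        img = img.astype(np.float32) / 255.0
        img = (img - self.mean) / self.std
        return img.transpose(2, 0, 1)  # assumed to end in CHW so torch.from_numpy gives the layout the network expects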

The second way of reading data, the one used here, is to subclass torch.utils.data.Dataset and apply those augmentations inside __getitem__(self, index):

from torch.utils.data import Dataset
from torchvision import transforms as T 
from config import config
from PIL import Image 
from dataset.aug import *
from itertools import chain 
from glob import glob
from tqdm import tqdm
import random 
import numpy as np 
import pandas as pd 
import os 
import cv2
import torch 

#1.set random seed
random.seed(config.seed)
np.random.seed(config.seed)
torch.manual_seed(config.seed)
torch.cuda.manual_seed_all(config.seed)

#2.define dataset
class ChaojieDataset(Dataset):
    def __init__(self,label_list,transforms=None,train=True,test=False):
        self.test = test 
        self.train = train 
        imgs = []
        if self.test:
            for index,row in label_list.iterrows():
                imgs.append((row["filename"]))
            self.imgs = imgs 
        else:
            for index,row in label_list.iterrows():
                imgs.append((row["filename"],row["label"]))
            self.imgs = imgs
        if transforms is None:
            if self.test or not train:
                self.transforms = Compose([
                    Resize((config.img_weight,config.img_height)),
                    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
                ])
            else:
                self.transforms = Compose([
                    Resize((config.img_weight,config.img_height)),
                    FixRandomRotate(bound='Random'),
                    RandomHflip(),
                    RandomVflip(),
                    Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
                ])
        else:
            self.transforms = transforms
    def __getitem__(self,index):
        if self.test:
            filename = self.imgs[index]
            img = cv2.imread(filename)
            img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
            img = self.transforms(img)
            return torch.from_numpy(img).float(),filename
        else:
            filename,label = self.imgs[index] 
            img = cv2.imread(filename)
            img = cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
            img = self.transforms(img)
            return torch.from_numpy(img).float(),label
    def __len__(self):
        return len(self.imgs)

def collate_fn(batch):
    imgs = []
    label = []
    for sample in batch:
        imgs.append(sample[0])
        label.append(sample[1])

    return torch.stack(imgs, 0), \
           label

def get_files(root,mode):
    #for test
    if mode == "test":
        files = []
        for img in os.listdir(root):
            files.append(root + img)
        files = pd.DataFrame({"filename":files})
        return files
    elif mode != "test": 
        #for train and val       
        all_data_path,labels = [],[]
        image_folders = list(map(lambda x:root+x,os.listdir(root)))
        all_images = list(chain.from_iterable(list(map(lambda x:glob(x+"/*.png"),image_folders))))
        print("loading train dataset")
        for file in tqdm(all_images):
            all_data_path.append(file)
            labels.append(int(file.split("/")[-2]))
        all_files = pd.DataFrame({"filename":all_data_path,"label":labels})
        return all_files
    else:
        print("check the mode please!")

Note: get_files(root, mode) returns a pandas DataFrame so that, on datasets that do not come with a validation split, the training set can be split randomly in a stratified (class-balanced) way.
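
For example, when only a training folder is available, a stratified split keeps the per-class proportions intact (main.py contains an equivalent commented-out line):

from sklearn.model_selection import train_test_split

origin_files = get_files(config.train_data, "train")
train_data_list, val_data_list = train_test_split(
    origin_files,
    test_size=0.1,                     # hold out 10% for validation
    stratify=origin_files["label"],    # keep the class distribution balanced
    random_state=config.seed,
)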

<h5 id="4">3. Model definition: model.py</h5>

This file exists because, first, putting all of this code into the main file makes it bloated, and second, a separate file makes it easier to modify the model for fine-tuning. Taking resnet101 as an example:

import torchvision
import torch.nn.functional as F 
from torch import nn
from config import config

def get_net():
    #return MyModel(torchvision.models.resnet101(pretrained = True))
    model = torchvision.models.resnet101(pretrained = True)
    model.avgpool = nn.AdaptiveAvgPool2d(1)
    model.fc = nn.Linear(2048,config.num_classes)
    return model
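
A common fine-tuning variant, when the target dataset is small, is to freeze the pretrained backbone and train only the newly added classifier head. This is not part of the original code, just a possible adaptation:

def get_net_frozen():
    # variant of get_net(): freeze all pretrained weights, train only the new fc layer
    model = torchvision.models.resnet101(pretrained = True)
    for param in model.parameters():
        param.requires_grad = False
    model.avgpool = nn.AdaptiveAvgPool2d(1)
    model.fc = nn.Linear(2048, config.num_classes)  # new layer, trainable by default
    return model

The optimizer should then only receive the trainable parameters, e.g. optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=config.lr).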

<h5 id="5">4. Evaluation metrics: utils.py</h5>

pytorch does not wrap evaluation metrics the way keras does, where you just pass metrics=[acc] to fit and add any other metric the same way. You can use the torchnet module, but while using it I found some metrics I wanted were missing, so I defined my own; they end up much the same anyway.

class AverageMeter(object):
    """Computes and stores the average and current value"""

    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


def accuracy(y_pred, y_actual, topk=(1, )):
    """Computes the precision@k for the specified values of k"""
    maxk = max(topk)
    batch_size = y_actual.size(0)

    _, pred = y_pred.topk(maxk, 1, True, True)
    pred = pred.t()
    correct = pred.eq(y_actual.view(1, -1).expand_as(pred))

    res = []
    for k in topk:
        correct_k = correct[:k].view(-1).float().sum(0)
        res.append(correct_k.mul_(100.0 / batch_size))

    return res
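
main.py also imports a few helpers from utils.py that are not shown here: Logger, save_checkpoint, get_learning_rate, and time_to_str (see GitHub for the real code). To make main.py easier to follow, here is a rough reconstruction of two of them; the paths and details are my assumptions, not the original implementation:

import os
import shutil
import torch
from config import config

def get_learning_rate(optimizer):
    # current learning rate of the first parameter group
    return optimizer.param_groups[0]["lr"]

def save_checkpoint(state, is_best, fold):
    # always save the latest checkpoint; copy it to best_models/ when it beats the previous best
    filename = config.weights + config.model_name + os.sep + str(fold) + os.sep + "checkpoint.pth.tar"
    torch.save(state, filename)
    if is_best:
        shutil.copyfile(filename,
            config.best_models + config.model_name + os.sep + str(fold) + os.sep + "model_best.pth.tar")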

<h5 id="6">5. Main file: main.py</h5>

We define the training, validation, and test loops ourselves because pytorch does not provide them out of the box. The details are commented in the code; feel free to contact me if anything is unclear.

# -*- coding: utf-8 -*-
# @Time    : 2018/7/31 09:41
# @Author  : Spytensor
# @File    : main.py
# @Email   : zhuchaojie@buaa.edu.cn
#====================================================
#        Training / validation / prediction
#====================================================
import os 
import random 
import time
import json
import torch
import torchvision
import numpy as np 
import pandas as pd 
import warnings
from datetime import datetime
from torch import nn,optim
from config import config 
from collections import OrderedDict
from torch.autograd import Variable 
from torch.utils.data import DataLoader
from dataset.dataloader import *
from sklearn.model_selection import train_test_split,StratifiedKFold
from timeit import default_timer as timer
from models.model import *
from utils import *

#1. set random.seed and cudnn performance
random.seed(config.seed)
np.random.seed(config.seed)
torch.manual_seed(config.seed)
torch.cuda.manual_seed_all(config.seed)
os.environ["CUDA_VISIBLE_DEVICES"] = config.gpus
torch.backends.cudnn.benchmark = True
warnings.filterwarnings('ignore')

#2. evaluate func
def evaluate(val_loader,model,criterion):
    #2.1 define meters
    losses = AverageMeter()
    top1 = AverageMeter()
    top2 = AverageMeter()
    #2.2 switch to evaluate mode and confirm model has been transfered to cuda
    model.cuda()
    model.eval()
    with torch.no_grad():
        for i,(input,target) in enumerate(val_loader):
            input = Variable(input).cuda()
            target = Variable(torch.from_numpy(np.array(target)).long()).cuda()

            #2.2.1 compute output
            output = model(input)
            loss = criterion(output,target)

            #2.2.2 measure accuracy and record loss
            precision1,precision2 = accuracy(output,target,topk=(1,2))
            losses.update(loss.item(),input.size(0))
            top1.update(precision1[0],input.size(0))
            top2.update(precision2[0],input.size(0))

    return [losses.avg,top1.avg,top2.avg]

#3. test model on public dataset and save the probability matrix
def test(test_loader,model,folds):
    #3.1 confirm the model converted to cuda
    csv_map = OrderedDict({"filename":[],"probability":[]})
    model.cuda()
    model.eval()
    for i,(input,filepath) in enumerate(tqdm(test_loader)):
        #3.2 change everything to cuda and get only basename
        filepath = [os.path.basename(x) for x in filepath]
        with torch.no_grad():
            image_var = Variable(input).cuda()
            #3.3.output
            #print(filepath)
            #print(input,input.shape)
            y_pred = model(image_var)
            print(y_pred.shape)
            smax = nn.Softmax(1)
            smax_out = smax(y_pred)
        #3.4 save probability to csv files
        csv_map["filename"].extend(filepath)
        for output in smax_out:
            prob = ";".join([str(i) for i in output.data.tolist()])
            csv_map["probability"].append(prob)
    result = pd.DataFrame(csv_map)
    result["probability"] = result["probability"].map(lambda x : [float(i) for i in x.split(";")])
    result.to_csv("./submit/{}_submission.csv" .format(config.model_name + "_" + str(folds)),index=False,header = None)

#4. more details to build main function    
def main():
    fold = 0
    #4.1 mkdirs
    if not os.path.exists(config.submit):
        os.mkdir(config.submit)
    if not os.path.exists(config.weights):
        os.mkdir(config.weights)
    if not os.path.exists(config.best_models):
        os.mkdir(config.best_models)
    if not os.path.exists(config.logs):
        os.mkdir(config.logs)
    if not os.path.exists(config.weights + config.model_name + os.sep +str(fold) + os.sep):
        os.makedirs(config.weights + config.model_name + os.sep +str(fold) + os.sep)
    if not os.path.exists(config.best_models + config.model_name + os.sep +str(fold) + os.sep):
        os.makedirs(config.best_models + config.model_name + os.sep +str(fold) + os.sep)       
    #4.2 get model and optimizer
    model = get_net()
    model = torch.nn.DataParallel(model)
    model.cuda()
    optimizer = optim.SGD(model.parameters(),lr = config.lr,momentum=0.9,weight_decay=config.weight_decay)
    #optimizer = optim.Adam(model.parameters(),lr = config.lr,amsgrad=True,weight_decay=config.weight_decay)
    criterion = nn.CrossEntropyLoss().cuda()
    log = Logger()
    log.open(config.logs + "log_train.txt",mode="a")
    log.write("\n------------------------------------ [START %s] %s\n\n" % (datetime.now().strftime('%Y-%m-%d %H:%M:%S'), '-' * 40))
    #4.3 some parameters for  K-fold and restart model
    start_epoch = 0
    best_precision1 = 0
    resume = False
    
    #4.4 restart the training process
    if resume:
        checkpoint = torch.load(config.best_models + str(fold) + "/model_best.pth.tar")
        start_epoch = checkpoint["epoch"]
        fold = checkpoint["fold"]
        best_precision1 = checkpoint["best_precision1"]
        model.load_state_dict(checkpoint["state_dict"])
        optimizer.load_state_dict(checkpoint["optimizer"])

    #4.5 get files and split for K-fold dataset
    #4.5.1 read files
    train_data_list = get_files(config.train_data,"train")
    val_data_list = get_files(config.val_data,"val")
    #test_files = get_files(config.test_data,"test")

    """ 
    # If no validation set is provided, you can do the split here.
    #4.5.2 split
    split_fold = StratifiedKFold(n_splits=3)
    folds_indexes = split_fold.split(X=origin_files["filename"],y=origin_files["label"])
    folds_indexes = np.array(list(folds_indexes))
    fold_index = folds_indexes[fold]

    #4.5.3 using fold index to split for train data and val data
    train_data_list = pd.concat([origin_files["filename"][fold_index[0]],origin_files["label"][fold_index[0]]],axis=1)
    val_data_list = pd.concat([origin_files["filename"][fold_index[1]],origin_files["label"][fold_index[1]]],axis=1)
    """
    #train_data_list,val_data_list = train_test_split(origin_files,test_size = 0.1,stratify=origin_files["label"])
    #4.5.4 load dataset
    train_dataloader = DataLoader(ChaojieDataset(train_data_list),batch_size=config.batch_size,shuffle=True,collate_fn=collate_fn,pin_memory=True)
    val_dataloader = DataLoader(ChaojieDataset(val_data_list,train=False),batch_size=config.batch_size * 2,shuffle=True,collate_fn=collate_fn,pin_memory=False)
    #test_dataloader = DataLoader(ChaojieDataset(test_files,test=True),batch_size=1,shuffle=False,pin_memory=False)
    #scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer,"max",verbose=1,patience=3)
    scheduler =  optim.lr_scheduler.StepLR(optimizer,step_size = 5,gamma=0.1)
    #4.5.5.1 define metrics
    train_losses = AverageMeter()
    train_top1 = AverageMeter()
    train_top2 = AverageMeter()
    valid_loss = [np.inf,0,0]
    model.train()

    #logs
    log.write('** start training here! **\n')
    log.write('                               |------------ VALID -------------|----------- TRAIN -------------|         \n')
    log.write('lr           iter     epoch    | loss   top-1  top-2            | loss   top-1  top-2           |  time   \n')
    log.write('----------------------------------------------------------------------------------------------------\n')
    #4.5.5 train
    start = timer()
    for epoch in range(start_epoch,config.epochs):
        scheduler.step(epoch)
        #4.5.5.2 train
        for iter,(input,target) in enumerate(train_dataloader):

            lr = get_learning_rate(optimizer)
            #evaluate every half epoch
            if iter == len(train_dataloader) // 2:
                valid_loss = evaluate(val_dataloader,model,criterion)
                is_best = valid_loss[1] > best_precision1
                best_precision1 = max(valid_loss[1],best_precision1)
                save_checkpoint({
                    "epoch":epoch + 1,
                    "model_name":config.model_name,
                    "state_dict":model.state_dict(),
                    "best_precision1":best_precision1,
                    "optimizer":optimizer.state_dict(),
                    "fold":fold,
                    "valid_loss":valid_loss,
                },is_best,fold)
                #adjust learning rate
                #scheduler.step(valid_loss[1])
                print("\r",end="",flush=True)
                log.write('%0.8f %5.1f   %6.1f      | %0.3f  %0.3f  %0.3f        | %0.3f  %0.3f  %0.3f        | %s' % (\
                        lr, iter/len(train_dataloader) + epoch, epoch,
                        valid_loss[0], valid_loss[1], valid_loss[2],
                        train_losses.avg,    train_top1.avg,    train_top2.avg, 
                        time_to_str((timer() - start),'min'))
                )
                log.write('\n')
                time.sleep(0.01)

            #4.5.5 switch to continue train process
            #scheduler.step(epoch)
            model.train()
            input = Variable(input).cuda()
            target = Variable(torch.from_numpy(np.array(target)).long()).cuda()
            output = model(input)
            loss = criterion(output,target)

            precision1_train,precision2_train = accuracy(output,target,topk=(1,2))
            train_losses.update(loss.item(),input.size(0))
            train_top1.update(precision1_train[0],input.size(0))
            train_top2.update(precision2_train[0],input.size(0))
            #backward
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            lr = get_learning_rate(optimizer)
            print('\r',end='',flush=True)
            print('%0.8f %5.1f   %6.1f      | %0.3f  %0.3f  %0.3f       | %0.3f  %0.3f  %0.3f        | %s' % (\
                         lr, iter/len(train_dataloader) + epoch, epoch,
                         valid_loss[0], valid_loss[1], valid_loss[2],
                         train_losses.avg, train_top1.avg, train_top2.avg,
                         time_to_str((timer() - start),'min'))
            , end='',flush=True)
    # best_model = torch.load(config.best_models + os.sep+ str(fold) + 'model_best.pth.tar')
    # model.load_state_dict(best_model["state_dict"])
    # test(test_dataloader,model,fold)

if __name__ =="__main__":
    main()

<h5 id="7">6. Training results</h5>

------------------------------------ [START 2018-10-22 19:47:48] ----------------------------------------

loading train dataset
100%|██████████| 4572/4572 [00:00<00:00, 589769.58it/s]
loading train dataset
100%|██████████| 2520/2520 [00:00<00:00, 603496.98it/s]
** start training here! **
                               |------------ VALID -------------|----------- TRAIN -------------|         
lr           iter     epoch    | loss   top-1  top-2            | loss   top-1  top-2           |  time   
----------------------------------------------------------------------------------------------------
0.00010000   0.5      0.0      | 0.578  82.063  91.706          | 1.661  63.354  72.242          |  0 hr 01 min
0.00010000   1.5      1.0      | 0.254  93.532  96.270          | 0.936  78.442  85.356          |  0 hr 04 min
0.00010000   2.5      2.0      | 0.226  94.563  97.619          | 0.691  83.567  89.771          |  0 hr 06 min
0.00010000   3.5      3.0      | 0.186  91.944  97.976          | 0.551  86.738  92.206          |  0 hr 09 min
0.00010000   4.5      4.0      | 0.214  95.357  99.087          | 0.461  88.771  93.700          |  0 hr 11 min
0.00010000   5.5      5.0      | 0.111  97.222  99.246          | 0.399  90.161  94.699          |  0 hr 14 min

<h5 id="8">7. 总结</h5>

Whichever framework you choose, the best one is the one you are comfortable with. Since `pytorch` is still less mature than `keras` and `tensorflow`, it has some hard-to-understand `bug`s, so it is best to stick to a matching version; this project uses `pytorch 0.4.0` throughout. Finally, a disclaimer: this article is the result of my consolidating several codebases while working on image classification problems, adding the modules I needed; the code I drew on is listed in the references.
The full code is available here: [pytorch-image-classification](https://github.com/spytensor/pytorch-image-classification)!

<h5 id="9">8. 参考文献</h5>

- [pytorch-classification](https://github.com/bearpaw/pytorch-classification)
- [pytorch-best-practice](https://github.com/chenyuntc/pytorch-best-practice)

    Original author: Spytensor
    Original article: https://www.jianshu.com/p/84297d9e4882