python - How do I implement a deconvolution layer for a CNN in numpy?

March 21, 2023 · 336 views

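I am trying to implement a deconvolution layer for a convolutional network. By deconvolution I mean the following: suppose I have a 3x227x227 input image fed to a layer with 3x11x11 filters and stride 4, so each resulting feature map is 55x55. What I am trying to do is apply the reverse operation, projecting a 55x55 feature map back onto a 3x227x227 image. Basically, each value in the 55x55 feature map is weighted by the 3x11x11 filter and projected into image space, and the regions that overlap because of the stride are averaged.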
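The sizes quoted above follow from the usual output-size arithmetic. As a quick check, here is a minimal sketch; the helper names `conv_out_size` and `deconv_out_size` are made up for illustration and are not part of numpy:

```python
def conv_out_size(in_size, filt, stride, pad=0):
    """Spatial size produced by a strided convolution."""
    span = in_size + 2 * pad - filt
    if span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


def deconv_out_size(in_size, filt, stride, pad=0):
    """Spatial size recovered by the reverse projection."""
    return (in_size - 1) * stride + filt - 2 * pad


feat = conv_out_size(227, filt=11, stride=4)       # 55
image = deconv_out_size(feat, filt=11, stride=4)   # 227
print(feat, image)
```

With no padding, (227 - 11) / 4 + 1 = 55 going forward and (55 - 1) * 4 + 11 = 227 going back, so the two sizes in the question are consistent with each other.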

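I tried to implement this in numpy without any success. I found a brute-force solution with nested for loops, but it is very slow. How can I implement it efficiently in numpy? Any help is welcome.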

Best answer

As discussed in this question, a deconvolution is just a convolutional layer, but with a particular choice of padding, stride, and filter size.

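For example, if the current image size is 55x55, you can apply a convolution with padding = 20, stride = 1 and filter = [21x21] to get a 75x75 image, then 95x95, and so on. (I am not saying that this choice of numbers gives the desired quality of the output image, just the size. In fact, I think downsampling from 227x227 to 55x55 and then upsampling back to 227x227 is too aggressive, but you are free to try any architecture.)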
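To make the size progression concrete, here is a rough single-channel sketch. The function name `grow_by_conv` and the random kernel are placeholders, and it assumes numpy 1.20 or later for `sliding_window_view`; it only demonstrates the shapes, not a trained layer:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def grow_by_conv(img, kernel, pad):
    """Stride-1 convolution of a 2-D image with zero padding on every side."""
    padded = np.pad(img, pad, mode="constant")
    # windows has shape (out_h, out_w, kh, kw) and is a view, not a copy
    windows = sliding_window_view(padded, kernel.shape)
    # Cross-correlation (no kernel flip), as is usual for CNN layers
    return np.einsum("ijkl,kl->ij", windows, kernel)


rng = np.random.default_rng(0)
kernel = rng.standard_normal((21, 21))   # placeholder weights
x55 = rng.standard_normal((55, 55))
x75 = grow_by_conv(x55, kernel, pad=20)
x95 = grow_by_conv(x75, kernel, pad=20)
print(x55.shape, x75.shape, x95.shape)
```

Each pass adds 2 * 20 - 21 + 1 = 20 pixels per side length, which is where 55, 75, 95 comes from.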

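Here is an implementation of the forward pass for any stride and padding. It is essentially the im2col transformation, but done with stride_tricks from numpy. It is not as optimized as modern GPU implementations, but it is definitely faster than 4 inner loops: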

import numpy as np

def conv_forward(x, w, b, stride, pad):
  N, C, H, W = x.shape
  F, _, HH, WW = w.shape

  # Check dimensions
  assert (W + 2 * pad - WW) % stride == 0, 'width does not work'
  assert (H + 2 * pad - HH) % stride == 0, 'height does not work'

  # Pad the input
  p = pad
  x_padded = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)), mode='constant')

  # Figure out output dimensions
  H += 2 * pad
  W += 2 * pad
  # Integer division: these values are used as array dimensions below
  out_h = (H - HH) // stride + 1
  out_w = (W - WW) // stride + 1

  # Perform an im2col operation by picking clever strides
  shape = (C, HH, WW, N, out_h, out_w)
  strides = (H * W, W, 1, C * H * W, stride * W, stride)
  strides = x.itemsize * np.array(strides)
  x_stride = np.lib.stride_tricks.as_strided(x_padded,
                                             shape=shape, strides=strides)
  # Copy the strided view into a real array, then flatten to 2-D columns
  x_cols = np.ascontiguousarray(x_stride)
  x_cols = x_cols.reshape(C * HH * WW, N * out_h * out_w)

  # Now all our convolutions are a big matrix multiply
  res = w.reshape(F, -1).dot(x_cols) + b.reshape(-1, 1)

  # Reshape the output to (N, F, out_h, out_w)
  res = res.reshape(F, N, out_h, out_w)
  out = res.transpose(1, 0, 2, 3)
  out = np.ascontiguousarray(out)
  return out
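
For the reverse projection described in the question, one possible approach (not taken from the answer above) is to loop only over the filter taps and let strided slices handle all feature-map positions at once. The sketch below is a guess at what was intended: the name `deconv_project`, the `hits` counter, and the choice to average by dividing each pixel by the number of filter placements covering it are all assumptions. It handles a single feature map and a single 3-D filter.

```python
import numpy as np


def deconv_project(fmap, w, stride, average=True):
    """Project one feature map (out_h, out_w) back into image space.

    w has shape (C, HH, WW); the result has shape
    (C, (out_h - 1) * stride + HH, (out_w - 1) * stride + WW).
    """
    C, HH, WW = w.shape
    out_h, out_w = fmap.shape
    H = (out_h - 1) * stride + HH
    W = (out_w - 1) * stride + WW

    img = np.zeros((C, H, W), dtype=float)
    hits = np.zeros((H, W))  # number of filter placements covering each pixel

    # Only HH * WW iterations; each strided slice touches every
    # feature-map position for one filter tap.
    for i in range(HH):
        rows = slice(i, i + stride * out_h, stride)
        for j in range(WW):
            cols = slice(j, j + stride * out_w, stride)
            img[:, rows, cols] += w[:, i, j][:, None, None] * fmap
            hits[rows, cols] += 1

    if average:
        # Guard against uncovered pixels (possible when stride > filter size)
        img /= np.maximum(hits, 1)
    return img


rng = np.random.default_rng(0)
fmap = rng.standard_normal((55, 55))
w = rng.standard_normal((3, 11, 11))
img = deconv_project(fmap, w, stride=4)
print(img.shape)  # (3, 227, 227)
```

For the sizes in the question this runs 121 slice updates instead of iterating over every output pixel, which is usually fast enough in plain numpy. With `average=False` it accumulates overlapping contributions by summation instead.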