python – Keras BatchNormalization: what exactly is sample-wise normalization?

I am trying to figure out what exactly Keras's batch normalization does. Right now I have the following code.

for i in range(8):
    c = Convolution2D(128, 3, 3, border_mode = 'same', init = 'he_normal')(c)
    c = LeakyReLU()(c)
    c = Convolution2D(128, 3, 3, border_mode = 'same', init = 'he_normal')(c)
    c = LeakyReLU()(c)
    c = Convolution2D(128, 3, 3, border_mode = 'same', init = 'he_normal')(c)
    c = LeakyReLU()(c)
    c = merge([c, x], mode = 'sum')
    c = BatchNormalization(mode = 1)(c)
    x = c

I set the batch normalization mode to 1, which according to the Keras documentation is sample-wise normalization: "This mode assumes a 2D input."

I thought this was supposed to normalize each sample in the batch independently of every other sample. However, when I look at the source code of the call function, I see the following.

    elif self.mode == 1:
        # sample-wise normalization
        m = K.mean(x, axis=-1, keepdims=True)
        std = K.std(x, axis=-1, keepdims=True)
        x_normed = (x - m) / (std + self.epsilon)
        out = self.gamma * x_normed + self.beta

Here it just computes the mean over all of x, which in my case is (BATCH_SIZE, 128, 56, 56), I think. In mode 1, I thought it was supposed to normalize independently of the other samples in the batch, so shouldn't it be axis=1? Also, what does "assumes a 2D input" in the documentation mean?

Best answer

In this it is just computing the mean over all of x which in my case is (BATCH_SIZE, 128, 56, 56) I think.

By doing that, you have broken the layer's contract: this is a four-dimensional input, not a two-dimensional one.

I thought it was supposed to normalize independent of the other samples in the batch when in mode 1

It does. K.mean(..., axis=-1) reduces over axis -1, which is synonymous with the last axis of the input. So, assuming an input of shape (batchsz, features), axis -1 is the feature axis.

Since K.mean behaves very much like numpy.mean, you can test this yourself:

>>> import numpy as np
>>> x = np.array([[1,2,3],[4,5,6]])
>>> x
array([[1, 2, 3],
       [4, 5, 6]])
>>> np.mean(x, axis=-1)
array([ 2.,  5.])

You can see that the mean is taken over the features of each sample in the batch independently.
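For the four-dimensional input in the question, axis -1 is no longer the per-sample feature axis, so mode 1 does not do what was intended. Here is a small NumPy sketch, with a toy shape standing in for (BATCH_SIZE, 128, 56, 56), showing both what the mode-1 code actually computes on a 4D array and one way to get true per-sample normalization by flattening each sample to 2D first. The shapes and the flatten-then-reshape step are my own illustration, not the Keras source:

```python
import numpy as np

# Toy stand-in for a (BATCH_SIZE, 128, 56, 56) activation tensor.
x = np.random.randn(2, 3, 4, 4)

# What mode=1 actually does on a 4D input: reduce only the LAST axis,
# producing one mean per (sample, channel, row) position.
m = np.mean(x, axis=-1, keepdims=True)
print(m.shape)  # (2, 3, 4, 1) -- not one mean per sample

# True sample-wise normalization: flatten each sample to 2D first,
# so that axis -1 really means "all features of one sample".
flat = x.reshape(x.shape[0], -1)            # (2, 48)
mean = flat.mean(axis=-1, keepdims=True)    # (2, 1): one mean per sample
std = flat.std(axis=-1, keepdims=True)      # (2, 1): one std per sample
normed = ((flat - mean) / (std + 1e-6)).reshape(x.shape)

# Each sample now has (approximately) zero mean and unit variance.
print(normed.reshape(2, -1).mean(axis=-1))
```

This is also why the docs say mode 1 "assumes a 2D input": only for a (batch, features) array does reducing the last axis coincide with normalizing each sample over all of its features.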
