Does anyone know what computation takes place inside Caffe's softmax layer?
I am using a pre-trained network that ends with a softmax layer.
In the test phase, for a simple forward pass of an image, the output of the second-to-last layer ("InnerProduct") is the following:
-0.20095,0.39989,0.22510,-0.36796,-0.21991,0.43291,-0.22714,-0.22229,-0.08174,0.01931,-0.05791,0.21699,0.00437,-0.02350,0.02924,-0.28733,0.19157,-0.04191,-0.07360,0.30252
The output of the last layer ("Softmax") is the following values:
0.00000,0.44520,0.01115,0.00000,0.00000,0.89348,0.00000,0.00000,0.00002,0.00015,0.00003,0.00440,0.00011,0.00006,0.00018,0.00000,0.00550,0.00004,0.00002,0.05710
If I apply a softmax to the output of the inner-product layer (using an external tool such as MATLAB), I get the following values:
0.0398,0.0726,0.0610,0.0337,0.0391,0.0751,0.0388,0.0390,0.0449,0.0496,0.0460,0.0605,0.0489,0.0476,0.0501,0.0365,0.0590,0.0467,0.0452,0.0659
The latter makes sense to me, because the probabilities add up to 1.0 (note that the values from Caffe's Softmax layer sum to more than 1.0).
Apparently, the softmax layer in Caffe is not a straightforward softmax operation.
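For reference, the plain softmax above can be reproduced with a few lines of NumPy; this is a minimal sketch equivalent to the MATLAB computation, applied to the InnerProduct output quoted above:

import numpy as np

# Output of the "InnerProduct" layer (fc8_flickr) from the forward pass above.
z = np.array([-0.20095, 0.39989, 0.22510, -0.36796, -0.21991,
               0.43291, -0.22714, -0.22229, -0.08174,  0.01931,
              -0.05791, 0.21699,  0.00437, -0.02350,  0.02924,
              -0.28733, 0.19157, -0.04191, -0.07360,  0.30252])

# Plain softmax: exponentiate, then normalize so the outputs sum to 1.
p = np.exp(z) / np.exp(z).sum()

print(np.round(p, 4))  # matches the MATLAB values: 0.0398, 0.0726, ...
print(p.sum())         # 1.0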
(I don't think it makes any difference, but I'll mention that I'm using the pre-trained flickr-style network; see the description here.)
EDIT:
Here is the definition of the last two layers in the prototxt. Note that the type of the last layer is "Softmax".
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  param {
    lr_mult: 10
    decay_mult: 1
  }
  param {
    lr_mult: 20
    decay_mult: 0
  }
  inner_product_param {
    num_output: 20
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8_flickr"
  top: "prob"
}
Best answer: The results you are getting are strange.
The forward method of the "Softmax" layer performs the following operations:
1. Compute the maximal value of the input vector.
2. Subtract that maximal value from all elements of the vector.
3. Exponentiate all values.
4. Sum the exponentiated values.
5. Divide (scale) each exponentiated value by the sum.
(Note that the first two steps are performed to prevent overflow in the computation.)
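A minimal NumPy sketch of those five steps; the max-subtraction cancels out in the normalization, so it leaves the result unchanged while preventing overflow for large inputs:

import numpy as np

def stable_softmax(z):
    """Numerically stable softmax, following the five steps above."""
    m = z.max()          # 1. compute the maximal value of the input vector
    shifted = z - m      # 2. subtract the max from all elements
    e = np.exp(shifted)  # 3. exponentiate all values
    s = e.sum()          # 4. sum the exponentiated values
    return e / s         # 5. divide each exponentiated value by the sum

z = np.array([-0.20095, 0.39989, 0.22510, -0.36796, -0.21991,
               0.43291, -0.22714, -0.22229, -0.08174,  0.01931,
              -0.05791, 0.21699,  0.00437, -0.02350,  0.02924,
              -0.28733, 0.19157, -0.04191, -0.07360,  0.30252])

p = stable_softmax(z)
print(np.round(p, 4))  # same result as the plain softmax: 0.0398, 0.0726, ...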