Memory allocation in a fixed-point algorithm

August 6, 2019

I need to find a fixed point of a function f. The algorithm is simple:

> 1. Given X, compute f(X).
> 2. If ||X - f(X)|| is below a given tolerance, exit and return X;
> otherwise set X equal to f(X) and go back to step 1.

I want to make sure that I am not allocating memory for a new object on every iteration.

Currently, the algorithm looks like this:

iter1 = function(x::Vector{Float64})
    for iter in 1:max_it
        oldx = copy(x)
        g1(x)
        delta = vnormdiff(x, oldx, 2)
        if delta < tolerance
            break
        end
    end
end

Here g1(x) is a function that sets x to f(x) in place.

However, this loop seems to allocate a new vector on every iteration (see below).

Another way to write the algorithm is the following:

iter2 = function(x::Vector{Float64})
    oldx = similar(x)
    for iter in 1:max_it
        (oldx, x) = (x, oldx)
        g2(x, oldx)
        delta = vnormdiff(oldx, x, 2)
        if delta < tolerance
            break
        end
    end
end

where g2(x1, x2) is a function that sets x1 to f(x2).
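The buffer-swapping idea can be sketched as follows for a current Julia 1.x release. The names `fixed_point!` and `halve!` are hypothetical and not part of the original post, and the keyword defaults are only illustrative. Because the swap only rebinds local names, the final iterate can end up in either buffer, so this sketch returns it explicitly instead of relying on the caller's array.

```julia
# Sketch: fixed-point iteration with two preallocated buffers.
# `f!` is a hypothetical user function that writes f(src) into dst: f!(dst, src).
function fixed_point!(f!, x::Vector{Float64}; tol=1e-8, max_it=1000)
    cur = x
    nxt = similar(x)              # the only array allocated, once, before the loop
    for iter in 1:max_it
        f!(nxt, cur)              # nxt <- f(cur)
        # Euclidean norm of (nxt - cur), accumulated without a temporary array
        s = 0.0
        @inbounds for i in eachindex(cur, nxt)
            d = nxt[i] - cur[i]
            s += d * d
        end
        cur, nxt = nxt, cur       # swap the bindings; no data is copied
        if sqrt(s) < tol
            break
        end
    end
    return cur                    # the latest iterate, whichever buffer holds it
end

# Toy map with fixed point 0, mirroring g2 from the question.
halve!(dst, src) = (dst .= src ./ 2; dst)

x = fill(1.0, 10)
result = fixed_point!(halve!, x)
```

Note that `x` itself may be overwritten with an intermediate iterate, which is why the function name carries a `!`.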

Is this the most efficient and natural way to write this kind of iterative problem?

Edit 1: Timings show that the second version is faster:

using NumericExtensions
max_it = 1000
tolerance = 1e-8
max_it = 100

g1 = function(x::Vector{Float64}) 
    for i in 1:length(x)
        x[i] = x[i]/2
    end
end

g2 = function(newx::Vector{Float64}, x::Vector{Float64}) 
    for i in 1:length(x)
        newx[i] = x[i]/2
    end
end

x = fill(1e7, int(1e7))
@time iter1(x)
# elapsed time: 4.688103075 seconds (4960117840 bytes allocated, 29.72% gc time)
x = fill(1e7, int(1e7))
@time iter2(x)
# elapsed time: 2.187916177 seconds (80199676 bytes allocated, 0.74% gc time)

Edit 2: Using copy!

iter3 = function(x::Vector{Float64})
    oldx = similar(x)
    for iter in 1:max_it
        copy!(oldx, x)
        g1(x)
        delta = vnormdiff(x, oldx, 2)
        if delta < tolerance
            break
        end
    end
end
x = fill(1e7, int(1e7))
@time iter3(x)
# elapsed time: 2.745350176 seconds (80008088 bytes allocated, 1.11% gc time)

Best answer

I think that replacing the following lines in the first version

for iter = 1:max_it
    oldx = copy( x )
    ...

with

oldx = zeros( N )
for iter = 1:max_it
    oldx[:] = x    # or copy!( oldx, x )
    ...

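will be more efficient, because no array is allocated inside the loop. In addition, writing the for loop explicitly can make the code more efficient. For example, this can be seen from the following comparison: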

function test()
    N = 1000000

    a = zeros( N )
    b = zeros( N )

    @time c = copy( a )

    @time b[:] = a

    @time copy!( b, a )

    @time \
    for i = 1:length(a)
        b[i] = a[i]
    end

    @time \
    for i in eachindex(a)
        b[i] = a[i]
    end
end

test()

The results obtained with Julia 0.4.0 on Linux (x86_64) are

elapsed time: 0.003955609 seconds (7 MB allocated)
elapsed time: 0.001279142 seconds (0 bytes allocated)
elapsed time: 0.000836167 seconds (0 bytes allocated)
elapsed time: 1.19e-7 seconds (0 bytes allocated)
elapsed time: 1.28e-7 seconds (0 bytes allocated)

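copy!() seems to be faster than using [:] on the left-hand side, although the difference becomes negligible in repeated calculations (there seems to be some overhead for the first [:] calculation). By the way, the last example using eachindex() is very convenient for looping over multidimensional arrays.

Note that the last two timings (about 1e-7 seconds) are far too small for copying 10^6 elements. Most likely the `@time \` line times only the evaluation of `\` and not the loop on the following lines, so these two figures should not be read as the cost of the explicit loops.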
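For readers on a current Julia 1.x release, the same comparison can be sketched with `copyto!` (the array form of the older `copy!(dst, src)`) and the broadcast assignment `b .= a`, using `@allocated` to report bytes instead of reading them off `@time`. The function name `allocated_by_copies` is hypothetical, and the exact byte counts depend on the Julia version, so none are quoted here.

```julia
# Sketch: compare the bytes allocated by three ways of copying a vector.
function allocated_by_copies(n)
    a = rand(n)
    b = zeros(n)

    # Warm-up calls so that compilation is not counted in the measurements.
    copy(a); copyto!(b, a); b .= a

    fresh   = @allocated copy(a)         # allocates a new n-element array
    inplace = @allocated copyto!(b, a)   # writes into the existing buffer b
    bcast   = @allocated (b .= a)        # in-place broadcast assignment
    return fresh, inplace, bcast
end

fresh, inplace, bcast = allocated_by_copies(1_000_000)
println("copy: ", fresh, " bytes, copyto!: ", inplace, " bytes, .=: ", bcast, " bytes")
```

The in-place variants are expected to allocate far less than `copy`, which is the reason for hoisting the buffer out of the iteration loop.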

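A similar comparison can be made for vnormdiff(): using norm(x - oldx) and the like is slower than an explicit loop for the vector norm, because the former allocates a temporary array for x - oldx.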
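A minimal sketch of such an explicit loop is shown below. The helper name `normdiff2` is hypothetical: it stands in for `vnormdiff(x, oldx, 2)` and is not the NumericExtensions implementation. It accumulates the squared differences element by element, so no array for `x - oldx` is ever created.

```julia
using LinearAlgebra  # provides norm, used only for the comparison below

# Hypothetical helper: Euclidean norm of (x - y) without a temporary array.
function normdiff2(x::Vector{Float64}, y::Vector{Float64})
    s = 0.0
    # eachindex(x, y) throws a DimensionMismatch if the lengths differ,
    # which makes the @inbounds accesses below safe.
    @inbounds for i in eachindex(x, y)
        d = x[i] - y[i]
        s += d * d
    end
    return sqrt(s)
end

x = collect(1.0:5.0)
oldx = zeros(5)

# Both expressions compute the same value; only the second builds x - oldx first.
println(normdiff2(x, oldx))
println(norm(x - oldx))
```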