算法导论

2019年4月17日 376次阅读来源: 舰长同学

分析

loop-invariant

• A technique to prove that an algorithm is correct. Identify a loop-invariant.

• Initialization: It is true prior to the first iteration of a loop.
• 初始化：循环的第一次迭代之前，为真
• Maintenance: If it is true before an iteration of the loop, it remains true before the next iteration.
• 保持：如果循坏的某次迭代之前 为真，那么下次迭代之前它仍为真
• Termination: when the loop terminates, the invariant givesus a useful property that shows that the algorithm is correct.
• 终止：在循环终止时，不变式为我们提供一个有用的性质，该性质有助于证明算法的正确性。

RAM

We shall assume a generic one-processor, random-access machine (RAM) model of computation.
我们将假设一个通用的单处理器随机存取机（RAM）计算模型

-- Instructions are executed one after another, with no concurrent operations.
-- Each time, an instruction of a program is executed as an atom operation. An instruction includes arithmetic operations, logical operations, data movement and control operations.
-- Each such instruction takes a constant amount of time. 
-- RAM capacity is large enough.

- 指令一个接一个地执行，没有并行操作。
- 每次，程序指令都作为原子操作执行。 一条指令包括算术运算，逻辑运算，数据移动和控制操作。
- 每个这样的指令都需要一定的时间。 
- RAM容量足够大。

Under RAM model: count fundamental operations
在RAM模式下：统计基本操作
RAM 计算T(n)

T(n) 时间复杂度计算(练题)

顺序 — 串行

选择 — if 、switch-case

循环 — 插入排序

递归 — 归并排序

递归树 Recursion Tree

此为一般方法，了解画法

《算法导论》 The Construction of a Recursion Tree0.png

《算法导论》 The Construction of a Recursion Tree.png

Mathematical Induction

主方法 Master method (递归树的推广)

• It provides a “cookbook” method for solving recurrences of the form:
T(n) = a T(n/b) + f(n)
Where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function.

《算法导论》 Three common cases.png

CASE 1: The weight increases geometrically from the root to the leaves. The leaves hold a constant fraction of the total weight.从根部到叶片的重量几何增加。 叶子的总重量不变
CASE 2: (k = 0) The weight is approximately the same on each of the logbn levels.
        （k = 0）每个logbn级别的权重大致相同
CASE 3: The weight decreases geometrically from the root to the leaves. The root holds a constant fraction of the total weight.重量从根部到叶片的几何尺寸减小。 根具有总重量的不变部分。

《算法导论》 Master theorem – examples.png

Asymptotic Growth 渐近增长

O-notation

• O(g(n)) = {f(n): There exist positive constants c and n0 suchthat 0≤f(n) ≤cg(n) for all n≥n0}

–O(.) is used to asymptotically upper bound a function. 用于渐近地定义函数的上界
–O(.) is used to bound worst-case running time. 用于限制最坏情况下的运行时间

《算法导论》 O-notation.png

• Note:
--When we say “the running time is O(n2)” we mean that the worst-case running time is O(n2) – the best case might be better.
--Use of O-notation often makes it much easier to analyze algorithms; we can easily prove the O(n2) insertion-sort time bound.
--We often abuse the notation a little:
>> We often write f(n) = O(g(n)) instead of f(n) ∈O(g(n)) 
>> We often use O(n) in equations: e.g. 2n2 + 3n + 1 = 2n2 + O(n) (meaning that 2n2 + 3n + 1=2n2 + f(n) where f(n) is some function in O(n))
>>We use O(1) to denote constant time.

Ω-notation

• Ω(g(n)) = {f(n): There exist positive constants c and n0 suchthat 0 ≤ cg(n) ≤f(n) for all n≥n0}

–We use Ω-notation to give a lower bound on a function. 我们使用Ω-符号来给函数下限

《算法导论》 Ω-notation.png

• Note:
--when we say “the running time is Ω(n2)” we mean that the best-case running time is Ω(n2) – the worst case might be worse.

--Insertion-sort:
>> Best case: Ω(n) – when the input array is already sorted. 
>> Worst case: O(n2)– when the input array is reverse sorted.
>> We can also say that the worst case running time is Ω(n2)

Θ-notation

• Θ(g(n)) = {f(n): There exist positive constants c1, c2 and n0 such that 0 ≤ c1g(n) ≤f(n) ≤c2g(n) for all n≥n0} 相当于 “扔掉低阶项并忽略最高阶项前的系数”

–We use Θ-notation to give a tight bound on a function. 紧致界
— f(n) =Θ(g(n)) if and only if f(n) = O(g(n)) and f(n) = Ω(g(n))

《算法导论》 Θ-notation.png

O/Θ/Ω notations Note

--We often think of f(n) = O(g(n)) as corresponding to f(n) ≤ g(n). 上界
--Similarly, f(n) = Θ(g(n)) corresponds to f(n) = g(n)  紧致界
--Similarly, f(n) = Ω(g(n)) corresponds to f(n) ≥ g(n)  下界

Question: 证明 O(n^2)

设计

基于比较的排序算法

基于比较的排序算法这类排序包括插入，归并，堆，快速的等排序，共同点是都需要对数据进行比较，因此，时间复杂度不能突破O(NlogN)

假设决策树高度为h，叶子节点个数最多为2^h,要求2^h>=N!,所以h>=log(N!),根据Stirling公式，log(N!)~O(Nlog(N))，所以，h>=O(Nlog(N))

各种排序算法动画演示过程

Recap and overview

快排有最坏情况；插排、冒泡有最好情况；堆排、归并时间复杂度各情况一致

稳定性：插排、冒泡、归并、基数不稳定：快排、堆排

《算法导论》排序算法比较.png

-- Insertion sort running time of Θ(n2); sorts in place "原址排序"
-- Merge sort running time of Θ(n lg n); needs auxiliary storage Θ(n).
-- Heapsort running time of Θ(n lg n); sort in place. 
-- Quicksort running time of Θ(n lg n) on average; most practical (and hence widely-used) sorting algorithm. 
-- Sorting in linear time.

插入排序 Insertion sort

《算法导论》 Insertion Sort.png

归并排序 Merge Sort

《算法导论》 Action of Merge Sort.png
Merge-Sort (A, p, r).png
Merge (A, p, q, r).png

堆排序 Heap Sort

* 堆 是一种完全二叉树，不一定是满二叉树

《算法导论》 Action of BUILD-MAX-HEAP.png
MAX-HEAPIFY.png
BUILD-MAX-HEAP.png

《算法导论》 An Example for HEAPSORT.png

《算法导论》 HEAPSORT.png

Priority Queues

INSERT(S,x): 把元素插入集合S中. 等价于S = S ∪ {x}
MAXIMUM(S): 返回S中具有最大键值的元素
EXTRACT-MAX(S): 去掉并返回S中的具有最大键值的元素
INCREASE-KEY(S,x,k): 将元素x的关键字值增加到k,假设k不小于x的原关键字值

快速排序 Quick Sort

归并排序 谨遵 “分” “治” “合并” 步骤
快速排序 在 “分” 的时候就完成了 “合并” 过程；“合并”时 无操作

《算法导论》 pivot x.png
Divide and Conquer.png
Partitioning subroutine.png
Example of Partitioning.png
最坏情况下的快速排序示例.png
最好情况下的快速排序示例.png

线性排序(适用场合、前提、算法复杂度)

非基于比较的排序对输入数据有额外限制之后，可以使用一些时间复杂度更小的算法，如计数排序，桶排序，基数排序。

计数排序(Counting Sort)

计数排序的理解与实现

《算法导论》计数排序.jpeg

桶排序(Bucket Sort)

桶排序

《算法导论》桶排序.png

基数排序(Radix sort)

基数排序

《算法导论》基数排序.png

策略

分治法

Divide and Conquer algorithms which are analyzable by recurrences.

归并排序、快速排序
分、治—递归、合并

动态规划 Dynamic Programming

Optimization Problems(含公式)

A design technique, like divide-and-conquer.
Works bottom-up rather than top-down.
Useful for optimization problems.

• Four-step method:
1. Characterize the structure of the optimal solution.  刻画一个最优解的结构特征
2. Recursively define the value of the optimal solution. 递归地定义最优解的值
3. Compute the value of the solution in a bottom-up fashion. 自底而上计算最优解的值
4. Construct the optimal solution using the computed information. 利用计算出的信息构造最优解

Assembly-Line Scheduling

《算法导论》 Assembly Lines.png
Assembly-Line Scheduling code.png

Matrix-chain multiplication

《算法导论》 Matrix-chain multiplication.png
Matrix-Chain Multiplication DP Algo.png
ExampleDP forCMM.png
Construct an Optimal Solution.png

Longest Common Subsequence

《算法导论》 LCS.png
Computing the length of an LCS.png
Constructing an LCS.png
LCS MaxSum.png

MaxSum

$$
b[j] = max(b[j-1]+a[j],a[j]), if 1≤j ≤ n
$$

int DPMaxSum(int n,int a,int &besti,int &bestj){        
    int sum = a[0],b[0] = a[0];
    for(int i = 1 ; i<= n; ++i){
        if(b[i-1] > 0 )
            b[i] = b[i-1] + a[i];
        else b[i] = a[i];
        if(b[i]>sum) sum = b[i];
    }
    return sum;
}

int DPMaxSum(int n,int a,int &besti,int &bestj){        
    int sum = 0,b = 0;
    for(int i = 1 ; i<= n; ++i){
        if(b > 0)
            b += a[i];
        else b = a[i];
        if(b>sum) sum = b;
    }
    return sum;
}

0-1 Knapsack problem

加或不加，比较判断
0-1 Knapsack problem.png
$$
\begin{eqnarray}f(x)=
\begin{cases}
0 &if i = 0 or w = 0\cr
c[i-1,w] &w[i]>w \cr
max(c[i-1,w-w[i]] + vi, c[i-1,j]) &i>0andw>=w[i]\end{cases}
\end{eqnarray}
$$

KNAPSACK-DP(v,w,n,W)
    for w ← 0 to W  // if i = 0 or w = 0
        do c[0,w] = 0
    for i ← 1 to n
        do c[i,0] = 0   
            for w ← 1 to W 
                do if w[i] ≤ w  // i>0 and w[i] ≤ w
                   then if v[i] + c[i-1,w-w[i]] > c[i-1,w]
                           then c[i-1,w] = v[i] + c[i-1,w-w[i]]
                                else c[i-1,w] = c[i-1,w]
                   else c[i-1,w] = c[i-1,w]  // w[i] > w

贪心算法

Activity Selection

《算法导论》 Activity Selection.png

Fractional Knapsack

贪心算法并非最优解（w[] 已被排序）
Greedy for Fractional Knapsack.png

Huffman Coding

《算法导论》 Example of Huffman codes.png
Algorithm of Huffman Coding.png

图论

单源

Dijkstra “贪心”

集“点” — 松弛 — 集“点” — 松弛 …（直至集完所有点）

• IDEA: Greedy
1. Maintain a set S of vertices whose shortest-path distances from s are known
2. At each step add to S the vertex v∈V-S whose distance estimate from s is minimal.
3. Update the distance estimates of vertices adjacent to v.

《算法导论》 Example of Dijkstra’s algorithm.png
Dijkstra’s algorithm.png

Bellman-Ford “可动态规划实现”

穷举式“松弛”

Bellman-Ford algorithm: Finds all shortest-path lengths from a source s∈V to v∈V or determines that negative- weight cycle exists.

《算法导论》 Bellman-Ford.png
Example of Bellman-Ford.png
Bellman-Ford algorithm.png

bool Bellman_Ford()  
{  
    for(int i=1;i<=nodenum;i++){  //初始化  
        dis[i]=(i==original? 0:MAX);  
    } 
    
    for(int i=1;i<=nodenum-1;i++)  { // |V|-1次松弛
        for(int j=1;j<=edgenum;j++)  { 
            if(dis[edge[j].v]>dis[edge[j].u]+edge[j].cost){   //松弛 
                dis[edge[j].v]=dis[edge[j].u]+edge[j].cost;  
                pre[edge[j].v]=edge[j].u;  
            }  
        }
    } 
    bool flag=1; //判断是否含有负权回路  
    for(int i=1;i<=edgenum;i++)  {
        if(dis[edge[i].v] > dis[edge[i].u]+edge[i].cost)  {  
            flag=0;  
            break;  
        }  
    }
       
    return flag;  
}

全源 All-pairs shortest paths

Floyd-Warshall “DP 动态规划”

经过节点k的路径长度与直接到达的路径长度相比较

Floyd-Warshall算法(Floyd-Warshall algorithm)是解决 任意两点间的最短路径的一种算法，可以正确处理有向图或负权的最短路径问题，同时也被用于计算有向图的传递闭包。
• Floyd-Warshall算法的时间复杂度为O(N3)，空间复杂度 为O(N2)。
• 同样是动态规划算法，但效率更高!

《算法导论》 Floyd-Warshall recurrence.png
Floyd-Warshall.png
Pseudocode for Floyd-Warshall.png

Johnson’s “权值重定向+ dijkstra”

• Johson算法是目前最高效的在无负环可带负权重的网络中 求所有点对最短路径的算法.
• Johson算法是Bellman-Ford算法, Reweighting(重赋权重) 和Dijkstra算法的大综合.

Johnson算法思路
• Johnson算法适用于求All Pairs Shortest Path. Johnson算法应用了重标号技术。
1. 先进行一次Bellman-Ford算法
    - 给定图 G = (V, E)，增加一个新的顶点 s，使s指向图 G 中的所有顶点都建立连接，设新的图为 G’；
    - 对图 G’ 中顶点s使用 Bellman-Ford算法计算单源最短路径，得到结果 h[] = {h[0], h[1], .. h[V-1]}；

2. 对原图进行重赋权，w'(i,j)=h[i]-h[j]+w(i,j)
3. 移除新增的顶点 s，然后对每个点进行一次Dijkstra，
每次Dijkstra的复杂度为O(nlogn+m)，于是算法复杂度O(n^2logn+m)

- Given a function h: VR, reweight each edge (u,v) ∈E by wh(u,v)=w(u,v)+h(u)-h(v).
- 经过重赋权后，结点u,v 之间的 每条路径均相对 原图中路径长度变化了同样的值

《算法导论》 Johnson.png
Johnson’ algorithm.png

Shortest paths (Summary)

 • 单源点最短路径
-- 无负边(非负权值)
    >> Dijkstra 算法: O(E+VlgV)，贪心法(这个问题可以找 到最优解)，可以用不同的数据结构实现，复杂度不一样
-- 一般情况(可以为带有负权值图)
    >> Bellman-Ford: O(VE)，未采用贪心法，还可以确定有向图中是否含有负权值回路。
-- 有向加权无环图(DAG) 
    >> 有向无环图的最短路径算法
• 每对结点间最短路径 
-- 无负边
>> 执行 |V| 次 Dijkstra 算法: O(VE+V2lgV) 
-- 一般情况
√ Run Bellman-Ford once for each vertex
√ d(m)ij = mink{d(m-1)ik+akj } 
√ floyd-washall Algorithm 
√ Johnson Algorithm

    
• 四个算法的复杂度:
• Dijkstra算法直接实现时间复杂度是O(n2),空间复杂度是 O(n)(保存距离和路径)，二叉堆实现时间复杂度变成 O((V+E)logV)，Fibonacci Heap可以将复杂度降到 O(E+VlogV);
• Bellman-Ford算法时间复杂度是O(V*E)，空间复杂度是 时间复杂度是O(kE);
• Floyd-Warshall算法时间复杂度是O(n3)，空间复杂度是 O(n2);
• Johnson算法时间复杂度是O( V * E * lgd(V) )，比Floyd- Warshall算法效率高。

回溯法Back-Tracking Algorithms(深度优先)

• 回溯法又称试探法。回溯法的基本做法是"深度优先搜索"，是一种组织得井井有条的、能避免不必要重复搜索的穷举式搜索算法。即从一条 路往前走，能进则进，不能进则退回来，换一条路再试。
• 回溯算法使用剪枝(pruning)函数，剪去一些不可能到达最终状态(即答案状态)的节点，从而减少状态空间树节点的生成。搜索过程中用剪枝函数避免无效搜索。
    • 常用剪枝函数：
        • 用约束函数在扩展结点处剪去不满足约束的子树；
        • 用限界函数剪去得不到最优解的子树

☆State Space Tree
• Problem State
– Each node in the tree identifies a problem state when solving the problem. 
• State Space
– All the paths from the root to each node
• State Space Tree
– Theabovetreeofthesolutionspace
• Solution States
– Those problem states to which the path from the root identifies a vector in the solution space (Satisfying explicit constraints).
• Answer States
– Those solution states to which the path from the root identifies an answer (Satisfying implicit constraints).

利用回溯法解题的具体步骤: 
(1)描述解的形式，定义一个解空间，它包含问题的所有解。 
(2)构造状态空间树。
(3)构造约束函数(用于杀死节点)。
然后就要通过深度优先搜索思想完成回溯，完整过程如下:
(1)设置初始化的方案(给变量赋初值，读入已知数据等)。 
(2)变换方式去试探，若全部试完则转(7)。 
(3)判断此法是否成功(通过约束函数)，不成功则转(2)。 
(4)试探成功则前进一步再试探。 
(5)正确方案还未找到则转(2)。 
(6)已找到一种方案则记录并打印。 
(7)退回一步(回溯)，若未退到头则转(2)。 
(8)已退到头则结束或打印无解

(一)约束函数:约束函数是根据题意定出的。通过描述合法解的一般 特征用于去除不合法的解，从而避免继续搜索出这个不合法解的剩余 部分。因此，约束函数是对于任何状态空间树上的节点都有效、等价 的。
(二)状态空间树:状态空间树是一个对所有解的图形描述。树上的每个子节点的解都只有一个部分与父节点不同。
(三)扩展节点、活结点、死结点
• 扩展节点，就是当前正在求出它的子节点的节点，在深度优先搜索中，只允许有一个扩展节点。
• 活结点就是通过与约束函数的对照，节点本身和其父节点均满足约束 函数要求的节点;
• 死结点反之。由此很容易知道死结点是不必求出其子节点的(没有意义)。

《算法导论》非递归回溯框架.png
General Method of Back Tracking.png
递归回溯框架.png
Recursive Method of Back Tracking.png