Algorithm Problems: Properties of (i & -i) (LeetCode No. 315)

November 8, 2019

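Recently I worked on a LeetCode problem (No. 315). While reading one of the faster submissions, I noticed that its approach differs from the others: it does not use divide and conquer, yet it is just as efficient.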

Problem: 315. Count of Smaller Numbers After Self

You are given an integer array nums and you have to return a new counts array. The counts array has the property where counts[i] is the number of smaller elements to the right of nums[i].

Example:

Given nums = [5, 2, 6, 1]

To the right of 5 there are 2 smaller elements (2 and 1).

To the right of 2 there is only 1 smaller element (1).

To the right of 6 there is 1 smaller element (1).

To the right of 1 there is 0 smaller element.

Return the array [2, 1, 1, 0].
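
As a baseline for checking faster solutions, the problem statement can be implemented directly. This is only a minimal sketch; the function name countSmallerBruteForce is made up for illustration and is not part of the submission discussed below.

```cpp
#include <cstddef>
#include <vector>

// Brute-force reference: for each position, scan everything to its right.
// O(n^2) time, so it is only practical for small inputs.
std::vector<int> countSmallerBruteForce(const std::vector<int>& nums) {
    std::vector<int> counts(nums.size(), 0);
    for (std::size_t i = 0; i < nums.size(); ++i) {
        for (std::size_t j = i + 1; j < nums.size(); ++j) {
            if (nums[j] < nums[i]) ++counts[i];
        }
    }
    return counts;
}
```

For nums = [5, 2, 6, 1] this returns [2, 1, 1, 0], matching the example.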

For a problem like this, divide and conquer comes to mind readily, so I will not cover a divide-and-conquer implementation here.

An unusual solution:

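#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

class Solution {
public:
    vector<int> countSmaller(vector<int>& nums) {
        int n = nums.size();
        vector<int> ans(n);
        if (n == 0) return ans;

        // discretize: shift every value so the minimum becomes 1
        // (indices into c start at 1); note that this overwrites nums
        int mn = INT_MAX, mx = INT_MIN;
        for (auto a : nums) mn = min(mn, a);
        for (auto &a : nums) {
            a = a - mn + 1;
            mx = max(mx, a);
        }

        // c is a binary indexed (Fenwick) tree over the values 1..mx
        vector<int> c(mx + 1, 0);
        // scan right to left, so c only holds elements to the right of i
        for (int i = n - 1; i >= 0; --i) {
            ans[i] = getsum(c, nums[i] - 1);  // recorded values smaller than nums[i]
            modify(c, mx, nums[i]);           // record nums[i]
        }
        return ans;
    }

private:
    // add one occurrence of value i to every entry of c that covers it
    void modify(vector<int> &c, int n, int i) {
        while (i <= n) {
            ++c[i];
            i += i & -i;
        }
    }

    // count the recorded values in the range 1..i
    int getsum(vector<int> &c, int i) {
        int sum = 0;
        while (i > 0) {
            sum += c[i];
            i -= i & -i;
        }
        return sum;
    }
};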

Code analysis:

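At first I could not make sense of the code at all. But the expression i & -i appears in it and seemed likely to be the key, so I computed i & -i for every i below 100. The results: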

i	0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
i&-i	0 1 2 1 4 1 2 1 8 1 2 1 4 1 2 1 16 1 2 1
i	20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39
i&-i	4 1 2 1 8 1 2 1 4 1 2 1 32 1 2 1 4 1 2 1
i	40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59
i&-i	8 1 2 1 4 1 2 1 16 1 2 1 4 1 2 1 8 1 2 1
i	60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
i&-i	4 1 2 1 64 1 2 1 4 1 2 1 8 1 2 1 4 1 2 1
i	80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
i&-i	16 1 2 1 4 1 2 1 8 1 2 1 4 1 2 1 32 1 2 1

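The table shows some interesting patterns. They follow from how negative numbers are represented at the machine level (two's complement), which I will not go into here. A few patterns that stand out: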

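1. For any power of two, the result is the number itself;

2. For any odd number, the result is 1;

3. For every other number, the result is the smallest term when the number is written as a sum of distinct powers of two (for example, 80 = 64 + 16 gives 16; 76 = 64 + 8 + 4 gives 4; 94 = 64 + 16 + 8 + 4 + 2 gives 2).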

These patterns have nothing to do with the problem statement itself, but they are exactly why the code above is efficient.
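
All three patterns are the same fact: i & -i keeps only the lowest set bit of i. The sketch below checks this against a straightforward shift-based version. The names lowbit and lowbitByShifting are made up for illustration, and the code assumes non-negative int inputs.

```cpp
#include <cassert>

// Returns the lowest set bit of i (0 when i == 0).
// In two's complement, -i == ~i + 1, so i and -i share exactly one
// set bit: the lowest set bit of i.
int lowbit(int i) {
    return i & -i;
}

// Reference version: shift a single-bit mask left until it hits a set bit of i.
int lowbitByShifting(int i) {
    assert(i >= 0);
    if (i == 0) return 0;
    int bit = 1;
    while ((i & bit) == 0) bit <<= 1;
    return bit;
}
```

For instance, lowbit(80) is 16, lowbit(76) is 4, and lowbit(94) is 2, matching the table.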

The code relies on an array c. Following conventional thinking, I could come up with roughly these designs:

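A. c[N] stores how many of the elements read so far are equal to N; for each new element, sum the entries below it to count the smaller elements. Because the array holds a large number of zero entries that still have to be scanned, this actually lowers efficiency;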

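B. c[N] stores how many of the elements read so far are smaller than N; for each new element, the answer is a single lookup. To keep the array correct, however, every insertion has to adjust all entries after position N, which lowers efficiency just the same.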
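
To make the cost of design A concrete, here is a minimal sketch of it. The function name and the maxValue parameter are assumptions for illustration; the input is assumed to already lie in 1..maxValue (for example, after the shift done in the code above).

```cpp
#include <vector>

// Design A: freq[v] counts how often value v has been read so far.
// Assumes every value in nums lies in 1..maxValue.
std::vector<int> countSmallerWithFrequencyArray(const std::vector<int>& nums,
                                                int maxValue) {
    std::vector<int> freq(maxValue + 1, 0);
    std::vector<int> counts(nums.size(), 0);
    // Read from the right so freq only describes elements to the right of i.
    for (int i = static_cast<int>(nums.size()) - 1; i >= 0; --i) {
        // Query: scan every slot below nums[i], zero or not. O(maxValue).
        for (int v = 1; v < nums[i]; ++v) counts[i] += freq[v];
        // Update: a single increment. O(1).
        ++freq[nums[i]];
    }
    return counts;
}
```

Design B swaps the two costs: the query becomes one lookup, while the update has to touch every slot above the inserted value.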

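With either design, this hash-table-like counting array is actually worse than simply maintaining a sorted sequence of the elements read so far. The code above therefore exploits the properties of i & -i to maintain the array in a cleverer way.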

Suppose N can be written as A1 + A2 + ... + Ak, where A1, A2, ..., Ak are distinct powers of two and A1 > A2 > ... > Ak.

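Then c[N] stores how many of the elements read so far are greater than N - Ak and no greater than N. The upper bound includes N itself, because modify starts by incrementing c[nums[i]].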
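
The sketch below spells out which values each entry covers and which entries a query visits. The helper names coveredRange and queryPath are made up for illustration; Ak is computed as n & -n.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// The value range whose count c[n] holds, returned as {lo, hi}:
// values v with lo < v <= hi.
std::pair<int, int> coveredRange(int n) {
    assert(n > 0);
    int ak = n & -n;  // Ak, the smallest power of two in n
    return {n - ak, n};
}

// Indices visited by a query over the values 1..n, following i -= i & -i.
std::vector<int> queryPath(int n) {
    std::vector<int> path;
    for (int i = n; i > 0; i -= i & -i) path.push_back(i);
    return path;
}
```

For n = 13 the query visits c[13], c[12] and c[8], whose ranges (12, 13], (8, 12] and (0, 8] tile 1..13 without overlap. That is why summing a handful of entries gives the full count.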

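With this layout and the properties of i & -i, both updating the array and querying the result for each new element take O(log N) time, so the total time complexity is O(N log N), which is why the code above is efficient. Strictly speaking, the logarithm is of mx, the length of c after shifting, so the bound is O(N log mx), and c itself takes O(mx) space.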
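
The code above sizes c by the value range (mx + 1 slots), which can be large when the inputs are spread far apart. One possible variation, not taken from the original submission, replaces each value by its rank among the distinct values, so the array never needs more slots than there are elements. A sketch under that assumption, with a made-up function name:

```cpp
#include <algorithm>
#include <vector>

// Same i & -i scheme, but indexed by 1-based rank instead of shifted value.
std::vector<int> countSmallerCompressed(const std::vector<int>& nums) {
    // Sorted list of distinct values; the rank of a value is its position here.
    std::vector<int> sorted(nums);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

    const int m = static_cast<int>(sorted.size());
    std::vector<int> tree(m + 1, 0);
    std::vector<int> counts(nums.size(), 0);

    for (int i = static_cast<int>(nums.size()) - 1; i >= 0; --i) {
        int rank = static_cast<int>(
            std::lower_bound(sorted.begin(), sorted.end(), nums[i]) - sorted.begin()) + 1;
        // Query ranks 1..rank-1: elements to the right that are smaller.
        for (int j = rank - 1; j > 0; j -= j & -j) counts[i] += tree[j];
        // Record nums[i] in every entry whose range contains its rank.
        for (int j = rank; j <= m; j += j & -j) ++tree[j];
    }
    return counts;
}
```

This also leaves the input unmodified and avoids the subtraction a - mn + 1, at the price of an extra sort.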

Divide and conquer does no better: its time complexity is also O(N log N).