[Android Source Code Analysis] SparseArray, Android's Replacement for HashMap: Source Code Analysis

June 27, 2019. Source: 牧之丶

When reposting, please credit the source: http://blog.csdn.net/zhoubin1992/article/details/48269985

Preface

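When we declare

HashMap<Integer, E> hashMap = new HashMap<Integer, E>();

the IDE shows a warning that suggests using SparseArray<E> instead for better performance. But HashMap's get and put run in O(1) time on average, so why would SparseArray perform better? I spent a couple of days reading the source in detail and found that SparseArray's main advantage is that it saves memory, which makes it a very good fit for mobile devices.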

SparseArray is a utility class in the android.util package. Let's first look at its method outline:

[Figure: outline of SparseArray's methods]

There is not much to it: the constructors plus a handful of methods for adding, removing, updating, and looking up entries.

Source Code Analysis

The detailed analysis is in the comments:

package android.util;

import com.android.internal.util.ArrayUtils;
import com.android.internal.util.GrowingArrayUtils;

import libcore.util.EmptyArray;

/**
 * SparseArrays map integers to Objects.  Unlike a normal array of Objects,
 * there can be gaps in the indices.  It is intended to be more memory efficient
 * than using a HashMap to map Integers to Objects, both because it avoids
 * auto-boxing keys and its data structure doesn't rely on an extra entry object
 * for each mapping.
 *
 * <p>Note that this container keeps its mappings in an array data structure,
 * using a binary search to find keys.  The implementation is not intended to be
 * appropriate for data structures that may contain large numbers of items.  It is
 * generally slower than a traditional HashMap, since lookups require a binary
 * search and adds and removes require inserting and deleting entries in the array.
 * For containers holding up to hundreds of items, the performance difference is
 * not significant, less than 50%.</p>
 *
 * <p>To help with performance, the container includes an optimization when removing
 * keys: instead of compacting its array immediately, it leaves the removed entry marked
 * as deleted.  The entry can then be re-used for the same key, or compacted later in
 * a single garbage collection step of all removed entries.  This garbage collection will
 * need to be performed at any time the array needs to be grown or the map size or
 * entry values are retrieved.</p>
 *
 * (Annotator's note: in the code below, gc runs before the arrays grow, before the
 * size is read, and before entries are accessed by index; get(key) does not trigger it.
 * See the comments on each method for details.)
 *
 * <p>It is possible to iterate over the items in this container using
 * {@link #keyAt(int)} and {@link #valueAt(int)}. Iterating over the keys using
 * <code>keyAt(int)</code> with ascending values of the index will return the
 * keys in ascending order, or the values corresponding to the keys in ascending
 * order in the case of <code>valueAt(int)</code>.</p>
 */
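// E plays the role of HashMap's value type
public class SparseArray<E> implements Cloneable {
    // Sentinel that marks a removed entry, so removal can skip immediate compaction
    private static final Object DELETED = new Object();
    // Set when at least one entry has been marked DELETED and compaction (gc) is pending
    private boolean mGarbage = false;
    // Integer keys, kept sorted in ascending order
    private int[] mKeys;
    // Values; mValues[i] is the value mapped from mKeys[i]
    private Object[] mValues;
    // Number of slots in use (still counts DELETED slots until gc runs)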
    private int mSize;

    /**
     * Creates a new SparseArray containing no mappings.
     */
    public SparseArray() {
        // The default initial capacity is 10 elements
        this(10);
    }

    /**
     * Creates a new SparseArray containing no mappings that will not
     * require any additional memory allocation to store the specified
     * number of mappings.  If you supply an initial capacity of 0, the
     * sparse array will be initialized with a light-weight representation
     * not requiring any additional array allocations.
     */
    public SparseArray(int initialCapacity) {
        if (initialCapacity == 0) {
            // EmptyArray.INT and EmptyArray.OBJECT are shared zero-length arrays
            // (an int[0] and an Object[0]), so nothing is allocated here
            mKeys = EmptyArray.INT;
            mValues = EmptyArray.OBJECT;
        } else {
            // newUnpaddedObjectArray ends up in a native method of VMRuntime. It returns an array
            // of at least initialCapacity elements, possibly more: the extra length uses space the
            // allocator would otherwise leave as padding after the array. How much that is depends
            // on the component type and on the memory allocator's implementation.
            mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
            mKeys = new int[mValues.length];
        }
        mSize = 0;
    }

    @Override
    @SuppressWarnings("unchecked")
    public SparseArray<E> clone() {
        SparseArray<E> clone = null;
        try {
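            // super.clone() is a shallow copy, so the key and value arrays are cloned as well;
            // the value objects stored in them are still shared with the original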
            clone = (SparseArray<E>) super.clone();
            clone.mKeys = mKeys.clone();
            clone.mValues = mValues.clone();
        } catch (CloneNotSupportedException cnse) {
            /* ignore */
        }
        return clone;
    }

    /**
     * Gets the Object mapped from the specified key, or <code>null</code>
     * if no such mapping has been made.
     */
    public E get(int key) {
        return get(key, null);
    }

    /**
     * Gets the Object mapped from the specified key, or the specified Object
     * if no such mapping has been made.
     */
    @SuppressWarnings("unchecked")
    public E get(int key, E valueIfKeyNotFound) {
        // Binary search for the key
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
        // Not found, or found but marked DELETED: return the default value
        if (i < 0 || mValues[i] == DELETED) {
            return valueIfKeyNotFound;
        } else {
            // i >= 0 and the slot is not marked DELETED: return mValues[i]
            return (E) mValues[i];
        }
    }

    /**
     * Removes the mapping from the specified key, if there was any.
     */
    public void delete(int key) {
        // Binary search for the key
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
        // Key found
        if (i >= 0) {
            // If not already marked, mark the slot DELETED and flag that gc is pending
            if (mValues[i] != DELETED) {
                mValues[i] = DELETED;
                mGarbage = true;
            }
        }
    }

    /**
     * Alias for {@link #delete(int)}.
     */
    public void remove(int key) {
        delete(key);
    }

    /**
     * Removes the mapping at the specified index.
     */
    // Note: the argument is an index into mValues, not a key
    public void removeAt(int index) {
        // If not already marked, mark the slot DELETED and flag that gc is pending
        if (mValues[index] != DELETED) {
            mValues[index] = DELETED;
            mGarbage = true;
        }
    }

    /**
     * Remove a range of mappings as a batch.
     *
     * @param index Index to begin at
     * @param size Number of mappings to remove
     */
    public void removeAtRange(int index, int size) {
        // Clamp the end of the range to mSize
        final int end = Math.min(mSize, index + size);
        // Remove each entry from index up to (but not including) end
        for (int i = index; i < end; i++) {
            removeAt(i);
        }
    }
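    // Single purpose: compact the arrays by squeezing out the entries marked DELETED
    private void gc() {
        // Log.e("SparseArray", "gc start with " + mSize);
        int n = mSize;
        int o = 0;
        int[] keys = mKeys;
        Object[] values = mValues;
        // Walk every slot in use and drop those whose value is DELETED. The neat part is that this
        // happens in place: keys and values are local references to the same arrays as mKeys and
        // mValues, so writing through them changes the arrays held by the fields.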
        for (int i = 0; i < n; i++) {
            Object val = values[i];

            if (val != DELETED) {
                if (i != o) {
                    keys[o] = keys[i];
                    values[o] = val;
                    values[i] = null;
                }
                o++;
            }
        }
        mGarbage = false;
        mSize = o; // number of live entries after compaction

        // Log.e("SparseArray", "gc end with " + mSize);
    }

    /**
     * Adds a mapping from the specified key to the specified value,
     * replacing the previous mapping from the specified key if there
     * was one.
     */
    public void put(int key, E value) {
        // Binary search first: it finds the key or its insertion point, which keeps mKeys sorted
        int i = ContainerHelpers.binarySearch(mKeys, mSize, key);

        if (i >= 0) {
            // Key already present: replace the value in place
            mValues[i] = value;
        } else {
            // Not found: binarySearch returned ~insertPoint (that is, -insertPoint - 1),
            // so a bitwise NOT recovers the index where the key belongs
            i = ~i;
            // If that index is within mSize and its slot is marked DELETED, reuse the slot
            if (i < mSize && mValues[i] == DELETED) {
                mKeys[i] = key;
                mValues[i] = value;
                return;
            }
            // Otherwise i == mSize or the slot holds a live entry.
            // If garbage is pending and the arrays are full, compact before growing
            if (mGarbage && mSize >= mKeys.length) {
                // Squeeze out the DELETED entries so the live ones are contiguous
                gc();
                // Search again because indices may have changed.
                i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
            }
            // Insert at i. If the array is full, a larger one is allocated and the elements are
            // copied over with the new one in place; otherwise System.arraycopy shifts the
            // elements from i onward one slot to the right before the new one is stored
            mKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key);
            mValues = GrowingArrayUtils.insert(mValues, mSize, i, value);
            // One more slot in use
            mSize++;
        }
    }

    /**
     * Returns the number of key-value mappings that this SparseArray
     * currently stores.
     */
    // Runs gc first if needed, because mSize still counts DELETED slots until then
    public int size() {
        if (mGarbage) {
            gc();
        }

        return mSize;
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the key from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The keys corresponding to indices in ascending order are guaranteed to
     * be in ascending order, e.g., <code>keyAt(0)</code> will return the
     * smallest key and <code>keyAt(size()-1)</code> will return the largest
     * key.</p>
     */
    // Returns mKeys[index]; runs gc first so that index counts live entries only
    public int keyAt(int index) {
        if (mGarbage) {
            gc();
        }

        return mKeys[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, returns
     * the value from the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     *
     * <p>The values corresponding to indices in ascending order are guaranteed
     * to be associated with keys in ascending order, e.g.,
     * <code>valueAt(0)</code> will return the value associated with the
     * smallest key and <code>valueAt(size()-1)</code> will return the value
     * associated with the largest key.</p>
     */
    @SuppressWarnings("unchecked")
    // Returns mValues[index]; runs gc first so that index counts live entries only
    public E valueAt(int index) {
        if (mGarbage) {
            gc();
        }

        return (E) mValues[index];
    }

    /**
     * Given an index in the range <code>0...size()-1</code>, sets a new
     * value for the <code>index</code>th key-value mapping that this
     * SparseArray stores.
     */
    // Sets mValues[index] to value; runs gc first so that index counts live entries only
    public void setValueAt(int index, E value) {
        if (mGarbage) {
            gc();
        }

        mValues[index] = value;
    }

    /**
     * Returns the index for which {@link #keyAt} would return the
     * specified key, or a negative number if the specified
     * key is not mapped.
     */
    // Returns the position of key in mKeys; runs gc first so the index is valid for keyAt
    public int indexOfKey(int key) {
        if (mGarbage) {
            gc();
        }

        return ContainerHelpers.binarySearch(mKeys, mSize, key);
    }

    /**
     * Returns an index for which {@link #valueAt} would return the
     * specified value, or a negative number if no keys map to the
     * specified value.
     * <p>Beware that this is a linear search, unlike lookups by key,
     * and that multiple keys can map to the same value and this will
     * find only one of them.
     * <p>Note also that unlike most collections' {@code indexOf} methods,
     * this method compares values using {@code ==} rather than {@code equals}.
     */
    // Returns the position of value in mValues; runs gc first so the index is valid for valueAt
    public int indexOfValue(E value) {
        if (mGarbage) {
            gc();
        }

        for (int i = 0; i < mSize; i++)
            if (mValues[i] == value)
                return i;

        return -1;
    }

    /**
     * Removes all key-value mappings from this SparseArray.
     */
    public void clear() {
        int n = mSize;
        Object[] values = mValues;

        for (int i = 0; i < n; i++) {
            // Null out each slot so the stored objects can be garbage collected
            values[i] = null;
        }

        mSize = 0;
        mGarbage = false;
    }

    /**
     * Puts a key/value pair into the array, optimizing for the case where
     * the key is greater than all existing keys in the array.
     */
    public void append(int key, E value) {
        // If the key is not greater than the current largest key, fall back to put()
        if (mSize != 0 && key <= mKeys[mSize - 1]) {
            put(key, value);
            return;
        }

        if (mGarbage && mSize >= mKeys.length) {
            gc();
        }
        // The key is greater than every existing key, so skip put()'s binary search
        // and append at the end
        mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
        mValues = GrowingArrayUtils.append(mValues, mSize, value);
        mSize++;
    }

    /**
     * {@inheritDoc}
     *
     * <p>This implementation composes a string by iterating over its mappings. If
     * this map contains itself as a value, the string "(this Map)"
     * will appear in its place.
     */
    @Override
    public String toString() {
        if (size() <= 0) {
            return "{}";
        }

        StringBuilder buffer = new StringBuilder(mSize * 28);
        buffer.append('{');
        for (int i=0; i<mSize; i++) {
            if (i > 0) {
                buffer.append(", ");
            }
            int key = keyAt(i);
            buffer.append(key);
            buffer.append('=');
            Object value = valueAt(i);
            if (value != this) {
                buffer.append(value);
            } else {
                buffer.append("(this Map)");
            }
        }
        buffer.append('}');
        return buffer.toString();
    }
}
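
The listing above cannot be compiled outside the Android platform, because it depends on internal helpers (ContainerHelpers, GrowingArrayUtils, ArrayUtils, EmptyArray). To experiment with the same ideas on a plain JVM, a minimal sketch could look like the following. The class name MiniSparseArray, the initial capacity of 4, and the capacity-doubling growth policy are assumptions made for illustration, and java.util.Arrays.binarySearch stands in for ContainerHelpers.binarySearch. This is not the Android implementation.

```java
import java.util.Arrays;

/** Simplified illustration of the SparseArray design (hypothetical class, not Android's). */
final class MiniSparseArray<E> {
    // Sentinel marking a slot whose mapping was removed but not yet compacted.
    private static final Object DELETED = new Object();

    private int[] keys = new int[4];          // sorted ascending within [0, size)
    private Object[] values = new Object[4];  // values[i] belongs to keys[i]
    private int size;                         // slots in use, DELETED ones included
    private boolean garbage;                  // true while DELETED slots may exist

    @SuppressWarnings("unchecked")
    E get(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        return (i < 0 || values[i] == DELETED) ? null : (E) values[i];
    }

    void delete(int key) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0 && values[i] != DELETED) {
            values[i] = DELETED;  // lazy removal: nothing is shifted here
            garbage = true;
        }
    }

    void put(int key, E value) {
        int i = Arrays.binarySearch(keys, 0, size, key);
        if (i >= 0) {             // key present (live or DELETED): overwrite
            values[i] = value;
            return;
        }
        i = ~i;                   // a miss returns -(insertionPoint) - 1
        if (i < size && values[i] == DELETED) {  // reuse a dead slot
            keys[i] = key;
            values[i] = value;
            return;
        }
        if (garbage && size >= keys.length) {    // compact before growing
            gc();
            i = ~Arrays.binarySearch(keys, 0, size, key);
        }
        if (size == keys.length) {               // still full: double the capacity
            keys = Arrays.copyOf(keys, size * 2);
            values = Arrays.copyOf(values, size * 2);
        }
        // Shift the tail one slot to the right, then store the new entry.
        System.arraycopy(keys, i, keys, i + 1, size - i);
        System.arraycopy(values, i, values, i + 1, size - i);
        keys[i] = key;
        values[i] = value;
        size++;
    }

    int size() {
        if (garbage) {
            gc();                 // size still counts DELETED slots until compaction
        }
        return size;
    }

    // In-place compaction: copy each live entry down to the next free slot.
    private void gc() {
        int out = 0;
        for (int in = 0; in < size; in++) {
            Object v = values[in];
            if (v != DELETED) {
                if (in != out) {
                    keys[out] = keys[in];
                    values[out] = v;
                    values[in] = null;
                }
                out++;
            }
        }
        garbage = false;
        size = out;
    }
}
```

With this sketch, putting keys 30, 10, 20 in that order leaves the key array sorted as 10, 20, 30. Calling delete(20) only marks the slot, and the next size() call compacts the arrays and reports 2.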

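SparseArray's Performance Characteristics

Having read the code above, here is my summary of SparseArray's performance characteristics:
1. SparseArray maps integers to objects. Because the keys live in a plain int array, there is no autoboxing of keys and no extra entry object per key/value mapping, so it uses less memory than a HashMap.
2. get and put both use binary search, and adding or removing an element means shifting other elements of the array, so SparseArray is generally slower than HashMap. According to the class documentation, for containers of up to a few hundred items the difference is not significant (less than 50%). SparseArray is therefore a poor fit for large data sets.
The binary search function (binarySearch) is shown below. Note that this four-argument form differs from the three-argument ContainerHelpers.binarySearch(mKeys, mSize, key) that the listing above calls; it appears to come from a different version of the source, but its return values follow the same convention.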

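private static int binarySearch(int[] a, int start, int len, int key) {
    // Invariant: a[low] < key <= a[high], treating a[start - 1] as minus infinity
    // and a[start + len] as plus infinity
    int high = start + len, low = start - 1, guess;

    while (high - low > 1) {
        guess = (high + low) / 2;

        if (a[guess] < key)
            low = guess;
        else
            high = guess;
    }

    // high is now the first index whose element is >= key
    if (high == start + len)
        return ~(start + len);  // key is greater than every element
    else if (a[high] == key)
        return high;            // found
    else
        return ~high;           // not found: encode the insertion point
}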

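If the key is found, the return value is its index. If it is not found, the return value is i = -insertPoint - 1 < 0, and a bitwise NOT (i = ~i) gives back insertPoint exactly.
3. Because the key array must stay sorted, every put is relatively expensive: it needs a binary search plus an insertion into the array. To compensate, SparseArray optimizes removal. Removing an element does not compact the arrays immediately; the element is marked DELETED and the mGarbage flag is set. The marked slot is then either reused or squeezed out by a single gc pass after any number of removes.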
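
The return-value convention described in point 2 can be checked on a plain JVM. The sketch below is a hypothetical three-argument binary search (the class SearchContractDemo and its method are illustrative names, not Android's code) that returns ~insertionPoint on a miss. java.util.Arrays.binarySearch documents its miss value as -(insertion point) - 1, which is the same number, since ~x equals -x - 1 in two's complement.

```java
import java.util.Arrays;

final class SearchContractDemo {
    /** Binary search over the first {@code size} elements of a sorted array. */
    static int binarySearch(int[] array, int size, int key) {
        int lo = 0;
        int hi = size - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;  // unsigned shift avoids int overflow
            if (array[mid] < key) {
                lo = mid + 1;
            } else if (array[mid] > key) {
                hi = mid - 1;
            } else {
                return mid;             // found: a non-negative index
            }
        }
        return ~lo;                     // not found: ~insertionPoint, always negative
    }

    public static void main(String[] args) {
        int[] keys = {10, 20, 40, 0, 0};  // only the first 3 slots are in use
        int found = binarySearch(keys, 3, 20);
        int missing = binarySearch(keys, 3, 30);
        System.out.println(found);      // prints 1
        System.out.println(missing);    // prints -3
        System.out.println(~missing);   // prints 2, the index where 30 belongs
        // The JDK's ranged binary search uses the same encoding for a miss.
        System.out.println(Arrays.binarySearch(keys, 0, 3, 30) == missing);  // prints true
    }
}
```

Because a miss is always negative and a hit is always non-negative, a single int carries both the "found or not" answer and the position, which is why put can go straight from the search result to the insertion index.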

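About SparseArray's gc Method

Note that gc runs only from the following methods: put and append (when garbage is pending and the arrays are full), size, keyAt, valueAt, setValueAt, indexOfKey, and indexOfValue. It never runs from remove or get. This is the removal optimization: what could have been many compaction passes is folded into one, as the next point explains.
The DELETED marker and the mGarbage flag together reduce the number of gc passes. If every remove compacted the arrays (looping over them and shifting elements), performance would suffer. With the optimization, several removes only set markers, and when gc is finally triggered a single loop compacts everything.
gc has exactly one purpose: compacting the arrays by dropping invalid entries. The code walks every slot in use and drops those whose value is DELETED. The neat part is that it works in place on the same arrays to both remove and move elements. Note that keys and mKeys refer to the same array (likewise values and mValues), so writing through keys and values changes the arrays held by mKeys and mValues.
After each gc pass, the first mSize slots of the arrays contain no invalid entries. This is the essence of the sparse array.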
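
To make the saving concrete, the following sketch counts element moves for two ways of removing the first k of n entries from an array: shifting the tail after every removal, versus marking the entries and compacting once. The class LazyDeleteCost and its method names are hypothetical, and removing from the front is chosen as an illustrative worst case for eager removal. It counts copies; it is not a benchmark.

```java
final class LazyDeleteCost {
    private static final Object DELETED = new Object();

    /** Removes the first k of n entries one at a time, shifting the tail each time. */
    static long eagerMoves(int n, int k) {
        Object[] values = filled(n);
        int size = n;
        long moves = 0;
        for (int r = 0; r < k; r++) {
            System.arraycopy(values, 1, values, 0, size - 1);  // close the gap at index 0
            moves += size - 1;
            values[--size] = null;                             // clear the vacated last slot
        }
        return moves;
    }

    /** Marks the first k of n entries as DELETED, then compacts in a single pass. */
    static long lazyMoves(int n, int k) {
        Object[] values = filled(n);
        for (int i = 0; i < k; i++) {
            values[i] = DELETED;       // removal only marks the slot
        }
        long moves = 0;
        int out = 0;
        for (int in = 0; in < n; in++) {
            Object v = values[in];
            if (v != DELETED) {
                if (in != out) {       // copy the live entry down to the next free slot
                    values[out] = v;
                    values[in] = null;
                    moves++;
                }
                out++;
            }
        }
        return moves;
    }

    private static Object[] filled(int n) {
        Object[] a = new Object[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;                  // autoboxed Integer used as a dummy value
        }
        return a;
    }
}
```

For n = 100 and k = 10, eager removal copies 99 + 98 + ... + 90 = 945 elements, while the single compaction pass copies each of the 90 survivors once. The gap widens as more removals are batched before compaction.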

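Summary

Comparison with HashMap

For HashMap basics and source code, see the post "JAVA 面向对象和集合知识点总结（6.15更新）" (a summary of Java OOP and collections topics).
HashMap's get and put run in O(1) time on average, and the price is a large memory footprint for storing the data.
HashMap: uses more memory, performs well.
SparseArray: saves memory, performs somewhat worse (per the class documentation, less than 50% difference for up to a few hundred items).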

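When to Use It

When the keys are integers and the number of key/value pairs is not too large, SparseArray saves memory at only a small performance cost. That makes it a very good fit for mobile devices, where memory is precious.

So on Android, go ahead and use SparseArray<E> sparseArray = new SparseArray<E>(); in place of

HashMap<Integer, E> hashMap = new HashMap<Integer, E>()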

    Original author: 牧之丶
    Original URL: https://blog.csdn.net/zhoubin1992/article/details/48269985