HashMap及HashTable源码解析
HashMap在java和Android经常使用到,之前学过数据结构,理解了它的原理,却很少花时间去阅读它的源码,今天索性对进行分析。
本篇文章主要分析HashMap与HashTable的实现原理及区别,关于ConcurrentHashMap本篇文章暂时不分析,有兴趣了解的课参考 这篇文章:http://www.cnblogs.com/ITtangtang/p/3948786.html,本人觉得不错。
转载请注明原博客地址:http://blog.csdn.net/gdutxiaoxu/article/details/51492390
一、HashTable与HashMap的区别
HashTable和HashMap采用相同的存储机制,二者的实现基本一致,不同的是:
1、HashMap是非线程安全的,HashTable是线程安全的,内部的方法基本都是synchronized。
2、HashTable不允许有null值的存在。
在HashTable中调用put方法时,如果key为null,直接抛出NullPointerException。其它细微的差别还有,比如初始化Entry数组的大小等等,但基本思想和HashMap一样
二:下面我们将从源码的角度来分析HashMap:
阅读源码之前我们先了解两个基本知识
1)它是一种用键映射值的数据结构,在源码里面的体现是用静态内部类Entry封装
[java] view plain copy
- static class Entry<K,V> implements Map.Entry<K,V> {
- final K key;
- V value;
- Entry<K,V> next;
- int hash;
- /**
- * Creates new entry.
- */
- Entry(int h, K k, V v, Entry<K,V> n) {
- value = v;
- next = n;
- key = k;
- hash = h;
- }
- }
2)里面是怎样实现存储多个节点的?其实就是用数组来存储的 Entry<K,V>[] table
[java] view plain copy
- static final Entry<?,?>[] EMPTY_TABLE = {};
- /**
- * The table, resized as necessary. Length MUST Always be a power of two.
- */
- transient Entry<K,V>[] table = (Entry<K,V>[]) EMPTY_TABLE;
- /**
- * The number of key-value mappings contained in this map.
- */
- transient int size;
- /**
- * The next size value at which to resize (capacity * load factor).
- * @serial
- */
- // If table == EMPTY_TABLE then this is the initial capacity at which the
- // table will be created when inflated.
- int threshold;
下面讲解几个比较重要的方法
1)当我们进行put操作的时候
先贴出源码
[java] view plain copy
- public V put(K key, V value) {
- if (key == null)
- return putForNullKey(value); //处理null值
- int hash = hash(key.hashCode());//计算hash
- int i = indexFor(hash, table.length);//计算在数组中的存储位置
- //遍历table[i]位置的链表,查找相同的key,若找到则使用新的value替换掉原来的oldValue并返回oldValue
- for (Entry<K,V> e = table[i]; e != null; e = e.next) {
- Object k;
- if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
- V oldValue = e.value;
- e.value = value;
- e.recordAccess(this);
- return oldValue;
- }
- }
- //若没有在table[i]位置找到相同的key,则添加key到table[i]位置,新的元素总是在table[i]位置的第一个元素,原来的元素后移
- modCount++;
- addEntry(hash, key, value, i);
- return null;
- }
a HashMap会对null值key进行特殊处理,总是放到table[0]位置
b计算has值
c 找到在table数组中的索引
1)遍历table[i]位置的链表,查找相同的key,若找到则使用新的value替换掉原来的oldValue并返回oldValue
2)若没有在table[i]位置找到相同的key,则添加key到table[i]位置,新的元素总是在table[i]位置的第一个元素,原来的元素后调用addEntry方法
[java] view plain copy
- void addEntry(int hash, K key, V value, int bucketIndex) {
- if ((size >= threshold) && (null != table[bucketIndex])) {
- resize(2 * table.length);
- hash = (null != key) ? hash(key) : 0;
- bucketIndex = indexFor(hash, table.length);
- }
- createEntry(hash, key, value, bucketIndex);
- }
[java] view plain copy
- void createEntry(int hash, K key, V value, int bucketIndex) {
- Entry<K,V> e = table[bucketIndex];
- table[bucketIndex] = new Entry<>(hash, key, value, e);
- size++;
- }
所做的工作就是:
1)判断当元素数量达到临界值(capactiy*factor)时,则进行扩容,是table数组长度变为table.length*2
2)当table[index]已存在其它元素时,会在table[index]位置形成一个链表,将新添加的元素放在table[index],原来的元素通过Entry的next进行链接,这样以链表形式解决hash冲突问题,
2)get方法
同样当key为null时会进行特殊处理,在table[0]的链表上查找key为null的元素
[java] view plain copy
- public V get(Object key) {
- if (key == null)
- return getForNullKey();
- Entry<K,V> entry = getEntry(key);
- return null == entry ? null : entry.getValue();
- }
get的过程是先计算hash然后通过hash与table.length取摸计算index值,然后遍历table[index]上的链表,直到找到key,
然后返回,找不到返回null
[java] view plain copy
- final Entry<K,V> getEntry(Object key) {
- if (size == 0) {
- return null;
- }
- int hash = (key == null) ? 0 : hash(key);
- for (Entry<K,V> e = table[indexFor(hash, table.length)];
- e != null;
- e = e.next) {
- Object k;
- if (e.hash == hash &&
- ((k = e.key) == key || (key != null && key.equals(k))))
- return e;
- }
- return null;
- }
3)remove方法
remove方法和put get类似,计算hash,计算index,然后遍历查找,将找到的元素从table[index]链表移除
[java] view plain copy
- public V remove(Object key) {
- Entry<K,V> e = removeEntryForKey(key);
- return (e == null ? null : e.value);
- }
[java] view plain copy
- final Entry<K,V> removeEntryForKey(Object key) {
- if (size == 0) {
- return null;
- }
- int hash = (key == null) ? 0 : hash(key);
- int i = indexFor(hash, table.length);
- Entry<K,V> prev = table[i];
- Entry<K,V> e = prev;
- while (e != null) {
- Entry<K,V> next = e.next;
- Object k;
- if (e.hash == hash &&
- ((k = e.key) == key || (key != null && key.equals(k)))) {
- modCount++;
- size–;
- if (prev == e)
- table[i] = next;
- else
- prev.next = next;
- e.recordRemoval(this);
- return e;
- }
- prev = e;
- e = next;
- }
- return e;
- }
4)clear()方法
clear方法非常简单,就是遍历table然后把每个位置置为null,同时修改元素个数为0
需要注意的是clear方法只会清楚里面的元素,并不会重置capactiy
[java] view plain copy
- public void clear() {
- modCount++;
- Arrays.fill(table, null);
- size = 0;
- }
5)containsKey和containsValue
containsKey方法是先计算hash然后使用hash和table.length取摸得到index值,遍历table[index]元素查找是否包含key相同的值
[java] view plain copy
- public boolean containsKey(Object key) {
- return getEntry(key) != null;
- }
[java] view plain copy
- final Entry<K,V> getEntry(Object key) {
- if (size == 0) {
- return null;
- }
- int hash = (key == null) ? 0 : hash(key);
- for (Entry<K,V> e = table[indexFor(hash, table.length)];
- e != null;
- e = e.next) {
- Object k;
- if (e.hash == hash &&
- ((k = e.key) == key || (key != null && key.equals(k))))
- return e;
- }
- return null;
- }
6)containsValue方法就比较粗暴了,就是直接遍历所有元素直到找到value,由此可见HashMap的containsValue方法本质上和普通数组和list的contains方法没什么区别,你别指望它会像containsKey那么高效
[java] view plain copy
- public boolean containsValue(Object value) {
- // Same idea as size()
- if (value == null)
- throw new NullPointerException();
- final Segment<K,V>[] segments = this.segments;
- boolean found = false;
- long last = 0;
- int retries = –1;
- try {
- outer: for (;;) {
- if (retries++ == RETRIES_BEFORE_LOCK) {
- for (int j = 0; j < segments.length; ++j)
- ensureSegment(j).lock(); // force creation
- }
- long hashSum = 0L;
- int sum = 0;
- for (int j = 0; j < segments.length; ++j) {
- HashEntry<K,V>[] tab;
- Segment<K,V> seg = segmentAt(segments, j);
- if (seg != null && (tab = seg.table) != null) {
- for (int i = 0 ; i < tab.length; i++) {
- HashEntry<K,V> e;
- for (e = entryAt(tab, i); e != null; e = e.next) {
- V v = e.value;
- if (v != null && value.equals(v)) {
- found = true;
- break outer;
- }
- }
- }
- sum += seg.modCount;
- }
- }
- if (retries > 0 && sum == last)
- break;
- last = sum;
- }
- } finally {
- if (retries > RETRIES_BEFORE_LOCK) {
- for (int j = 0; j < segments.length; ++j)
- segmentAt(segments, j).unlock();
- }
- }
- return found;
- }
2 HashTable源码分析
1)构造方法有三个 ,带零个、一个、两个参数的,最终都会调用两个参数的构造方法
其思想也比较简单,检查参数的合法性
public Hashtable(int initialCapacity, float loadFactor) {
if (initialCapacity < 0)
throw new IllegalArgumentException("Illegal Capacity: "+
initialCapacity);
if (loadFactor <= 0 || Float.isNaN(loadFactor))
throw new IllegalArgumentException("Illegal Load: "+loadFactor);
if (initialCapacity==0)
initialCapacity = 1;
this.loadFactor = loadFactor;
table = new Entry[initialCapacity];
threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
initHashSeedAsNeeded(initialCapacity);
}
2)put操作,直接在方法的前面加上synchronized 关键字,简单粗暴,在并发激烈的情况下,效率较低,所以才会有
ConcurrentHashMap的诞生。
public synchronized V put(K key, V value) {
// Make sure the value is not null
if (value == null) {
throw new NullPointerException();
}
// Makes sure the key is not already in the hashtable.
Entry tab[] = table;
int hash = hash(key);
int index = (hash & 0x7FFFFFFF) % tab.length;
for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
if ((e.hash == hash) && e.key.equals(key)) {
V old = e.value;
e.value = value;
return old;
}
}
modCount++;
if (count >= threshold) {
// Rehash the table if the threshold is exceeded
rehash();
tab = table;
hash = hash(key);
index = (hash & 0x7FFFFFFF) % tab.length;
}
// Creates the new entry.
Entry<K,V> e = tab[index];
tab[index] = new Entry<>(hash, key, value, e);
count++;
return null;
}
3)get操作也一样,直接在访问该方法的时候上锁,确保同一个时刻只有一个线程能访问
public synchronized V get(Object key) {
Entry tab[] = table;
int hash = hash(key);
int index = (hash & 0x7FFFFFFF) % tab.length;
for (Entry<K,V> e = tab[index] ; e != null ; e = e.next) {
if ((e.hash == hash) && e.key.equals(key)) {
return e.value;
}
}
return null;
}
4)其他方法就不说了,基本更HashMap里面的方法一样,只是加上同步机制而已
3自己实现的一个简单HashMap
public class MyHashMap {
//默认初始化大小 16
private static final int DEFAULT_INITIAL_CAPACITY = 16;
//默认负载因子 0.75
private static final float DEFAULT_LOAD_FACTOR = 0.75f;
//临界值
private int threshold;
//元素个数
private int size;
//扩容次数
private int resize;
private HashEntry[] table;
public MyHashMap() {
table = new HashEntry[DEFAULT_INITIAL_CAPACITY];
threshold = (int) (DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);
size = 0;
}
private int index(Object key) {
//根据key的hashcode和table长度取模计算key在table中的位置
return key.hashCode() % table.length;
}
public void put(Object key, Object value) {
//key为null时需要特殊处理,为简化实现忽略null值
if (key == null) return;
int index = index(key);
//遍历index位置的entry,若找到重复key则更新对应entry的值,然后返回
HashEntry entry = table[index];
while (entry != null) {
if (entry.getKey().hashCode() == key.hashCode() && (entry.getKey() == key || entry.getKey().equals(key))) {
entry.setValue(value);
return;
}
entry = entry.getNext();
}
//若index位置没有entry或者未找到重复的key,则将新key添加到table的index位置
add(index, key, value);
}
private void add(int index, Object key, Object value) {
//将新的entry放到table的index位置第一个,若原来有值则以链表形式存放
HashEntry entry = new HashEntry(key, value, table[index]);
table[index] = entry;
//判断size是否达到临界值,若已达到则进行扩容,将table的capacicy翻倍
if (size++ >= threshold) {
resize(table.length * 2);
}
}
private void resize(int capacity) {
if (capacity <= table.length) return;
HashEntry[] newTable = new HashEntry[capacity];
//遍历原table,将每个entry都重新计算hash放入newTable中
for (int i = 0; i < table.length; i++) {
HashEntry old = table[i];
while (old != null) {
HashEntry next = old.getNext();
int index = index(old.getKey());
old.setNext(newTable[index]);
newTable[index] = old;
old = next;
}
}
//用newTable替table
table = newTable;
//修改临界值
threshold = (int) (table.length * DEFAULT_LOAD_FACTOR);
resize++;
}
public Object get(Object key) {
//这里简化处理,忽略null值
if (key == null) return null;
HashEntry entry = getEntry(key);
return entry == null ? null : entry.getValue();
}
public HashEntry getEntry(Object key) {
HashEntry entry = table[index(key)];
while (entry != null) {
if (entry.getKey().hashCode() == key.hashCode() && (entry.getKey() == key || entry.getKey().equals(key))) {
return entry;
}
entry = entry.getNext();
}
return null;
}
public void remove(Object key) {
if (key == null) return;
int index = index(key);
HashEntry pre = null;
HashEntry entry = table[index];
while (entry != null) {
if (entry.getKey().hashCode() == key.hashCode() && (entry.getKey() == key || entry.getKey().equals(key))) {
if (pre == null) table[index] = entry.getNext();
else pre.setNext(entry.getNext());
//如果成功找到并删除,修改size
size--;
return;
}
pre = entry;
entry = entry.getNext();
}
}
public boolean containsKey(Object key) {
if (key == null) return false;
return getEntry(key) != null;
}
public int size() {
return this.size;
}
public void clear() {
for (int i = 0; i < table.length; i++) {
table[i] = null;
}
this.size = 0;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
sb.append(String.format("size:%s capacity:%s resize:%s\n\n", size, table.length, resize));
for (HashEntry entry : table) {
while (entry != null) {
sb.append(entry.getKey() + ":" + entry.getValue() + "\n");
entry = entry.getNext();
}
}
return sb.toString();
}
}
class HashEntry {
private final Object key;
private Object value;
private HashEntry next;
public HashEntry(Object key, Object value, HashEntry next) {
this.key = key;
this.value = value;
this.next = next;
}
public Object getKey() {
return key;
}
public Object getValue() {
return value;
}
public void setValue(Object value) {
this.value = value;
}
public HashEntry getNext() {
return next;
}
public void setNext(HashEntry next) {
this.next = next;
}
}