Hashtable Source Code Analysis

June 27, 2019  Source: xingfeng_coder

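Like HashMap, Hashtable is a hash table. Unlike HashMap, it allows neither null keys nor null values, and it is thread-safe: nearly all of its public methods are declared synchronized. The class hierarchy is shown below:
[Figure: Hashtable class hierarchy]
As the figure shows, Hashtable extends Dictionary, whereas HashMap extends AbstractMap, so the two classes descend from different ancestors. This article focuses on the similarities and differences between Hashtable and HashMap.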
Readers who are unfamiliar with HashMap can refer to the following two articles:
1. JDK1.8 HashMap Source Code Analysis
2. JDK1.7 HashMap Source Code Analysis

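Construction

Underlying structure

In JDK1.8, HashMap is backed by an array plus linked lists plus red-black trees; in JDK1.7, it is backed by an array plus linked lists. Hashtable is backed by an array plus linked lists. All Hashtable source code in this article is taken from JDK1.8.
Because Hashtable and the JDK1.7 HashMap both use the array-plus-linked-list structure, this article compares the JDK1.8 Hashtable with the JDK1.7 HashMap.

Initial capacity and load factor

Like HashMap, Hashtable has two parameters that affect performance, the initial capacity and the load factor, and its default load factor is also 0.75.

Constructors

The constructors of Hashtable are as follows: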

    public Hashtable(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal Capacity: "+
                                               initialCapacity);
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal Load: "+loadFactor);

        if (initialCapacity==0)
            initialCapacity = 1;
        this.loadFactor = loadFactor;
        table = new Entry<?,?>[initialCapacity];
        threshold = (int)Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }


    public Hashtable(int initialCapacity) {
        this(initialCapacity, 0.75f);
    }


    public Hashtable() {
        this(11, 0.75f);
    }


    public Hashtable(Map<? extends K, ? extends V> t) {
        this(Math.max(2*t.size(), 11), 0.75f);
        putAll(t);
    }

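Both Hashtable and HashMap set the initial capacity and the load factor in their constructors. They differ in two ways:
1. HashMap creates its backing array lazily, on the first insertion, whereas Hashtable creates the array in the constructor.
2. The default initial capacity of HashMap is 16, and its capacity must be a power of two; the default initial capacity of Hashtable is 11, and it need not be a power of two.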
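
The threshold arithmetic in the constructor can be tried in isolation. The following is a minimal sketch, not the JDK code itself: the class name ThresholdSketch and the helper threshold() are made up for illustration, and MAX_ARRAY_SIZE is assumed to be Integer.MAX_VALUE - 8, as in the JDK1.8 Hashtable source.

```java
// Sketch only: mirrors the threshold arithmetic of the Hashtable constructor shown above.
class ThresholdSketch {
    // Assumed to match the private constant in java.util.Hashtable (JDK1.8).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Hypothetical helper: the entry count at which the table will be rehashed.
    static int threshold(int initialCapacity, float loadFactor) {
        if (initialCapacity == 0) {
            initialCapacity = 1; // the constructor bumps a capacity of 0 up to 1
        }
        return (int) Math.min(initialCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
    }

    public static void main(String[] args) {
        System.out.println(threshold(11, 0.75f)); // default Hashtable: 11 * 0.75 = 8.25, truncated
        System.out.println(threshold(16, 0.75f)); // the HashMap default capacity, for comparison
    }
}
```

With the defaults, the table is rehashed once it already holds 8 entries and another new key is added.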

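Basic operations

As a hash table, Hashtable supports three basic operations: inserting a key-value pair, looking up a value by key, and removing a key-value pair. Each is analyzed below.

put(K k,V v)

The implementation of put is as follows: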

    public synchronized V put(K key, V value) {
        // Null values are not allowed
        if (value == null) {
            throw new NullPointerException();
        }

        // Makes sure the key is not already in the hashtable.
        Entry<?,?> tab[] = table;
        // Compute the key's hash (a null key throws NullPointerException here)
        int hash = key.hashCode();
        // Map the hash to a bucket index in the array
        int index = (hash & 0x7FFFFFFF) % tab.length;
        // Fetch the head node of the bucket's linked list
        @SuppressWarnings("unchecked")
        Entry<K,V> entry = (Entry<K,V>)tab[index];
        // Walk the list from the head
        for(; entry != null ; entry = entry.next) {
            // If both the hash and the key match, replace the old value
            if ((entry.hash == hash) && entry.key.equals(key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        // No matching key was found, so add a new entry
        addEntry(hash, key, value, index);
        return null;
    }
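
The null handling and the return value of put() can be observed from the outside. The sketch below uses only the public java.util API; the class PutBehaviorSketch and the helper putThrowsNpe() are hypothetical names introduced for this illustration.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class PutBehaviorSketch {
    // Hypothetical helper: reports whether put(key, value) throws NullPointerException.
    static boolean putThrowsNpe(Map<String, String> map, String key, String value) {
        try {
            map.put(key, value);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable<String, String> table = new Hashtable<>();
        System.out.println(table.put("k", "v1")); // no previous mapping, so null is returned
        System.out.println(table.put("k", "v2")); // the replaced value "v1" is returned

        System.out.println(putThrowsNpe(table, "k", null));            // explicit null-value check
        System.out.println(putThrowsNpe(table, null, "v"));            // key.hashCode() on a null key
        System.out.println(putThrowsNpe(new HashMap<>(), null, null)); // HashMap accepts both
    }
}
```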

Next, look at the addEntry method, which is implemented as follows:

    private void addEntry(int hash, K key, V value, int index) {
        modCount++;

        Entry<?,?> tab[] = table;
        // If the entry count has reached the threshold, rehash
        if (count >= threshold) {
            // Rehash the table if the threshold is exceeded
            rehash();

            tab = table;
            hash = key.hashCode();
            index = (hash & 0x7FFFFFFF) % tab.length;
        }

        // Creates the new entry.
        @SuppressWarnings("unchecked")
        Entry<K,V> e = (Entry<K,V>) tab[index];
        tab[index] = new Entry<>(hash, key, value, e);
        count++;
    }

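As the code above shows, when an entry is inserted and the entry count has already reached the resize threshold, rehash() is called first, and the new entry is then inserted at the head of the bucket's linked list. This matches the JDK1.7 HashMap: the new entry always becomes the head node of its bucket.
Now look at rehash(). It first enlarges the array and then moves the entries from the old table into the new one. Its implementation is as follows: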

    protected void rehash() {
        int oldCapacity = table.length;
        Entry<?,?>[] oldMap = table;

        // Grow the table: newCapacity = 2 * oldCapacity + 1
        int newCapacity = (oldCapacity << 1) + 1;
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            if (oldCapacity == MAX_ARRAY_SIZE)
                // Keep running with MAX_ARRAY_SIZE buckets
                return;
            newCapacity = MAX_ARRAY_SIZE;
        }
        Entry<?,?>[] newMap = new Entry<?,?>[newCapacity];

        modCount++;
        threshold = (int)Math.min(newCapacity * loadFactor, MAX_ARRAY_SIZE + 1);
        table = newMap;

        //rehash
        for (int i = oldCapacity ; i-- > 0 ;) {
            for (Entry<K,V> old = (Entry<K,V>)oldMap[i] ; old != null ; ) {
                Entry<K,V> e = old;
                old = old.next;

                int index = (e.hash & 0x7FFFFFFF) % newCapacity;
                e.next = (Entry<K,V>)newMap[index];
                newMap[index] = e;
            }
        }
    }

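rehash() consists of two steps:
1. Grow the table. The growth policy is newCapacity = 2 * oldCapacity + 1.
2. Rehash. The bucket index of each entry is recomputed, and the entry is linked in as the head node of its new bucket.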
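
The growth policy in step 1 can be sketched on its own. The class CapacityGrowthSketch and the helper nextCapacity() are illustrative names, and MAX_ARRAY_SIZE is again assumed to be Integer.MAX_VALUE - 8; the overflow-aware comparison is copied from the rehash() code above.

```java
class CapacityGrowthSketch {
    // Assumed to match the private constant in java.util.Hashtable (JDK1.8).
    static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;

    // Hypothetical helper: the capacity rehash() would pick for a table of oldCapacity buckets.
    static int nextCapacity(int oldCapacity) {
        int newCapacity = (oldCapacity << 1) + 1;
        // The subtraction form stays correct even if the shift above overflowed.
        if (newCapacity - MAX_ARRAY_SIZE > 0) {
            return MAX_ARRAY_SIZE; // also covers the "already at the maximum" case
        }
        return newCapacity;
    }

    public static void main(String[] args) {
        int capacity = 11; // default initial capacity
        for (int i = 0; i < 4; i++) {
            System.out.print(capacity + " ");
            capacity = nextCapacity(capacity);
        }
        System.out.println(capacity); // the sequence starts 11 23 47 95 191
    }
}
```

Starting from 11, the capacities stay odd, which is one reason the bucket index has to be computed with a true modulo rather than a bit mask.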

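The put method above differs from the JDK1.7 HashMap in several ways:
1. Hashtable's put() is thread-safe, whereas HashMap's put() is not.
2. HashMap allows null for both keys and values; Hashtable allows null for neither.
3. The hash is computed differently. Hashtable uses the key's hashCode() directly as the hash, whereas HashMap derives the hash from the key's hashCode() through additional mixing, as follows: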

    final int hash(Object k) {
        int h = hashSeed; // 0 by default
        if (0 != h && k instanceof String) {
            return sun.misc.Hashing.stringHash32((String) k);
        }

        h ^= k.hashCode();

        // This function ensures that hashCodes that differ only by
        // constant multiples at each bit position have a bounded
        // number of collisions (approximately 8 at default load factor).
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

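In addition, when hashSeed changes in HashMap, the same key produces a different hash value.
4. The bucket index is computed differently. Because the number of buckets in HashMap must be a power of two, the bucket index is obtained as follows: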

    static int indexFor(int h, int length) {
        // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
        return h & (length-1);
    }

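Because length is a power of two, this bit mask yields the non-negative remainder of the hash modulo the length. Hashtable instead clears the sign bit with hash & 0x7FFFFFFF, so that the value is non-negative, and then applies a true modulo (%), which works for any table length.
5. The growth policy differs. Hashtable grows with newCapacity = oldCapacity * 2 + 1, whereas HashMap grows with newCapacity = 2 * oldCapacity.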
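
The two index computations from item 4 can be compared side by side. This is a small sketch with made-up names (BucketIndexSketch, hashtableIndex, hashMapIndex); the expressions themselves are the ones quoted from the JDK source above.

```java
class BucketIndexSketch {
    // Hashtable style: clear the sign bit, then take a true modulo. Works for any length > 0.
    static int hashtableIndex(int hash, int length) {
        return (hash & 0x7FFFFFFF) % length;
    }

    // JDK1.7 HashMap style: bit mask. Only meaningful when length is a power of two.
    static int hashMapIndex(int hash, int length) {
        return hash & (length - 1);
    }

    public static void main(String[] args) {
        int negativeHash = -7;
        System.out.println(negativeHash % 11);                // plain % keeps the sign: -7
        System.out.println(hashtableIndex(negativeHash, 11)); // 6, a valid index into 11 buckets
        System.out.println(hashMapIndex(negativeHash, 16));   // 9, a valid index into 16 buckets
    }
}
```

The first line printed shows why the mask is needed: without it, a negative hashCode() would produce a negative array index.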

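The put methods of HashMap and Hashtable share two traits:
1. A new entry always becomes the head node of its bucket.
2. During a rehash, the order of the entries in a bucket's linked list is reversed.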
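
The reversal in item 2 follows directly from head insertion. The sketch below uses a stripped-down, hypothetical Node class instead of the real Entry class and assumes that every node of the old chain lands in the same new bucket.

```java
class HeadInsertionSketch {
    // Hypothetical stand-in for Hashtable.Entry, reduced to a key and a next pointer.
    static final class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Moves every node of the old chain onto a new chain by head insertion,
    // the same way the inner loop of rehash() relinks entries.
    static Node transfer(Node old) {
        Node newHead = null;
        while (old != null) {
            Node e = old;
            old = old.next;
            e.next = newHead;
            newHead = e;
        }
        return newHead;
    }

    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        System.out.println(describe(chain));           // A->B->C
        System.out.println(describe(transfer(chain))); // C->B->A
    }
}
```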

get(K k) operation

Hashtable's get() method returns the value for a given key. Its implementation is as follows:

    public synchronized V get(Object key) {
        Entry<?,?> tab[] = table;
        int hash = key.hashCode();
        int index = (hash & 0x7FFFFFFF) % tab.length;
        for (Entry<?,?> e = tab[index] ; e != null ; e = e.next) {
            if ((e.hash == hash) && e.key.equals(key)) {
                return (V)e.value;
            }
        }
        return null;
    }

This implementation follows the same logic as HashMap's; only the hash computation and the bucket index computation differ.
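
One visible consequence of calling key.hashCode() without a null check is that get(null) throws on a Hashtable, while HashMap simply looks up its null key. A minimal sketch, with GetBehaviorSketch and getNullThrowsNpe() as hypothetical names:

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class GetBehaviorSketch {
    // Hypothetical helper: reports whether get(null) throws NullPointerException.
    static boolean getNullThrowsNpe(Map<String, Integer> map) {
        try {
            map.get(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        System.out.println(table.get("a"));       // 1
        System.out.println(table.get("missing")); // null: no entry matched in the bucket
        System.out.println(getNullThrowsNpe(table));                            // true
        System.out.println(getNullThrowsNpe(new HashMap<String, Integer>()));   // false
    }
}
```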

remove(Object o) operation

Hashtable's remove() method removes a key-value pair by key. Its implementation is as follows:

    public synchronized V remove(Object key) {
        Entry<?,?> tab[] = table;
        // Compute the hash
        int hash = key.hashCode();
        // Derive the bucket index
        int index = (hash & 0x7FFFFFFF) % tab.length;
        @SuppressWarnings("unchecked")
        Entry<K,V> e = (Entry<K,V>)tab[index];
        // Walk the bucket's linked list
        for(Entry<K,V> prev = null ; e != null ; prev = e, e = e.next) {
            // On a match, unlink the node
            if ((e.hash == hash) && e.key.equals(key)) {
                modCount++;
                if (prev != null) {
                    prev.next = e.next;
                } else {
                    tab[index] = e.next;
                }
                count--;
                V oldValue = e.value;
                e.value = null;
                return oldValue;
            }
        }
        return null;
    }

As shown, removal first computes the hash, derives the bucket index, and then walks the bucket's linked list, which is the same approach HashMap uses.
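
From the caller's side, remove() returns the removed value, or null when no entry matched. A short sketch using only the public API; the class name RemoveBehaviorSketch and the helper sample() are illustrative.

```java
import java.util.Hashtable;

class RemoveBehaviorSketch {
    // Hypothetical helper: builds a small table to experiment with.
    static Hashtable<String, Integer> sample() {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        return table;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = sample();
        System.out.println(table.remove("a")); // 1: the entry is unlinked and its value returned
        System.out.println(table.remove("a")); // null: the key is no longer present
        System.out.println(table.size());      // 1: count was decremented once
    }
}
```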

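Iterators

Hashtable does not implement the Iterable interface, so a Hashtable object cannot itself be the target of a for-each loop. Hashtable has existed since JDK1.0, which is why it offers keys() to enumerate its keys as an Enumeration; values() returns the values as a Collection. The keys() method is implemented as follows: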

    public synchronized Enumeration<K> keys() {
        return this.<K>getEnumeration(KEYS);
    }
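
For readers who have not used the legacy API, traversing the result of keys() looks like the sketch below. The class KeysEnumerationSketch and the helper collectKeys() are hypothetical; a TreeSet is used only to make the printed order predictable, since Hashtable does not guarantee any iteration order.

```java
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.Set;
import java.util.TreeSet;

class KeysEnumerationSketch {
    // Hypothetical helper: drains keys() into a sorted set.
    static Set<String> collectKeys(Hashtable<String, Integer> table) {
        Set<String> keys = new TreeSet<>();
        Enumeration<String> e = table.keys();
        while (e.hasMoreElements()) {
            keys.add(e.nextElement());
        }
        return keys;
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("b", 2);
        table.put("a", 1);
        table.put("c", 3);
        System.out.println(collectKeys(table)); // [a, b, c]
    }
}
```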

Enumeration is an interface similar to Iterator and can be used to traverse the elements. Next, look at the getEnumeration(int type) method, which is implemented as follows:

    private <T> Enumeration<T> getEnumeration(int type) {
        if (count == 0) {
            return Collections.emptyEnumeration();
        } else {
            return new Enumerator<>(type, false);
        }
    }

As shown, when the hash table is not empty, an Enumerator object is returned. The class is defined as follows:

    private class Enumerator<T> implements Enumeration<T>, Iterator<T> {
        Entry<?,?>[] table = Hashtable.this.table;
        int index = table.length;
        Entry<?,?> entry;
        Entry<?,?> lastReturned;
        int type;

        /**
         * Indicates whether this Enumerator is serving as an Iterator
         * or an Enumeration.  (true -> Iterator).
         */
        boolean iterator;

        /**
         * The modCount value that the iterator believes that the backing
         * Hashtable should have.  If this expectation is violated, the iterator
         * has detected concurrent modification.
         */
        protected int expectedModCount = modCount;

        Enumerator(int type, boolean iterator) {
            this.type = type;
            this.iterator = iterator;
        }

        public boolean hasMoreElements() {
            Entry<?,?> e = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (e == null && i > 0) {
                e = t[--i];
            }
            entry = e;
            index = i;
            return e != null;
        }

        @SuppressWarnings("unchecked")
        public T nextElement() {
            Entry<?,?> et = entry;
            int i = index;
            Entry<?,?>[] t = table;
            /* Use locals for faster loop iteration */
            while (et == null && i > 0) {
                et = t[--i];
            }
            entry = et;
            index = i;
            if (et != null) {
                Entry<?,?> e = lastReturned = entry;
                entry = e.next;
                return type == KEYS ? (T)e.key : (type == VALUES ? (T)e.value : (T)e);
            }
            throw new NoSuchElementException("Hashtable Enumerator");
        }

        // Iterator methods
        public boolean hasNext() {
            return hasMoreElements();
        }

        public T next() {
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();
            return nextElement();
        }

        public void remove() {
            if (!iterator)
                throw new UnsupportedOperationException();
            if (lastReturned == null)
                throw new IllegalStateException("Hashtable Enumerator");
            if (modCount != expectedModCount)
                throw new ConcurrentModificationException();

            synchronized(Hashtable.this) {
                Entry<?,?>[] tab = Hashtable.this.table;
                int index = (lastReturned.hash & 0x7FFFFFFF) % tab.length;

                @SuppressWarnings("unchecked")
                Entry<K,V> e = (Entry<K,V>)tab[index];
                for(Entry<K,V> prev = null; e != null; prev = e, e = e.next) {
                    if (e == lastReturned) {
                        modCount++;
                        expectedModCount++;
                        if (prev == null)
                            tab[index] = e.next;
                        else
                            prev.next = e.next;
                        count--;
                        lastReturned = null;
                        return;
                    }
                }
                throw new ConcurrentModificationException();
            }
        }
    }

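The class implements both the Enumeration interface and the Iterator interface, and the constructor argument specifies whether the Iterator methods may be used. The Enumeration interface declares the following methods: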

public interface Enumeration<E> {

    boolean hasMoreElements();


    E nextElement();
}

The Iterator interface is defined as follows:

public interface Iterator<E> {

    boolean hasNext();


    E next();

    default void remove() {
        throw new UnsupportedOperationException("remove");
    }
}

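The two interfaces are essentially equivalent. In the Enumerator implementation, apart from remove(), the other two Iterator methods delegate to the Enumeration implementation (next() first checks modCount), and remove() works only when the iterator flag is true; otherwise it throws UnsupportedOperationException. The keys() call path passes false for this flag, so when is it true?
One case is the collection returned by values(), which is implemented as follows: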

    public Collection<V> values() {
        if (values==null)
            values = Collections.synchronizedCollection(new ValueCollection(),
                                                        this);
        return values;
    }

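Because values() returns a Collection, it must support for-each traversal, and because Hashtable is thread-safe, values() wraps the ValueCollection with Collections.synchronizedCollection() to synchronize it. The ValueCollection class is defined as follows: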

    private class ValueCollection extends AbstractCollection<V> {
        public Iterator<V> iterator() {
            return getIterator(VALUES);
        }
        public int size() {
            return count;
        }
        public boolean contains(Object o) {
            return containsValue(o);
        }
        public void clear() {
            Hashtable.this.clear();
        }
    }

The method to focus on is iterator(), which calls getIterator(), shown below:

    private <T> Iterator<T> getIterator(int type) {
        if (count == 0) {
            return Collections.emptyIterator();
        } else {
            return new Enumerator<>(type, true);
        }
    }

Here the second argument to the Enumerator constructor is true.
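
The effect of the iterator flag can be observed through values(). The sketch below assumes a non-empty table (an empty one returns Collections.emptyIterator() instead of an Enumerator); ValuesIteratorSketch and its helpers are hypothetical names, and the key used for the mid-iteration put is assumed not to be in the table already.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;

class ValuesIteratorSketch {
    // Removes whichever value the iterator returns first and reports the remaining size.
    static int removeFirstViaIterator(Hashtable<String, Integer> table) {
        Iterator<Integer> it = table.values().iterator();
        it.next();
        it.remove(); // allowed because this Enumerator was created with iterator == true
        return table.size();
    }

    // Adds a new key while iterating and reports whether next() then fails fast.
    static boolean nextFailsAfterPut(Hashtable<String, Integer> table) {
        Iterator<Integer> it = table.values().iterator();
        it.next();
        table.put("added-during-iteration", 0); // new key, so modCount changes
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // next() saw modCount != expectedModCount
        }
    }

    public static void main(String[] args) {
        Hashtable<String, Integer> table = new Hashtable<>();
        table.put("a", 1);
        table.put("b", 2);
        table.put("c", 3);
        System.out.println(removeFirstViaIterator(table)); // 2 entries remain
        System.out.println(nextFailsAfterPut(table));      // true
    }
}
```

The Enumeration returned by keys() offers no remove() at all, and its nextElement() does not perform the modCount check that next() does.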

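Summary

The Hashtable code in this article is from JDK1.8, and it is compared against the JDK1.7 HashMap because both are backed by an array plus linked lists. Although the two classes share the same overall structure, they differ in several major ways:
1. Hashtable is thread-safe; HashMap is not.
2. Constructors: the default initial capacity is 11 for Hashtable and 16 for HashMap.
3. The put method: the hash computation, the bucket index computation, and the rehash policy all differ.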
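
The first point can be illustrated with two threads writing disjoint keys. This is only a sketch: ConcurrentPutSketch and fillFromTwoThreads() are made-up names, and the guarantee shown covers individual method calls, not compound sequences such as check-then-put.

```java
import java.util.Hashtable;
import java.util.Map;

class ConcurrentPutSketch {
    // Two threads insert disjoint key ranges; returns the final size of the map.
    static int fillFromTwoThreads(Map<Integer, Integer> map, int perThread) {
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) map.put(i, i);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < perThread; i++) map.put(perThread + i, i);
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        // Every put() holds the table's monitor, so no insertion is lost.
        System.out.println(fillFromTwoThreads(new Hashtable<>(), 10_000)); // 20000
    }
}
```

Passing a HashMap to the same helper gives no such guarantee: entries may be lost or the map may be corrupted, and the outcome varies from run to run.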

    Original author: xingfeng_coder
    Original URL: https://blog.csdn.net/qq_19431333/article/details/76165464