How to understand the hashCode function of java.util.Hashtable in the JDK source code

June 7, 2023

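As the comment in the source puts it, "This code abuses the loadFactor field to do double-duty as a hashCode in progress flag, so as not to worsen the space performance. A negative load factor indicates that hash code computation is in progress."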

How should this passage be understood?

public synchronized int hashCode() {
    /*
     * This code detects the recursion caused by computing the hash code
     * of a self-referential hash table and prevents the stack overflow
     * that would otherwise result.  This allows certain 1.1-era
     * applets with self-referential hash tables to work.  This code
     * abuses the loadFactor field to do double-duty as a hashCode
     * in progress flag, so as not to worsen the space performance.
     * A negative load factor indicates that hash code computation is
     * in progress.
     */
    int h = 0;
    if (count == 0 || loadFactor < 0)
        return h;  // Returns zero

    loadFactor = -loadFactor;  // Mark hashCode computation in progress
    Entry[] tab = table;
    for (int i = 0; i < tab.length; i++)
        for (Entry e = tab[i]; e != null; e = e.next)
            h += e.key.hashCode() ^ e.value.hashCode();
    loadFactor = -loadFactor;  // Mark hashCode computation complete

    return h;
}

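Best answer

The point of using the load factor as an in-progress check is to make sure the code does not recurse forever when a chain of references leads back to the hash table itself. For example, imagine a hash table of type Hashtable<String,Hashtable>, i.e. a mapping from strings to other hash tables. An entry in the table might then contain a reference to that same hash table; or it might point to another hash table of the same type, which in turn points back to the first one. Because hashCode recursively computes the hash codes of the keys and values and then combines them to produce the final hash code, it would recurse without end (and eventually overflow the stack) if it did not detect circular references (cycles in the graph).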
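For contrast, here is a minimal sketch of what happens without such a guard. It uses java.util.HashMap, which as far as I know has no in-progress flag; the class and method names are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class UnguardedCycleDemo {
    // Hypothetical helper: returns true if hashing a HashMap that
    // contains itself overflows the stack.
    static boolean selfReferenceOverflows() {
        Map<String, Object> map = new HashMap<>();
        map.put("self", map); // the map now holds itself as a value
        try {
            // Hashing the map hashes its value, which is the map again, and so on.
            map.hashCode();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("HashMap overflowed: " + selfReferenceOverflows());
    }
}
```

Catching StackOverflowError is only done here to keep the demonstration alive; it is not something to rely on in real code.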

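When the code reaches a circular reference, it notices because the load factor is negative, which indicates that this hash table has already been encountered. In that case it breaks the cycle by returning 0 instead of recursing further.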
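The behaviour described above can be observed with a small sketch. The class and method names are hypothetical; only Hashtable itself comes from the JDK.

```java
import java.util.Hashtable;

class SelfReferentialHashtableDemo {
    // A table that contains itself as a value.
    static int hashOfSelfContainingTable() {
        Hashtable<String, Object> table = new Hashtable<>();
        table.put("self", table);
        // The nested hashCode() call sees the negative load factor and returns 0.
        return table.hashCode();
    }

    // Two tables that point at each other: a -> b -> a.
    static int hashOfTwoTableCycle() {
        Hashtable<String, Object> a = new Hashtable<>();
        Hashtable<String, Object> b = new Hashtable<>();
        a.put("b", b);
        b.put("a", a);
        // b is hashed normally; the re-entrant call on a returns 0.
        return a.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(hashOfSelfContainingTable() == "self".hashCode());
        System.out.println(hashOfTwoTableCycle() == ("b".hashCode() ^ "a".hashCode()));
    }
}
```

In the first case the only entry contributes "self".hashCode() ^ 0, because the re-entrant call returns 0. In the second, b hashes to "a".hashCode() ^ 0, so a hashes to "b".hashCode() ^ "a".hashCode().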

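I have done a lot of work on XEmacs, whose Lisp interpreter has similar hashing code. It uses a different trick: a recursion depth value is passed into the equivalent of the hashCode function and incremented each time the function recurses into another object. If the depth exceeds a certain limit, it refuses to recurse any further. This is less fragile than Java's trick, but it is not possible in Java, because the signature of hashCode is fixed and has no recursion depth parameter.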
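The depth-limit idea can be sketched in Java as a separate helper, since hashCode itself cannot take the extra parameter. This is only an illustration of the technique, not XEmacs's actual code; the class name, the hash method and the MAX_DEPTH value are all assumptions.

```java
import java.util.Hashtable;
import java.util.Map;

class DepthLimitedHash {
    // Hypothetical cut-off; the real limit used by XEmacs is not given here.
    static final int MAX_DEPTH = 5;

    // Hashes obj, descending into maps, but gives up below MAX_DEPTH levels.
    static int hash(Object obj, int depth) {
        if (obj == null || depth > MAX_DEPTH) {
            return 0; // refuse to recurse any further
        }
        if (obj instanceof Map) {
            int h = 0;
            for (Map.Entry<?, ?> e : ((Map<?, ?>) obj).entrySet()) {
                // Each step into a nested object increments the depth.
                h += hash(e.getKey(), depth + 1) ^ hash(e.getValue(), depth + 1);
            }
            return h;
        }
        return obj.hashCode(); // leaf objects use their ordinary hash code
    }

    public static void main(String[] args) {
        Hashtable<String, Object> table = new Hashtable<>();
        table.put("self", table); // cycle: the table contains itself
        System.out.println(hash(table, 0)); // terminates despite the cycle
    }
}
```

This version keeps no state on the object being hashed, at the cost of ignoring anything nested deeper than the limit.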