jQuery Source Code Series (6): Sizzle Compilation

August 31, 2019 · Source: songjz

You are welcome to read the rest of this series in my column.

compile

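After all this time on Sizzle, it still feels like one piece is missing. For a selector, we turn it into tokens and optimize it; the optimization steps include trimming the head and generating the seed set. We know the final matches are a subset of that seed set, so the next task seems clear: filter the seeds (or, put another way, match them).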


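The matching process is actually simple: it tests DOM elements. For a child combinator (>) or an adjacent sibling combinator (+), a failed test ends the match immediately. For a descendant combinator (space) or a general sibling combinator (~), the test is repeated along the chain until either one element matches or the candidates run out, although backtracking still has to be considered.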

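Take div > div.seq h2 ~ p, which has already been split into the corresponding tokens. Walking every seed through the whole procedure would clearly be too much work. A more sensible approach is to generate a closure for each testable token and then run the lookups in a uniform way.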

Expr.filter is what generates the matching functions. It looks roughly like this:

Expr.filter = {
  "ID": function(id){...},
  "TAG": function(nodeNameSelector){...},
  "CLASS": function(className){...},
  "ATTR": function(name, operator, check){...},
  "CHILD": function(type, what, argument, first, last){...},
  "PSEUDO": function(pseudo, argument){...}
}

Two examples make everything clear:

Expr.filter["ID"] = function( id ) {
  var attrId = id.replace( runescape, funescape );
  // Return a function here
  return function( elem ) {
    return elem.getAttribute("id") === attrId;
  };
};

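While the tokens are analyzed, a token whose type turns out to be ID has its id saved, and a checking function is returned. That function takes elem and tests whether the DOM element's id equals the id from the token. The benefit of this approach is that it compiles once and runs many times.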
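To make "compile once, run many times" concrete, here is a minimal sketch that runs outside the browser. The names fakeElem and idFilter are hypothetical and not part of Sizzle; the escape handling (runescape, funescape) is left out, and a plain object stands in for a DOM element.

```javascript
// Hypothetical stand-in for a DOM element: only getAttribute is modeled.
function fakeElem(attrs) {
  return {
    getAttribute: function (name) {
      return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
    }
  };
}

// Simplified ID filter: the id is captured once in the closure.
function idFilter(id) {
  return function (elem) {
    return elem.getAttribute("id") === id;
  };
}

// Compile once...
var isMain = idFilter("main");

// ...then run the same closure against many elements.
var elems = [fakeElem({ id: "main" }), fakeElem({ id: "side" }), fakeElem({})];
var flags = elems.map(function (elem) {
  return isMain(elem);
});

console.log(flags); // [ true, false, false ]
```

The selector-specific work happens when idFilter is called; each later test is only a property lookup and a comparison.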

Is that really how it works? Let's look at a few more examples:

Expr.filter["TAG"] = function( nodeNameSelector ) {
  var nodeName = nodeNameSelector.replace( runescape, funescape ).toLowerCase();
  return nodeNameSelector === "*" ?
    // "*" matches any element, so return a function that always passes
    function() { return true; } :
    // Otherwise return a function that takes elem
    function( elem ) {
      return elem.nodeName && elem.nodeName.toLowerCase() === nodeName;
    };
};

Expr.filter["ATTR"] = function( name, operator, check ) {
  // Return a function
  return function( elem ) {
    var result = Sizzle.attr( elem, name );

    if ( result == null ) {
      return operator === "!=";
    }
    if ( !operator ) {
      return true;
    }

    result += "";

    return operator === "=" ? result === check :
      operator === "!=" ? result !== check :
      operator === "^=" ? check && result.indexOf( check ) === 0 :
      operator === "*=" ? check && result.indexOf( check ) > -1 :
      operator === "$=" ? check && result.slice( -check.length ) === check :
      operator === "~=" ? ( " " + result.replace( rwhitespace, " " ) + " " ).indexOf( check ) > -1 :
      operator === "|=" ? result === check || result.slice( 0, check.length + 1 ) === check + "-" :
      false;
  };
};

What the final return expression covers:

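input[type=button]: the type attribute equals button;
input[type!=button]: the type attribute does not equal button;
input[name^=pre]: the name attribute starts with pre;
input[name*=new]: the name attribute contains new;
input[name$=ed]: the name attribute ends with ed;
input[name~=new]: the name attribute contains new as a whitespace-separated word;
input[name|=zh]: the name attribute either equals zh or starts with zh followed immediately by a hyphen (-).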
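The operator table above can be tried out on plain strings. The sketch below is a simplified rewrite, not Sizzle's code: attrTest is a hypothetical helper that takes the attribute value directly instead of an element, and it assumes the caller passes check unpadded, so it adds the surrounding spaces for ~= itself.

```javascript
// Hypothetical helper: builds a test for one attribute operator.
function attrTest(operator, check) {
  return function (value) {
    if (value == null) {
      // A missing attribute only satisfies "!=".
      return operator === "!=";
    }
    if (!operator) {
      // Bare [name]: the attribute merely has to exist.
      return true;
    }
    value += "";
    switch (operator) {
      case "=":  return value === check;
      case "!=": return value !== check;
      case "^=": return !!check && value.indexOf(check) === 0;
      case "*=": return !!check && value.indexOf(check) > -1;
      case "$=": return !!check && value.slice(-check.length) === check;
      case "~=": return (" " + value.replace(/\s+/g, " ") + " ").indexOf(" " + check + " ") > -1;
      case "|=": return value === check || value.slice(0, check.length + 1) === check + "-";
      default:   return false;
    }
  };
}

var startsWithPre = attrTest("^=", "pre");
var hasWordNew = attrTest("~=", "new");
var langZh = attrTest("|=", "zh");

console.log(startsWithPre("prefix")); // true
console.log(hasWordNew("old new"));   // true
console.log(hasWordNew("renewed"));   // false
console.log(langZh("zh-CN"));         // true
console.log(langZh("zhx"));           // false
```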

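So each token yields a closure. The closure takes a DOM element and tests whether it satisfies the token's constraint, such as an id or a className. Handling every token in tokens this way gives an array of functions dedicated to testing. For each element in seed, that array can then be applied one by one to its parent or sibling nodes, which greatly improves efficiency. This is the so-called compile once, use many times.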

compile source code

Here is the compile function's code directly. It relies on two other functions, matcherFromTokens and matcherFromGroupMatchers, which are covered along with it.

var compile = function(selector, match) {
  var i,setMatchers = [],elementMatchers = [],
    cached = compilerCache[selector + " "];
  // Check the cache first; nearly every function seems to do this
  if (!cached) {
    if (!match) {
      // If no tokens were passed in, tokenize the selector
      match = tokenize(selector);
    }
    i = match.length;
    while (i--) {
      // Hand each token group to matcherFromTokens
      cached = matcherFromTokens(match[i]);
      if (cached[expando]) {
        setMatchers.push(cached);
      } else {
        elementMatchers.push(cached);
      }
    }

    // Store in the cache
    cached = compilerCache(
      selector,
      // This function builds the final matcher
      matcherFromGroupMatchers(elementMatchers, setMatchers)
    );

    // Save selector and tokenization
    cached.selector = selector;
  }
  return cached;
};

The compile function itself looks simple. Now for matcherFromTokens:

// Build one matcher from the tokens of a single selector group
function matcherFromTokens(tokens) {
  var checkContext,matcher,j,len = tokens.length,leadingRelative = Expr.relative[tokens[0].type],
    implicitRelative = leadingRelative || Expr.relative[" "],
    i = leadingRelative ? 1 : 0,
    // The base matcher makes sure elements are reachable from the context
    // addCombinator branches on the matching Expr.relative entry
    /*
      Expr.relative = {
        ">": { dir: "parentNode", first: true },
        " ": { dir: "parentNode" },
        "+": { dir: "previousSibling", first: true },
        "~": { dir: "previousSibling" }
      };
     */
    matchContext = addCombinator(
      function(elem) {
        return elem === checkContext;
      },implicitRelative,true),
    matchAnyContext = addCombinator(
      function(elem) {
        return indexOf(checkContext, elem) > -1;
      },implicitRelative,true),
    matchers = [
      function(elem, context, xml) {
        var ret = !leadingRelative && (xml || context !== outermostContext) || ((checkContext = context).nodeType ? matchContext(elem, context, xml) : matchAnyContext(elem, context, xml));
        // Avoid hanging onto element (issue #299)
        checkContext = null;
        return ret;
      }
    ];

  for (; i < len; i++) {
    // Handle combinators: " ", ">", "~", "+"
    if (matcher = Expr.relative[tokens[i].type]) {
      matchers = [addCombinator(elementMatcher(matchers), matcher)];
    } else {
      // Handle ATTR, CHILD, CLASS, ID, PSEUDO, TAG; the Expr.filter functions are used here
      matcher = Expr.filter[tokens[i].type].apply(null, tokens[i].matches);

      // Return special upon seeing a positional matcher
      // A positional pseudo splits the selector into two parts
      if (matcher[expando]) {
        // Find the next relative operator (if any) for proper handling
        j = ++i;
        for (; j < len; j++) {
          if (Expr.relative[tokens[j].type]) {
            break;
          }
        }
        return setMatcher(
          i > 1 && elementMatcher(matchers),
          i > 1 && toSelector(
              // If the preceding token was a descendant combinator, insert an implicit any-element `*`
              tokens
                .slice(0, i - 1)
                .concat({value: tokens[i - 2].type === " " ? "*" : ""})
            ).replace(rtrim, "$1"),
          matcher,
          i < j && matcherFromTokens(tokens.slice(i, j)),
          j < len && matcherFromTokens(tokens = tokens.slice(j)),
          j < len && toSelector(tokens)
        );
      }
      matchers.push(matcher);
    }
  }

  return elementMatcher(matchers);
}

Here addCombinator generates a curried function that handles the Expr.relative cases:

function addCombinator(matcher, combinator, base) {
  var dir = combinator.dir, skip = combinator.next, key = skip || dir, checkNonElements = base && key === "parentNode", doneName = done++;

  return combinator.first ? // Check against closest ancestor/preceding element
    function(elem, context, xml) {
      while (elem = elem[dir]) {
        if (elem.nodeType === 1 || checkNonElements) {
          return matcher(elem, context, xml);
        }
      }
      return false;
    } : // Check against all ancestor/preceding elements
    function(elem, context, xml) {
      var oldCache, uniqueCache, outerCache, newCache = [dirruns, doneName];

      // We can't set arbitrary data on XML nodes, so they don't benefit from combinator caching
      if (xml) {
        while (elem = elem[dir]) {
          if (elem.nodeType === 1 || checkNonElements) {
            if (matcher(elem, context, xml)) {
              return true;
            }
          }
        }
      } else {
        while (elem = elem[dir]) {
          if (elem.nodeType === 1 || checkNonElements) {
            outerCache = elem[expando] || (elem[expando] = {});

            // Support: IE <9 only
            // Defend against cloned attroperties (jQuery gh-1709)
            uniqueCache = outerCache[elem.uniqueID] || (outerCache[elem.uniqueID] = {});

            if (skip && skip === elem.nodeName.toLowerCase()) {
              elem = elem[dir] || elem;
            } else if ((oldCache = uniqueCache[key]) && oldCache[0] === dirruns && oldCache[1] === doneName) {
              // Assign to newCache so results back-propagate to previous elements
              return newCache[2] = oldCache[2];
            } else {
              // Reuse newcache so results back-propagate to previous elements
              uniqueCache[key] = newCache;

              // A match means we're done; a fail means we have to keep checking
              if (newCache[2] = matcher(elem, context, xml)) {
                return true;
              }
            }
          }
        }
      }
      return false;
    };
}
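Stripped of caching, XML handling, and the skip option, the two branches of addCombinator reduce to the sketch below. It is a simplification under those assumptions rather than Sizzle's implementation; addCombinatorSketch and the mock nodes are hypothetical, and the nodes are plain objects so the example runs without a DOM.

```javascript
// Simplified combinator wrapper: no caching, no XML handling, no `skip` support.
function addCombinatorSketch(matcher, combinator) {
  var dir = combinator.dir;
  return combinator.first
    ? function (elem) {
        // first: true (">" or "+"): only the closest element node in `dir` is tested.
        while ((elem = elem[dir])) {
          if (elem.nodeType === 1) {
            return matcher(elem);
          }
        }
        return false;
      }
    : function (elem) {
        // No `first` (" " or "~"): keep walking until some element node matches.
        while ((elem = elem[dir])) {
          if (elem.nodeType === 1 && matcher(elem)) {
            return true;
          }
        }
        return false;
      };
}

// Hypothetical mock nodes modeling <div><section><p></p></section></div>
var div = { nodeType: 1, nodeName: "DIV", parentNode: null };
var section = { nodeType: 1, nodeName: "SECTION", parentNode: div };
var p = { nodeType: 1, nodeName: "P", parentNode: section };

function isDiv(elem) {
  return elem.nodeName === "DIV";
}

var childOfDiv = addCombinatorSketch(isDiv, { dir: "parentNode", first: true });
var insideDiv = addCombinatorSketch(isDiv, { dir: "parentNode" });

var pIsChild = childOfDiv(p);
var pIsInside = insideDiv(p);
var sectionIsChild = childOfDiv(section);

console.log(pIsChild);       // false: the direct parent is <section>
console.log(pIsInside);      // true: <div> is an ancestor
console.log(sectionIsChild); // true: <div> is the direct parent
```

The first branch answers after a single test, which is why the child and adjacent sibling combinators are cheap; the second branch may walk the whole ancestor or sibling chain, which is what the cache in the original code tries to shorten.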

And elementMatcher generates the combined matcher:

function elementMatcher(matchers) {
  return matchers.length > 1 ? function(elem, context, xml) {
      var i = matchers.length;
      while (i--) {
        if (!matchers[i](elem, context, xml)) {
          return false;
        }
      }
      return true;
    } : matchers[0];
}
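A small usage sketch shows what elementMatcher does with a matcher array, including the order in which the matchers run. The function body mirrors the one above; isInput, isCheckbox, and the calls log are hypothetical names added only for the demonstration.

```javascript
// Same logic as elementMatcher above: all matchers must pass, tested from last to first.
function elementMatcherSketch(matchers) {
  return matchers.length > 1
    ? function (elem, context, xml) {
        var i = matchers.length;
        while (i--) {
          if (!matchers[i](elem, context, xml)) {
            return false;
          }
        }
        return true;
      }
    : matchers[0];
}

var calls = []; // records which matcher ran, in order

function isInput(elem) {
  calls.push("tag");
  return elem.nodeName === "INPUT";
}

function isCheckbox(elem) {
  calls.push("attr");
  return elem.type === "checkbox";
}

var matchInputCheckbox = elementMatcherSketch([isInput, isCheckbox]);

var ok = matchInputCheckbox({ nodeName: "INPUT", type: "checkbox" });
var bad = matchInputCheckbox({ nodeName: "INPUT", type: "text" });

console.log(ok, bad); // true false
console.log(calls);   // [ 'attr', 'tag', 'attr' ]
```

Because the loop counts down, the matcher pushed last runs first, and the second call stops as soon as the attribute test fails.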

matcherFromGroupMatchers is as follows:

function matcherFromGroupMatchers(elementMatchers, setMatchers) {
  var bySet = setMatchers.length > 0,
    byElement = elementMatchers.length > 0,
    superMatcher = function(seed, context, xml, results, outermost) {
      var elem,j,matcher,matchedCount = 0,i = "0",unmatched = seed && [],setMatched = [],
        contextBackup = outermostContext,
        // We must always have either seed elements or outermost context
        elems = seed || byElement && Expr.find["TAG"]("*", outermost),
        // Use integer dirruns iff this is the outermost matcher
        dirrunsUnique = dirruns += contextBackup == null ? 1 : Math.random() || 0.1,len = elems.length;

      if (outermost) {
        outermostContext = context === document || context || outermost;
      }

      // Add elements passing elementMatchers directly to results
      // Support: IE<9, Safari
      // Tolerate NodeList properties (IE: "length"; Safari: <number>) matching elements by id
      for (; i !== len && (elem = elems[i]) != null; i++) {
        if (byElement && elem) {
          j = 0;
          if (!context && elem.ownerDocument !== document) {
            setDocument(elem);
            xml = !documentIsHTML;
          }
          while (matcher = elementMatchers[j++]) {
            if (matcher(elem, context || document, xml)) {
              results.push(elem);
              break;
            }
          }
          if (outermost) {
            dirruns = dirrunsUnique;
          }
        }

        // Track unmatched elements for set filters
        if (bySet) {
          // They will have gone through all possible matchers
          if (elem = !matcher && elem) {
            matchedCount--;
          }

          // Lengthen the array for every element, matched or not
          if (seed) {
            unmatched.push(elem);
          }
        }
      }

      // `i` is now the count of elements visited above, and adding it to `matchedCount`
      // makes the latter nonnegative.
      matchedCount += i;

      // Apply set filters to unmatched elements
      // NOTE: This can be skipped if there are no unmatched elements (i.e., `matchedCount`
      // equals `i`), unless we didn't visit _any_ elements in the above loop because we have
      // no element matchers and no seed.
      // Incrementing an initially-string "0" `i` allows `i` to remain a string only in that
      // case, which will result in a "00" `matchedCount` that differs from `i` but is also
      // numerically zero.
      if (bySet && i !== matchedCount) {
        j = 0;
        while (matcher = setMatchers[j++]) {
          matcher(unmatched, setMatched, context, xml);
        }

        if (seed) {
          // Reintegrate element matches to eliminate the need for sorting
          if (matchedCount > 0) {
            while (i--) {
              if (!(unmatched[i] || setMatched[i])) {
                setMatched[i] = pop.call(results);
              }
            }
          }

          // Discard index placeholder values to get only actual matches
          setMatched = condense(setMatched);
        }

        // Add matches to results
        push.apply(results, setMatched);

        // Seedless set matches succeeding multiple successful matchers stipulate sorting
        if (outermost && !seed && setMatched.length > 0 && matchedCount + setMatchers.length > 1) {
          Sizzle.uniqueSort(results);
        }
      }

      // Override manipulation of globals by nested matchers
      if (outermost) {
        dirruns = dirrunsUnique;
        outermostContext = contextBackup;
      }

      return unmatched;
    };

  return bySet ? markFunction(superMatcher) : superMatcher;
}

This process is too complicated. Forgive me for not having the patience to read through all of it...

I'll leave a marker here and analyze it later...

We could stop here, but to be responsible about it, let's go over the whole Sizzle process once more.

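Although Sizzle has been split out as a standalone project, inside jQuery it is exposed as jQuery.find; the two are the same function and fully equivalent. Earlier parts introduced the tokenize function, known as lexical analysis, which splits a selector into a tokens array whose entries each have a value and a type. Then came the select function, which performs optimizations: it trims the head and tail of the selector and uses Expr.find to generate the seed array.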

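The later parts were covered more roughly, since I don't fully understand them myself. compile does the precompilation: it generates closures for the selector that remains after the seed is removed, then combines those closures into one large superMatcher function. At that point superMatcher(seed) can process the seed and produce the final result.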
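The compile-then-call pattern, including the cache, can be sketched in miniature. Everything below is hypothetical and far simpler than Sizzle: compileSketch only understands compound selectors such as "input", ".big", or "input.big", its superMatcher returns a filtered array instead of pushing into results, and seed elements are plain objects.

```javascript
var cacheSketch = {}; // hypothetical stand-in for compilerCache
var compileRuns = 0;  // counts real compilations, for demonstration only

function tagMatcher(tag) {
  var wanted = tag.toLowerCase();
  return function (elem) {
    return elem.nodeName.toLowerCase() === wanted;
  };
}

function classMatcher(cls) {
  return function (elem) {
    return (" " + elem.className + " ").indexOf(" " + cls + " ") > -1;
  };
}

// Turns a compound selector into closures once, then reuses the cached result.
function compileSketch(selector) {
  var cached = cacheSketch[selector];
  if (!cached) {
    compileRuns++;
    var matchers = [];
    var re = /(\.?)([\w-]+)/g;
    var m;
    while ((m = re.exec(selector))) {
      matchers.push(m[1] ? classMatcher(m[2]) : tagMatcher(m[2]));
    }
    cached = cacheSketch[selector] = function superMatcher(seed) {
      return seed.filter(function (elem) {
        return matchers.every(function (matcher) {
          return matcher(elem);
        });
      });
    };
  }
  return cached;
}

var seed = [
  { nodeName: "INPUT", className: "big red" },
  { nodeName: "INPUT", className: "small" },
  { nodeName: "DIV", className: "big" }
];

var result = compileSketch("input.big")(seed); // the compile()() pattern
compileSketch("input.big")(seed);              // second call hits the cache

console.log(result.length, compileRuns); // 1 1
```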

So what is superMatcher?

superMatcher

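As mentioned earlier, compile()() is the proper way to use the compile function, and the return value of compile() is superMatcher. Both matcherFromTokens and matcherFromGroupMatchers exist to build this super matcher, which then processes the seed. This is the moment of truth: only the elements that survive the filtering remain.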

Summary

Below is a flow chart that someone else put together:

[Figure: flow chart of the Sizzle matching process]

Step 1

div > p + div.aaron input[type="checkbox"]

Starting from the rightmost token, Expr.find produces the seed array. Here input is a TAG, so getElementsByTagName() is used.

Step 2

Rebuild the selector. With input removed, it becomes:

div > p + div.aaron [type="checkbox"]

Step 3

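Expr.relative now divides the tokens by combinator into tight and loose relationships. [">", "+"] are tight relationships, with first = true, whereas [" ", "~"] are loose ones. Tight relationships can be decided quickly during filtering.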

matcherFromTokens compiles closures by relationship, in four groups:

div > 
p + 
div.aaron 
input[type="checkbox"]

The compile step relies mainly on Expr.filter and Expr.relative.

Step 4

Put all the compiled closures together to produce the superMatcher function.

function( elem, context, xml ) {
    var i = matchers.length;
    while ( i-- ) {
        if ( !matchers[i]( elem, context, xml ) ) {
            return false;
        }
    }
    return true;
}

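Working from right to left, it processes the seed set. If any matcher fails, it returns false. If every matcher succeeds, the seed element satisfies the selector and is added to results.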
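The four steps can be replayed by hand for div > p + div.aaron input[type="checkbox"]. The sketch below is an illustration under simplifying assumptions, not Sizzle's code: node, tag, cls, attrEq, all, and combine are hypothetical helpers, the tree is built from plain objects, and the context check, caching, and XML handling are omitted.

```javascript
// Hypothetical mock element with just the links the matchers need.
function node(name, props, parent, prev) {
  var n = { nodeType: 1, nodeName: name.toUpperCase(), className: "",
            parentNode: parent || null, previousSibling: prev || null };
  for (var k in props) { n[k] = props[k]; }
  return n;
}
// <div><p></p><div class="aaron"><span><input a></span><input b></div></div> <input c>
var root = node("div", {});
var para = node("p", {}, root);
var aaron = node("div", { className: "aaron" }, root, para);
var span = node("span", {}, aaron);
var a = node("input", { type: "checkbox", id: "a" }, span);
var b = node("input", { type: "text", id: "b" }, aaron, span);
var c = node("input", { type: "checkbox", id: "c" });

function tag(name) { return function (e) { return e.nodeName === name.toUpperCase(); }; }
function cls(name) { return function (e) { return (" " + e.className + " ").indexOf(" " + name + " ") > -1; }; }
function attrEq(attr, value) { return function (e) { return e[attr] === value; }; }

// Same idea as elementMatcher: every matcher must pass, tested from last to first.
function all(matchers) {
  return matchers.length > 1
    ? function (e) {
        var i = matchers.length;
        while (i--) { if (!matchers[i](e)) { return false; } }
        return true;
      }
    : matchers[0];
}
// Same idea as addCombinator, without caching.
function combine(matcher, rel) {
  return function (e) {
    while ((e = e[rel.dir])) {
      if (e.nodeType === 1) {
        if (matcher(e)) { return true; }
        if (rel.first) { return false; }
      }
    }
    return false;
  };
}
var relative = {
  ">": { dir: "parentNode", first: true },
  " ": { dir: "parentNode" },
  "+": { dir: "previousSibling", first: true },
  "~": { dir: "previousSibling" }
};

// Step 1: the seed is every input element.
var seed = [a, b, c];
// Steps 2 and 3: compile the remaining tokens into nested closures.
var m1 = combine(all([tag("div")]), relative[">"]);                   // div >
var m2 = combine(all([m1, tag("p")]), relative["+"]);                 // p +
var m3 = combine(all([m2, tag("div"), cls("aaron")]), relative[" "]); // div.aaron, then a space
var finalMatcher = all([m3, attrEq("type", "checkbox")]);
// Step 4: run the combined matcher over the seed.
var results = seed.filter(function (e) { return finalMatcher(e); });

console.log(results.map(function (e) { return e.id; })); // [ 'a' ]
```

Element b fails the attribute test at once, and c fails because it has no div.aaron ancestor; only a survives.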

References

jQuery 2.0.3 Source Code Analysis, Sizzle Engine: The Compile Function (long)
jQuery 2.0.3 Source Code Analysis, Sizzle Engine: The Super Matcher

The source for this article is on GitHub; stars are welcome.

You are also welcome to discuss on my blog.

    Original author: songjz
    Original URL: https://segmentfault.com/a/1190000008379985