【譯】12.2.4 解析狀態 Parse state - HTML Standard

HTML
Living Standard — Last Updated 20 August 2017
12.2.4 Parse state
Parts of this specification are © Copyright 2004-2014 Apple Inc., Mozilla Foundation, and Opera Software ASA.
You are granted a license to use, reproduce and create derivative works of this document.

12.2.4.1 插入模式 The insertion mode

The insertion mode is a state variable that controls the primary operation of the tree construction stage.

insertion mode 是一個狀態變量,它控制樹構建階段的主要操作。

Initially, the insertion mode is “initial”. It can change to “before html”, “before head”, “in head”, “in head noscript”, “after head”, “in body”, “text”, “in table”, “in table text”, “in caption”, “in column group”, “in table body”, “in row”, “in cell”, “in select”, “in select in table”, “in template”, “after body”, “in frameset”, “after frameset”, “after after body”, and “after after frameset” during the course of the parsing, as described in the tree construction stage. The insertion mode affects how tokens are processed and whether CDATA sections are supported.

最初,insertion mode 為 initial。在解析過程中,它可以改變為 before html、before head、in head、in head noscript、after head、in body、text、in table、in table text、in caption、in column group、in table body、in row、in cell、in select、in select in table、in template、after body、in frameset、after frameset、after after body 和 after after frameset,正如樹構造階段中所描述的。insertion mode 影響如何處理 tokens,以及是否支持 CDATA 區段。

Several of these modes, namely “in head”, “in body”, “in table”, and “in select”, are special, in that the other modes defer to them at various times. When the algorithm below says that the user agent is to do something “using the rules for the m insertion mode”, where m is one of these modes, the user agent must use the rules described under the m insertion mode’s section, but must leave the insertion mode unchanged unless the rules in m themselves switch the insertion mode to a new value.

in head、in body、in table 和 in select 這幾種模式是特殊的,因為其他模式會在不同時候交由它們處理。當下面的算法表明用戶代理要「使用 m 插入模式的規則」做某事,而 m 是上述特殊模式之一時,用戶代理必需使用 m 插入模式的章節中描述的規則,但必需保持插入模式不變,除非 m 中的規則自身將插入模式切換為新值。

When the insertion mode is switched to “text” or “in table text”, the original insertion mode is also set. This is the insertion mode to which the tree construction stage will return.

當插入模式切換為 text 或 in table text 時,也會設置原始插入模式。這是樹構建階段將返回的插入模式。

Similarly, to parse nested template elements, a stack of template insertion modes is used. It is initially empty. The current template insertion mode is the insertion mode that was most recently added to the stack of template insertion modes. The algorithms in the sections below will push insertion modes onto this stack, meaning that the specified insertion mode is to be added to the stack, and pop insertion modes from the stack, which means that the most recently added insertion mode must be removed from the stack.

類似的,使用一個模版插入模式的堆棧來解析嵌套的 template 元素。它最初是空的。當前模版插入模式是最近添加到模版插入模式堆棧的插入模式。下面章節中的算法將插入模式 push 到這個堆棧中,這意味著指定的插入模式將添加到堆棧中;並且從堆棧中 pop 插入模式,這意味著必需從堆棧中移除最近添加的插入模式。
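
The bookkeeping described so far (the current insertion mode, the original insertion mode, and the stack of template insertion modes) can be sketched as a small state object. This is only an illustration: the class name `ParseState` and its method names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParseState:
    """Hypothetical container for the insertion-mode state described above."""
    insertion_mode: str = "initial"
    original_insertion_mode: Optional[str] = None
    template_insertion_modes: List[str] = field(default_factory=list)

    def switch_to_text(self, text_mode: str = "text") -> None:
        # Switching to "text" or "in table text" also records the mode
        # that the tree construction stage will return to.
        if text_mode not in ("text", "in table text"):
            raise ValueError("expected 'text' or 'in table text'")
        self.original_insertion_mode = self.insertion_mode
        self.insertion_mode = text_mode

    def return_to_original(self) -> None:
        # Go back to the mode saved by switch_to_text().
        if self.original_insertion_mode is None:
            raise ValueError("no original insertion mode was saved")
        self.insertion_mode = self.original_insertion_mode
        self.original_insertion_mode = None

    def push_template_mode(self, mode: str) -> None:
        # "Push": add the given mode to the stack.
        self.template_insertion_modes.append(mode)

    def pop_template_mode(self) -> str:
        # "Pop": remove the most recently added mode.
        return self.template_insertion_modes.pop()

    @property
    def current_template_insertion_mode(self) -> Optional[str]:
        # The most recently pushed mode, or None while the stack is empty.
        if not self.template_insertion_modes:
            return None
        return self.template_insertion_modes[-1]
```

Keeping the original insertion mode as a single slot rather than a stack mirrors the text above: only the template insertion modes are described as a stack.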

When the steps below require the UA to reset the insertion mode appropriately, it means the UA must follow these steps:

當下面的步驟要求用戶代理適當地重置插入模式時,意味著用戶代理必需遵循這些步驟:

  1. Let last be false.
    設 last 為 false。
  2. Let node be the last node in the stack of open elements.
    設 node 為打開元素堆棧的最後一個節點。
  3. Loop: If node is the first node in the stack of open elements, then set last to true, and, if the parser was originally created as part of the HTML fragment parsing algorithm (fragment case), set node to the context element passed to that algorithm.
    Loop:如果 node 是打開元素堆棧的第一個節點,那麼將 last 設置為 true;並且,如果解析器最初是作為 HTML 片段解析算法(fragment case)的一部分創建的,那麼將 node 設置為傳遞給該算法的上下文元素。
  4. If node is a select element, run these substeps:
    如果 node 是一個 select 元素,運行這些子步驟:

    1. If last is true, jump to the step below labeled done.
      如果 last 為 true,跳到下面標記為 Done 的步驟。
    2. Let ancestor be node.
      設 ancestor 為 node。
    3. Loop: If ancestor is the first node in the stack of open elements, jump to the step below labeled done.
      Loop:如果 ancestor 是打開元素堆棧的第一個節點,跳到下面標記為 Done 的步驟。
    4. Let ancestor be the node before ancestor in the stack of open elements.
      設 ancestor 為打開元素堆棧中位於 ancestor 之前的節點。
    5. If ancestor is a template node, jump to the step below labeled done.
      如果 ancestor 是一個 template 節點,跳到下面標記為 Done 的步驟。
    6. If ancestor is a table node, switch the insertion mode to “in select in table” and abort these steps.
      如果 ancestor 是一個 table 節點,將插入模式切換為 in select in table,并終止這些步驟。
    7. Jump back to the step labeled loop.
      跳轉回標記為 Loop 的步驟。
    8. Done: Switch the insertion mode to “in select” and abort these steps.
      Done:切換插入模式為 in select,并終止這些步驟。
  5. If node is a td or th element and last is false, then switch the insertion mode to “in cell” and abort these steps.
    如果 node 是 td 或 th 元素,並且 last 為 false,那麼切換插入模式為 in cell,并終止這些步驟。
  6. If node is a tr element, then switch the insertion mode to “in row” and abort these steps.
    如果 node 是 tr 元素,那麼切換插入模式為 in row,并終止這些步驟。
  7. If node is a tbody, thead, or tfoot element, then switch the insertion mode to “in table body” and abort these steps.
    如果 node 是 tbody、thead 或 tfoot 元素,那麼切換插入模式為 in table body,并終止這些步驟。
  8. If node is a caption element, then switch the insertion mode to “in caption” and abort these steps.
    如果 node 是 caption 元素,那麼切換插入模式為 in caption,并終止這些步驟。
  9. If node is a colgroup element, then switch the insertion mode to “in column group” and abort these steps.
    如果 node 是 colgroup 元素,那麼切換插入模式為 in column group,并終止這些步驟。
  10. If node is a table element, then switch the insertion mode to “in table” and abort these steps.
    如果 node 是 table 元素,那麼切換插入模式為 in table,并終止這些步驟。
  11. If node is a template element, then switch the insertion mode to the current template insertion mode and abort these steps.
    如果 node 是 template 元素,那麼切換插入模式為當前模版插入模式,并終止這些步驟。
  12. If node is a head element and last is false, then switch the insertion mode to “in head” and abort these steps.
    如果 node 是 head 元素,並且 last 為 false,那麼切換插入模式為 in head,并終止這些步驟。
  13. If node is a body element, then switch the insertion mode to “in body” and abort these steps.
    如果 node 是 body 元素,那麼切換插入模式為 in body,并終止這些步驟。
  14. If node is a frameset element, then switch the insertion mode to “in frameset” and abort these steps. (fragment case)
    如果 node 是 frameset 元素,那麼切換插入模式為 in frameset,并終止這些步驟。(fragment case)
  15. If node is an html element, run these substeps:
    如果 node 是 html 元素,運行這些子步驟:

    1. If the head element pointer is null, switch the insertion mode to “before head” and abort these steps. (fragment case)
      如果 head 元素指針為 null,切換插入模式為 before head,并終止這些步驟。(fragment case)
    2. Otherwise, the head element pointer is not null, switch the insertion mode to “after head” and abort these steps.
      否則,該 head 元素指針不為 null,切換插入模式為 after head,并終止這些步驟。
  16. If last is true, then switch the insertion mode to “in body” and abort these steps. (fragment case)
    如果 last 為 true,那麼切換插入模式為 in body,并終止這些步驟。(fragment case)
  17. Let node now be the node before node in the stack of open elements.
    現在設 node 為打開元素堆棧中位於 node 之前的節點。
  18. Return to the step labeled loop.
    回到標記為 Loop 的步驟。
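
The "reset the insertion mode appropriately" steps above can be sketched as a single loop over the stack of open elements. This is a simplified illustration, not a conforming implementation: elements are represented only by their tag names, the stack is assumed to be non-empty, and the function name and parameters (`reset_insertion_mode`, `context_element`, `current_template_mode`, `head_pointer_is_null`) are hypothetical names chosen for this example.

```python
from typing import List, Optional

# Nodes that map straight to an insertion mode with no extra condition.
# The tests are mutually exclusive, so their order does not matter here.
SIMPLE_MODES = {
    "tr": "in row",
    "tbody": "in table body",
    "thead": "in table body",
    "tfoot": "in table body",
    "caption": "in caption",
    "colgroup": "in column group",
    "table": "in table",
    "body": "in body",
    "frameset": "in frameset",
}


def reset_insertion_mode(
    open_elements: List[str],
    context_element: Optional[str] = None,
    current_template_mode: Optional[str] = None,
    head_pointer_is_null: bool = False,
) -> Optional[str]:
    """Return the insertion mode selected by the steps above.

    open_elements[0] is the first node of the stack and open_elements[-1]
    the last one. context_element is only supplied in the fragment case.
    """
    index = len(open_elements) - 1          # step 2: start at the last node
    while True:
        node = open_elements[index]
        last = index == 0                   # step 3
        if last and context_element is not None:
            node = context_element          # fragment case
        if node == "select":                # step 4 and its substeps
            if not last:
                # Walk back towards the first node of the stack.
                for ancestor in reversed(open_elements[:index]):
                    if ancestor == "template":
                        break
                    if ancestor == "table":
                        return "in select in table"
            return "in select"
        if node in ("td", "th") and not last:
            return "in cell"
        if node in SIMPLE_MODES:
            return SIMPLE_MODES[node]
        if node == "template":
            return current_template_mode
        if node == "head" and not last:
            return "in head"
        if node == "html":
            return "before head" if head_pointer_is_null else "after head"
        if last:
            return "in body"                # fragment case
        index -= 1                          # steps 17 and 18
```

For example, with a stack of `html`, `body`, `table`, `tr`, `td`, `select`, the walk from `select` back towards `html` meets `table` before any `template`, so the sketch selects "in select in table".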

12.2.4.2 打開元素的堆棧 The stack of open elements

Initially, the stack of open elements is empty. The stack grows downwards; the topmost node on the stack is the first one added to the stack, and the bottommost node of the stack is the most recently added node in the stack (notwithstanding when the stack is manipulated in a random access fashion as part of the handling for misnested tags).

最初,打開元素的堆棧是空的。堆棧向下生長;堆棧最頂部的 node 是第一個添加到堆棧的節點,並且堆棧最底部的 node 是最近添加到堆棧的節點(儘管在處理錯誤嵌套的標籤時,堆棧會以隨機存取的方式被操作)。

Note: The “before html” insertion mode creates the html document element, which is then added to the stack.
Note: 在 before html 插入模式下創建 html 文檔元素,然後將其添加到堆棧中。

Note: In the fragment case, the stack of open elements is initialized to contain an html element that is created as part of that algorithm. (The fragment case skips the “before html” insertion mode.)
Note: 在片段情況(fragment case)下,打開元素堆棧被初始化為包含一個 html 元素,該元素是作為該算法的一部分創建的。(片段情況會跳過 before html 插入模式。)

The html node, however it is created, is the topmost node of the stack. It only gets popped off the stack when the parser finishes.

無論 html 節點是如何創建的,它都是堆棧最頂部的節點。只有當解析器結束時,它才會從堆棧中彈出。

The current node is the bottommost node in this stack of open elements.

當前節點是這個打開元素堆棧中最底部的節點。

The adjusted current node is the context element if the parser was created by the HTML fragment parsing algorithm and the stack of open elements has only one element in it (fragment case); otherwise, the adjusted current node is the current node.

如果解析器是由 HTML 片段解析算法創建的,並且打開元素堆棧中只有一個元素(fragment case),那麼校正后的當前節點為上下文元素;否則,校正后的當前節點就是當前節點。
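
As a minimal sketch of the two definitions above, assuming the stack is a Python list whose last item is the bottommost (most recently added) node, and that `context_element` is `None` unless the parser was created by the fragment parsing algorithm (both function names are hypothetical):

```python
from typing import List, Optional


def current_node(open_elements: List[str]) -> str:
    # The bottommost node of the stack is the most recently added one.
    return open_elements[-1]


def adjusted_current_node(
    open_elements: List[str], context_element: Optional[str] = None
) -> str:
    # Fragment case: with only one element on the stack, the context
    # element stands in for the current node.
    if context_element is not None and len(open_elements) == 1:
        return context_element
    return current_node(open_elements)
```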

Elements in the stack of open elements fall into the following categories:

在打開元素堆棧中的元素分為下列類別:

  1. Special 特殊的

    The following elements have varying levels of special parsing rules: HTML’s address, applet, area, article, aside, base, basefont, bgsound, blockquote, body, br, button, caption, center, col, colgroup, dd, details, dir, div, dl, dt, embed, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hgroup, hr, html, iframe, img, input, keygen, li, link, listing, main, marquee, menu, meta, nav, noembed, noframes, noscript, object, ol, p, param, plaintext, pre, script, section, select, source, style, summary, table, tbody, td, template, textarea, tfoot, th, thead, title, tr, track, ul, wbr, xmp; MathML mi, MathML mo, MathML mn, MathML ms, MathML mtext, and MathML annotation-xml; and SVG foreignObject, SVG desc, and SVG title.

    以下元素擁有不同程度的特殊解析規則:HTML 的 address、applet、area、article、aside、base、basefont、bgsound、blockquote、body、br、button、caption、center、col、colgroup、dd、details、dir、div、dl、dt、embed、fieldset、figcaption、figure、footer、form、frame、frameset、h1、h2、h3、h4、h5、h6、head、header、hgroup、hr、html、iframe、img、input、keygen、li、link、listing、main、marquee、menu、meta、nav、noembed、noframes、noscript、object、ol、p、param、plaintext、pre、script、section、select、source、style、summary、table、tbody、td、template、textarea、tfoot、th、thead、title、tr、track、ul、wbr、xmp;MathML mi、MathML mo、MathML mn、MathML ms、MathML mtext 和 MathML annotation-xml;以及 SVG foreignObject、SVG desc 和 SVG title。

    Note: An image start tag token is handled by the tree builder, but it is not in this list because it is not an element; it gets turned into an img element.
    Note: 樹構造器會處理 image 起始標籤令牌,但它不在這個列表中,因為它不是一個元素;它會被轉換成 img 元素。

  2. Formatting 格式化

    The following HTML elements are those that end up in the list of active formatting elements: a, b, big, code, em, font, i, nobr, s, small, strike, strong, tt, and u.

    下列 HTML 元素最終會進入現役格式化元素的列表:a、b、big、code、em、font、i、nobr、s、small、strike、strong、tt 和 u。

  3. Ordinary 普通的

    All other elements found while parsing an HTML document.

    在解析 HTML 文檔時發現的所有其他元素。

    Typically, the special elements have the start and end tag tokens handled specifically, while ordinary elements’ tokens fall into “any other start tag” and “any other end tag” clauses, and some parts of the tree builder check if a particular element in the stack of open elements is in the special category. However, some elements (e.g., the option element) have their start or end tag tokens handled specifically, but are still not in the special category, so that they get the ordinary handling elsewhere.
    通常,特殊元素的起始標籤和結束標籤令牌會得到專門處理,而普通元素的令牌則落入「任何其他起始標籤」和「任何其他結束標籤」子句;並且樹構造器的某些部分會檢查打開元素堆棧中的特定元素是否屬於特殊類別。然而,某些元素(例如 option 元素)的起始標籤或結束標籤令牌會得到專門處理,但它們仍然不屬於特殊類別,因此在其他地方會得到普通處理。

The stack of open elements is said to have an element target node in a specific scope consisting of a list of element types list when the following algorithm terminates in a match state:

當以下算法終止於匹配狀態時,稱該打開元素堆棧在由元素類型列表 list 構成的特定作用域中存在元素目標節點:

  1. Initialize node to be the current node (the bottommost node of the stack).
    初始化 node 為當前節點(堆棧中最底部的節點)。
  2. If node is the target node, terminate in a match state.
    如果 node 是目標節點,終止於匹配狀態。
  3. Otherwise, if node is one of the element types in list, terminate in a failure state.
    否則,如果 node 是 list 中的元素類型之一,終止於失敗狀態。
  4. Otherwise, set node to the previous entry in the stack of open elements and return to step 2. (This will never fail, since the loop will always terminate in the previous step if the top of the stack — an html element — is reached.)
    否則,設置 node 為打開元素堆棧中的前一個元素,并返回到步驟 2 。(這永遠不會失敗,因為如果到達了堆棧的頂部 —— 一個 html 元素,這個循環將在前一個步驟終止。)
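
The four steps above amount to a walk from the current node towards the top of the stack. A minimal sketch, assuming elements are represented by tag-name strings and the target is matched by name (the function name is hypothetical):

```python
from typing import Iterable, List


def has_element_in_specific_scope(
    open_elements: List[str], target: str, scope_types: Iterable[str]
) -> bool:
    """Walk from the current node (end of the list) towards the top.

    Returns True for the "match" state and False for the "failure" state.
    """
    boundary = set(scope_types)
    for node in reversed(open_elements):    # step 1 and step 4
        if node == target:                  # step 2: match
            return True
        if node in boundary:                # step 3: failure
            return False
    # Only reachable if no boundary type (normally html) is on the stack.
    raise ValueError("no scope boundary found on the stack of open elements")
```

The target test comes before the boundary test, so an element that is itself one of the boundary types (for example `table`) can still be found in scope.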

The stack of open elements is said to have a particular element in scope when it has that element in the specific scope consisting of the following element types:

當打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱該堆棧在作用域中存在該特定元素:

  • applet
  • caption
  • html
  • table
  • td
  • th
  • marquee
  • object
  • template
  • MathML mi
  • MathML mo
  • MathML mn
  • MathML ms
  • MathML mtext
  • MathML annotation-xml
  • SVG foreignObject
  • SVG desc
  • SVG title

The stack of open elements is said to have a particular element in list item scope when it has that element in the specific scope consisting of the following element types:

當打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱該堆棧在列表條目作用域中存在該特定元素:

  • All the element types listed above for the has an element in scope algorithm.
    算法 在作用域中有特定的元素 列出的所有元素。
  • ol in the HTML namespace
    HTML 命名空間中的 ol
  • ul in the HTML namespace
    HTML 命名空間中的 ul

The stack of open elements is said to have a particular element in button scope when it has that element in the specific scope consisting of the following element types:

當打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱該堆棧在按鈕作用域中存在該特定元素:

  • All the element types listed above for the has an element in scope algorithm.
    算法 在作用域中有特定的元素 列出的所有元素。
  • button in the HTML namespace
    HTML 命名空間中的 button

The stack of open elements is said to have a particular element in table scope when it has that element in the specific scope consisting of the following element types:

當打開元素堆棧在由下列元素類型構成的特定作用域中存在某個元素時,稱該堆棧在表格作用域中存在該特定元素:

  • html in the HTML namespace
    HTML 命名空間中的 html
  • table in the HTML namespace
    HTML 命名空間中的 table
  • template in the HTML namespace
    HTML 命名空間中的 template

The stack of open elements is said to have a particular element in select scope when it has that element in the specific scope consisting of all element types except the following:

當打開元素堆棧在由除下列元素類型以外的所有元素類型構成的特定作用域中存在某個元素時,稱該堆棧在選擇作用域中存在該特定元素:

  • optgroup in the HTML namespace
    HTML 命名空間中的 optgroup
  • option in the HTML namespace
    HTML 命名空間中的 option
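
The five named scopes differ only in which element types stop the walk. The sketch below models an element as a `(namespace, local name)` pair so that, for instance, SVG `title` and HTML `title` stay distinct. It assumes that the unqualified names in the first list (applet, caption, html, and so on) are HTML-namespace elements, which the text above does not state explicitly. All identifiers are hypothetical.

```python
from typing import Callable, List, Tuple

Element = Tuple[str, str]  # (namespace, local name); a simplification

HTML, MATHML, SVG = "html", "mathml", "svg"

DEFAULT_SCOPE = frozenset(
    [(HTML, n) for n in ("applet", "caption", "html", "table", "td", "th",
                         "marquee", "object", "template")]
    + [(MATHML, n) for n in ("mi", "mo", "mn", "ms", "mtext", "annotation-xml")]
    + [(SVG, n) for n in ("foreignObject", "desc", "title")]
)
LIST_ITEM_SCOPE = DEFAULT_SCOPE | {(HTML, "ol"), (HTML, "ul")}
BUTTON_SCOPE = DEFAULT_SCOPE | {(HTML, "button")}
TABLE_SCOPE = frozenset({(HTML, "html"), (HTML, "table"), (HTML, "template")})
# Select scope is defined by exclusion: everything except these is a boundary.
SELECT_SCOPE_EXCEPTIONS = frozenset({(HTML, "optgroup"), (HTML, "option")})


def _in_scope(stack: List[Element], target: Element,
              is_boundary: Callable[[Element], bool]) -> bool:
    for node in reversed(stack):            # current node first
        if node == target:
            return True
        if is_boundary(node):
            return False
    return False                            # empty stack or no boundary met


def has_in_scope(stack: List[Element], target: Element) -> bool:
    return _in_scope(stack, target, DEFAULT_SCOPE.__contains__)


def has_in_list_item_scope(stack: List[Element], target: Element) -> bool:
    return _in_scope(stack, target, LIST_ITEM_SCOPE.__contains__)


def has_in_button_scope(stack: List[Element], target: Element) -> bool:
    return _in_scope(stack, target, BUTTON_SCOPE.__contains__)


def has_in_table_scope(stack: List[Element], target: Element) -> bool:
    return _in_scope(stack, target, TABLE_SCOPE.__contains__)


def has_in_select_scope(stack: List[Element], target: Element) -> bool:
    return _in_scope(stack, target,
                     lambda node: node not in SELECT_SCOPE_EXCEPTIONS)
```

Expressing select scope as a predicate avoids having to enumerate "all element types except optgroup and option".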

Nothing happens if at any time any of the elements in the stack of open elements are moved to a new location in, or removed from, the Document tree. In particular, the stack is not changed in this situation. This can cause, amongst other strange effects, content to be appended to nodes that are no longer in the DOM.

在任何時候,打開元素堆棧中的任何元素被移動到文檔樹中的新位置,或者從文檔樹中移除,都不會觸發任何操作。特別是,在這種情況下堆棧不會變動。這可能導致各種奇怪的效果,例如內容被附加到已不在 DOM 中的節點上。

Note: In some cases (namely, when closing misnested formatting elements), the stack is manipulated in a random-access fashion.
Note: 在某些情況下(即,關閉錯誤嵌套的格式化元素時),堆棧是以隨機存取的方式進行操作的。

12.2.4.3 現役格式化元素的列表 The list of active formatting elements

Initially, the list of active formatting elements is empty. It is used to handle mis-nested formatting element tags.

起初,現役格式化元素的列表為空。它是用於處理錯誤嵌套的格式化元素標籤。

The list contains elements in the formatting category, and markers. The markers are inserted when entering applet, object, marquee, template, td, th, and caption elements, and are used to prevent formatting from “leaking” into applet, object, marquee, template, td, th, and caption elements.

該列表包含格式化類別中的元素,以及標記。當進入 applet、object、marquee、template、td、th 和 caption 元素時會插入標記,用於防止格式化「洩漏」到 applet、object、marquee、template、td、th 和 caption 元素中。

In addition, each element in the list of active formatting elements is associated with the token for which it was created, so that further elements can be created for that token if necessary.

此外,現役格式化元素列表中的每個元素都與創建它的 token 關聯,所以當必要時可以為該 token 創建進一步的元素。

When the steps below require the UA to push onto the list of active formatting elements an element element, the UA must perform the following steps:

當下文的步驟要求用戶代理將元素 element 加入到現役格式化元素的列表中時,用戶代理必需執行以下步驟:

  1. If there are already three elements in the list of active formatting elements after the last marker, if any, or anywhere in the list if there are no markers, that have the same tag name, namespace, and attributes as element, then remove the earliest such element from the list of active formatting elements. For these purposes, the attributes must be compared as they were when the elements were created by the parser; two elements have the same attributes if all their parsed attributes can be paired such that the two attributes in each pair have identical names, namespaces, and values (the order of the attributes does not matter).

    如果在現役格式化元素列表中最後一個標記之後(如果沒有標記,則在列表中的任意位置)已經存在三個與 element 具有相同標籤名稱、命名空間和屬性的元素,那麼從現役格式化元素列表中移除最早的那個元素。為此,必需按照解析器創建元素時的屬性進行比較;如果兩個元素的所有已解析屬性都能配對,使得每一對中的兩個屬性具有相同的名稱、命名空間和值(屬性的順序並不重要),則認為兩個元素具有相同的屬性。

    Note: This is the Noah’s Ark clause. But with three per family instead of two.
    Note: 這是諾亞方舟條款,只不過每個家族是三個,而不是兩個。

  2. Add element to the list of active formatting elements.

    添加 element 到現役格式化元素列表。
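
The "push onto the list of active formatting elements" steps, including the Noah's Ark clause, might be sketched as follows. The representation is an assumption made for this example: a marker is a unique sentinel object, and an element records its tag name, namespace, and parsed attributes in a dict keyed by `(attribute namespace, attribute name)`, so that dict equality gives the order-independent attribute comparison the text asks for.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

MARKER = object()  # hypothetical sentinel standing in for a marker entry


@dataclass
class FormattingElement:
    tag: str
    namespace: str = "html"
    # (attribute namespace, attribute name) -> value, as parsed
    attrs: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def same_signature(self, other: "FormattingElement") -> bool:
        # Same tag name, namespace, and attributes (order-independent).
        return (self.tag, self.namespace, self.attrs) == (
            other.tag, other.namespace, other.attrs)


def push_formatting_element(active: List[object],
                            element: FormattingElement) -> None:
    # Find where the region after the last marker begins
    # (the whole list if there is no marker).
    start = 0
    for i in range(len(active) - 1, -1, -1):
        if active[i] is MARKER:
            start = i + 1
            break
    # Step 1: if three matching elements already exist, drop the earliest.
    matches = [i for i in range(start, len(active))
               if active[i].same_signature(element)]
    if len(matches) >= 3:
        del active[matches[0]]
    # Step 2: add the new element.
    active.append(element)
```

With this sketch, pushing a fourth identical `b` element leaves three `b` entries in the list, and identical elements that sit before the last marker are not counted.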

When the steps below require the UA to reconstruct the active formatting elements, the UA must perform the following steps:

當下文的步驟要求用戶代理重建現役格式化元素時,用戶代理必需執行以下步驟:

  1. If there are no entries in the list of active formatting elements, then there is nothing to reconstruct; stop this algorithm.
    如果現役格式化元素列表中沒有條目,那也沒有什麼可供重建的;終止這個算法。
  2. If the last (most recently added) entry in the list of active formatting elements is a marker, or if it is an element that is in the stack of open elements, then there is nothing to reconstruct; stop this algorithm.
    如果現役格式化元素列表中的最後一個(最近添加的)條目是一個標記,或者如果它是打開元素堆棧中的元素,那麼也沒有什麼可重建的,終止這個算法。
  3. Let entry be the last (most recently added) element in the list of active formatting elements.
    設 entry 為現役格式化元素列表中的最後一個(最近添加的)元素。
  4. Rewind: If there are no entries before entry in the list of active formatting elements, then jump to the step labeled create.
    Rewind: 如果在現役格式化元素列表中,沒有條目在 entry 之前,那麼跳轉到標記為 Create 的步驟。
  5. Let entry be the entry one earlier than entry in the list of active formatting elements.
    設 entry 為現役格式化元素列表中比 entry 早一個的條目。
  6. If entry is neither a marker nor an element that is also in the stack of open elements, go to the step labeled rewind.
    如果 entry 既不是一個標記,也不是一個在打開元素堆棧中的元素,跳轉到標記為 Rewind 的步驟。
  7. Advance: Let entry be the element one later than entry in the list of active formatting elements.
    Advance: 設 entry 為現役格式化元素列表中比 entry 晚一個的元素。
  8. Create: Insert an HTML element for the token for which the element entry was created, to obtain new element.
    Create: 為創建元素 entry 的令牌插入一個 HTML 元素,得到 new element。
  9. Replace the entry for entry in the list with an entry for new element.
    用 new element 的條目替換列表中 entry 的條目。
  10. If the entry for new element in the list of active formatting elements is not the last entry in the list, return to the step labeled advance.
    如果 new element 的條目在現役格式化元素列表中不是列表最後的條目,返回到標記為 Advance 的步驟。

This has the effect of reopening all the formatting elements that were opened in the current body, cell, or caption (whichever is youngest) that haven’t been explicitly closed.

這將重新打開所有在當前 body、單元格或 caption(以最年輕者為準)中打開且尚未被顯式關閉的格式化元素。

Note: The way this specification is written, the list of active formatting elements always consists of elements in chronological order with the least recently added element first and the most recently added element last (except for while steps 7 to 10 of the above algorithm are being executed, of course).
Note: 這個規範的編寫方式,現役格式化元素列表的元素永遠按時間順序排序,並且較前添加的元素在前,最近添加的元素在後(當然,執行上述算法的 7 至 10 步時是例外的)。
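
The rewind, advance, and create labels in the reconstruction algorithm translate naturally into two loops over list indices. In the sketch below, `Element`, `MARKER`, and the `insert_html_element` callback are hypothetical stand-ins: the callback is assumed to create an element for the given token, push it onto the stack of open elements, and return it.

```python
from typing import Callable, List

MARKER = object()  # hypothetical sentinel standing in for a marker entry


class Element:
    def __init__(self, token: str) -> None:
        self.token = token  # the token this element was created for


def reconstruct_active_formatting_elements(
    active: List[object],
    open_elements: List[Element],
    insert_html_element: Callable[[str], Element],
) -> None:
    # Steps 1 and 2: nothing to do for an empty list, a trailing marker,
    # or a last entry that is still open.
    if not active:
        return
    if active[-1] is MARKER or active[-1] in open_elements:
        return
    # Steps 3 to 7 (rewind, then advance once): find the earliest entry
    # after the last marker or still-open element.
    i = len(active) - 1
    while i > 0:
        i -= 1
        if active[i] is MARKER or active[i] in open_elements:
            i += 1
            break
    # Steps 8 to 10 (create, then advance): reopen each entry in order.
    while True:
        new_element = insert_html_element(active[i].token)
        active[i] = new_element
        if i == len(active) - 1:
            return
        i += 1
```

Because `Element` defines no custom equality, the `in open_elements` test compares by identity, which matches the idea that the same element object sits in both lists.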

When the steps below require the UA to clear the list of active formatting elements up to the last marker, the UA must perform the following steps:

當下文的步驟要求用戶代理將現役格式化元素列表清除至最後一個標記處時,用戶代理必需執行以下步驟:

  1. Let entry be the last (most recently added) entry in the list of active formatting elements.
    設 entry 為現役格式化元素列表中最後(最近添加)的條目。
  2. Remove entry from the list of active formatting elements.
    從現役格式化元素列表中移除 entry。
  3. If entry was a marker, then stop the algorithm at this point. The list has been cleared up to the last marker.
    如果 entry 是一個標記,在這裡停止算法。該列表已被清除至最後一個標記。
  4. Go to step 1.
    回到步驟 1。
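
Clearing the list up to the last marker is a pop-until-marker loop. A minimal sketch, again with a hypothetical `MARKER` sentinel; the `while active` guard is a defensive assumption for a list that contains no marker, a case the steps above do not address:

```python
from typing import List

MARKER = object()  # hypothetical sentinel standing in for a marker entry


def clear_to_last_marker(active: List[object]) -> None:
    while active:
        entry = active.pop()      # steps 1 and 2: remove the last entry
        if entry is MARKER:       # step 3: stop once a marker is removed
            return
```

The marker itself is removed as well, as steps 2 and 3 require.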

12.2.4.4 元素的指針 The element pointers

Initially, the head element pointer and the form element pointer are both null.

最初,head 元素指針和 form 元素指針都為 null。

Once a head element has been parsed (whether implicitly or explicitly) the head element pointer gets set to point to this node.

一旦一個 head 元素被解析(不論是隱式或是顯式),head 元素指針將被設置為指向這個節點。

The form element pointer points to the last form element that was opened and whose end tag has not yet been seen. It is used to make form controls associate with forms in the face of dramatically bad markup, for historical reasons. It is ignored inside template elements.

form 元素指針指向最後一個已打開且尚未見到其結束標籤的 form 元素。由於歷史原因,它用於在面對極其糟糕的標記時,使表單控件仍能與表單相關聯。它在 template 元素內部會被忽略。

12.2.4.5 其他解析狀態標記 Other parsing state flags

The scripting flag is set to “enabled” if scripting was enabled for the Document with which the parser is associated when the parser was created, and “disabled” otherwise.

如果在解析器創建時與解析器相關聯的 Document 中啟用腳本,那麼 scripting flag 被設置為 “enabled”,否則被設置為 “disabled”。

Note: The scripting flag can be enabled even when the parser was originally created for the HTML fragment parsing algorithm, even though script elements don’t execute in that case.
Note: 即使解析器最初是為 HTML 片段解析算法創建的,scripting flag 也可以是 enabled,儘管在這種情況下 script 元素不會執行。

The frameset-ok flag is set to “ok” when the parser is created. It is set to “not ok” after certain tokens are seen.

在創建解析器時,frameset-ok flag 被設置為 “ok”。當看到某些特定的令牌時,它被設為 “not ok”。

    原文作者:Pandorym
    原文地址: https://segmentfault.com/a/1190000010757422