c# - Can I use HtmlAgilityPack to split an HTML document on a specific tag?

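For example, I have a bunch of <tr> tags that I want to collect. I need to split each one out into a separate element so that I can parse it myself.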

Is this possible?

An example of the markup:

<tr class="first-in-year">
  <td class="year">2011</td>

  <td class="img"><a href="/battlefield-3/61-27006/"><img src=
  "http://media.giantbomb.com/uploads/6/63038/1700748-bf3_thumb.jpg" alt=""></a></td>

  <td class="title">
    <a href="/battlefield-3/61-27006/">Battlefield 3</a>

    <p class="deck">Battlefield 3 is DICE's next installment in the franchise and
    will be on PC, PS3 and Xbox 360. The game will feature jets, prone, a
    single-player and co-op campaign, and 64-player multiplayer (on PC). It's due out
    in Fall of 2011.</p>
  </td>

  <td class="date">Expected: Q4 2011</td>

  <td><a href="/pc/60-94/" class="PC">PC</a>, <a href="/xbox-360/60-20/" class=
  "X360">X360</a>, <a href="/playstation-3/60-35/" class="PS3">PS3</a></td>
</tr>

<tr>
  <td class="year"></td>

  <td class="img"><a href="/forza-motorsport-4/61-33400/"><img src=
  "http://media.giantbomb.com/uploads/0/1992/1654849-forza4_thumb.jpg" alt=
  ""></a></td>

  <td class="title">
    <a href="/forza-motorsport-4/61-33400/">Forza Motorsport 4</a>

    <p class="deck">The next installment of Turn 10's racing franchise slated for
    release in Fall 2011. It is set to feature 16 player online races, dynamic race
    conditions, cars from over 80 manufacturers, and compatibility with Kinect, both
    on and off the racetrack.</p>
  </td>

  <td class="date">Expected: Oct 2011</td>

  <td><a href="/xbox-360/60-20/" class="X360">X360</a></td>
</tr>

<tr>
  <td class="year"></td>

  <td class="img"><a href="/max-payne-3/61-23398/"><img src=
  "http://media.giantbomb.com/uploads/0/1400/938434-custom_1237811317319_mp3_poster_thumb.jpg"
  alt=""></a></td>

  <td class="title">
    <a href="/max-payne-3/61-23398/">Max Payne 3</a>

    <p class="deck">The long awaited third instalment in Remedy's beloved series, in
    which an aging Max Payne faces one final chance to redeem himself.</p>
  </td>

  <td class="date">Expected: 2011</td>

  <td><a href="/pc/60-94/" class="PC">PC</a>, <a href="/playstation-3/60-35/" class=
  "PS3">PS3</a>, <a href="/xbox-360/60-20/" class="X360">X360</a></td>
</tr>

So in this example I would have three elements. 🙂

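Best answer: If you mean splitting it into multiple HTML documents on a tag, you can't do that. You can, however, select the individual TD elements and parse each one separately.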

The XPath selector //td selects all of those elements, which you can then pass to your parsing method.

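// LoadHtmlHowever() is a placeholder for however you load the document.
HtmlAgilityPack.HtmlDocument doc = LoadHtmlHowever();
// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlAgilityPack.HtmlNodeCollection cells = doc.DocumentNode.SelectNodes("//td");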
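
Since the question is about rows rather than cells, the same idea should work with //tr: each match comes back as its own HtmlNode, and further XPath queries can be run relative to that node. Below is a minimal sketch, assuming the HtmlAgilityPack NuGet package is referenced. The shortened markup string, the wrapping <table> element, and the variable names are illustrative assumptions, not part of the original page.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Shortened stand-in for the markup in the question
        // (assumption: the rows sit inside a <table>).
        string html = @"<table>
<tr class=""first-in-year""><td class=""year"">2011</td><td class=""title""><a href=""/battlefield-3/61-27006/"">Battlefield 3</a></td></tr>
<tr><td class=""year""></td><td class=""title""><a href=""/forza-motorsport-4/61-33400/"">Forza Motorsport 4</a></td></tr>
</table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // One HtmlNode per <tr>; SelectNodes returns null when nothing matches.
        HtmlNodeCollection rows = doc.DocumentNode.SelectNodes("//tr");
        if (rows == null) return;

        foreach (HtmlNode row in rows)
        {
            // The leading "./" keeps the query relative to this row
            // instead of searching the whole document again.
            HtmlNode link = row.SelectSingleNode("./td[@class='title']/a");
            string title = link != null ? link.InnerText.Trim() : "(no title)";
            Console.WriteLine(title);

            // row.OuterHtml holds this row's own markup as a string,
            // in case you would rather hand it to your own parser.
        }
    }
}
```

If you really do want each row as a standalone string, row.OuterHtml gives you that, but working with the HtmlNode directly is usually simpler than re-parsing the text.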