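I want to request this table from the server, but I don't know how. Basically, I'm trying to get my program (run from Eclipse) to print something like this when it runs: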
<tr><td>
<pre>
<b>2011 Phases of the Moon</b>
Universal Time
New Moon First Quarter Full Moon Last Quarter
d h m d h m d h m d h m
Jan 4 9 03 Jan 12 11 31 Jan 19 21 21 Jan 26 12 57
Feb 3 2 31 Feb 11 7 18 Feb 18 8 36 Feb 24 23 26
Mar 4 20 46 Mar 12 23 45 Mar 19 18 10 Mar 26 12 07
Apr 3 14 32 Apr 11 12 05 Apr 18 2 44 Apr 25 2 47
May 3 6 51 May 10 20 33 May 17 11 09 May 24 18 52
Jun 1 21 03 Jun 9 2 11 Jun 15 20 14 Jun 23 11 48
Jul 1 8 54 Jul 8 6 29 Jul 15 6 40 Jul 23 5 02
Jul 30 18 40 Aug 6 11 08 Aug 13 18 57 Aug 21 21 54
Aug 29 3 04 Sep 4 17 39 Sep 12 9 27 Sep 20 13 39
Sep 27 11 09 Oct 4 3 15 Oct 12 2 06 Oct 20 3 30
Oct 26 19 56 Nov 2 16 38 Nov 10 20 16 Nov 18 15 09
Nov 25 6 10 Dec 2 9 52 Dec 10 14 36 Dec 18 0 48
Dec 24 18 06
</pre>
</td></tr>
This is what I have so far:
import java.util.Scanner;
import java.io.*;
import java.net.*;

/**
 * This program shows the moon phase table for the given year
 */
public class MoonPhaseTable
{
    public static void main(String[] args) throws IOException
    {
        Scanner in = new Scanner(System.in); //read year number from user
        System.out.print("Please enter the year (e.g. 1977): ");
        int year = in.nextInt();

        // Build the URL string and open a URLConnection
        // Be sure to set the year on the URL string!!!
        URL moonphase = new URL("http://aa.usno.navy.mil/cgi-bin/aa_moonphases.pl?year=2015/");
        URLConnection mp = moonphase.openConnection();

        // Get the connection's input stream, and make a Scanner for it
        BufferedReader r = new BufferedReader(new InputStreamReader(mp.getInputStream()));
        Scanner s = new Scanner(moonphase.openStream());
        String inputLine;
        boolean printingTable = false;

        while ((inputLine = r.readLine()) !=null) {
            // Read input lines from the scanner into the String named line.
            if (printingTable) {
                // Check if the line contains the </table> end tag -- if seen, turn off printing
                System.out.print("CONTAINS </table>");
                s.close();
            }
            if (printingTable) {
                // If inside the Table, print the line.
                // Optionally clean up any unwanted tags, such as
                // <pre>, <tr>, <td>, <b> before printing.
                System.out.print(inputLine);
            }
            if (!printingTable) {
                // Check if the line contains the <table ...> start tag -- if seen, turn on printing
                System.out.print(inputLine);
            }
        }
    }
}
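Right now it prints the entire page source. I only want the part shown above. I know I'm not checking whether a line is inside the table block, but I don't know how to do that. Any suggestions would be greatly appreciated. I think I have to ask the aa.usno.navy.mil server for the moon table, but then again... I'm lost.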
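Best answer:
Inside the loop, you need to compute printingTable from what you match in the input you have read.
Something like this may help compute printingTable:
printingTable = inputLine.contains("table");
This is very crude, but it may work well enough for this case. Used exactly as written, it is true only for lines that themselves contain "table", so to print everything in between you would switch the flag on at the start tag and off at the end tag.
Right now, your code prints the line whether printingTable is true or false, and printingTable is never updated from its initial value of false.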
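To make the flag idea concrete, here is a minimal sketch of the toggle, written against a list of lines rather than a live connection so it can be run without the server. The class name TableLineFilter and the method extractTable are made up for illustration, and the sketch assumes the page contains a single table whose <table ...> and </table> tags each sit on their own line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program.
class TableLineFilter {

    // Collects the lines between the <table ...> start tag and the </table> end tag.
    // Assumption: one table, with start and end tags on their own lines.
    static List<String> extractTable(List<String> lines) {
        List<String> out = new ArrayList<>();
        boolean printingTable = false;
        for (String inputLine : lines) {
            if (printingTable && inputLine.contains("</table>")) {
                printingTable = false;              // end tag seen: turn printing off
            } else if (printingTable) {
                // Inside the table: strip the wrapper tags, keep non-empty lines.
                String cleaned = inputLine.replaceAll("</?(pre|tr|td|b)>", "");
                if (!cleaned.trim().isEmpty()) {
                    out.add(cleaned);
                }
            } else if (inputLine.contains("<table")) {
                printingTable = true;               // start tag seen: turn printing on
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for the lines read from the URL connection.
        List<String> page = Arrays.asList(
            "<html><body>",
            "<table border=\"1\">",
            "<tr><td>",
            "<pre>",
            "<b>2011 Phases of the Moon</b>",
            "Jan 4 9 03 Jan 12 11 31 Jan 19 21 21 Jan 26 12 57",
            "</pre>",
            "</td></tr>",
            "</table>",
            "</body></html>");
        for (String line : extractTable(page)) {
            System.out.println(line);
        }
    }
}
```

In the original program, the same three-way check would go inside the while loop over r.readLine(), printing each cleaned line instead of adding it to a list.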
Alternatively, there are some very good HTML5 parsing libraries, such as the Validator.nu HTML Parser:
https://about.validator.nu/htmlparser/
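It offers SAX, DOM, and other parsing modes. DOM is probably the best fit, because you can easily grab the table's entire subtree and pick out elements one by one.
I have used it at work for web mining of fairly messy data sources, and it made that much more manageable. I'll ask for permission to post a web-scraping guide I wrote back in 2011.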
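The DOM approach described above can be sketched with the JDK's own XML parser as a stand-in, so the example runs without extra libraries. This is only an illustration: it works here because the sample markup is well-formed, whereas real pages usually are not, which is the reason to use an HTML parser instead. My understanding is that the Validator.nu DOM mode also hands back an org.w3c.dom.Document, so the traversal should carry over, but treat that as an assumption to check against its documentation. The names TableDomSketch and firstPreText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original program.
class TableDomSketch {

    // Returns the text of the first <pre> inside the first <table>, or "" if absent.
    // Assumption: the markup is well-formed enough for the JDK XML parser.
    static String firstPreText(String markup) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));

        NodeList tables = doc.getElementsByTagName("table");
        if (tables.getLength() == 0) {
            return "";
        }
        // Work only within the table's subtree.
        Element table = (Element) tables.item(0);
        NodeList pres = table.getElementsByTagName("pre");
        if (pres.getLength() == 0) {
            return "";
        }
        // getTextContent drops nested tags such as <b> and keeps their text.
        return pres.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><table><tr><td><pre>"
            + "<b>2011 Phases of the Moon</b>\n"
            + "Jan 4 9 03 Jan 12 11 31 Jan 19 21 21 Jan 26 12 57"
            + "</pre></td></tr></table></body></html>";
        System.out.println(firstPreText(page));
    }
}
```

Compared with scanning line by line, this does not depend on where the tags happen to fall relative to line breaks.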