When scraping web pages with HtmlUnit, I occasionally see warnings like the following flood the console output:
Jul 24, 2011 5:12:59 PM com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter warning
WARNING: warning: message=[Calling eval() with anything other than a primitive string value
will simply return the value. Is this what you intended?] sourceName=[http://ad.doubleclick.net/adj/N5762.morningstar.com/B5553006.25;sz=728x90;click0=http://ads.morningstar.com/RealMedia/ads/click_lx.ads/www.morningstar.com/quicktake/fund/L34/648978540/TopLeft/Morningstar/JPM_FRpt_728x90_Jul_3827448/Fund_Reports_728x90_content.html/656d5477595534723465554144664a2b?;ord=648978540?] line=[356] lineSource=[null] lineOffset=[0]
Is there a way to make HtmlUnit ignore JavaScript loaded from
> http://ad.*
> http://ads.*
or even just
> http://ad.doubleclick.net
> http://ads.morningstar.com
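Similarly, is there a way to have HtmlUnit interpret only the JavaScript on web pages whose URL contains a particular substring or matches a regular expression?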
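Best answer: You can remove unwanted JavaScript by implementing your own ScriptPreProcessor. Your ScriptPreProcessor can detect the JavaScript you do not want to execute and strip it out before it runs.
I have not tried this myself, but it might work.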
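To make the idea concrete, here is a minimal, untested-against-HtmlUnit sketch of the filtering logic such a pre-processor could delegate to. The class name `AdScriptFilter` and its methods are hypothetical, and the pattern simply encodes the `http://ad.*` / `http://ads.*` hosts from the question. The wiring shown in the comment assumes the HtmlUnit 2.x `ScriptPreProcessor` interface and `WebClient.setScriptPreProcessor`; check the signature against the version you use.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper: decides, from a script's source name (its URL),
 * whether the script should be blanked out before HtmlUnit runs it.
 *
 * Wiring sketch (assumes HtmlUnit 2.x, package com.gargoylesoftware.htmlunit):
 *   webClient.setScriptPreProcessor(
 *       (page, source, name, line, element) -> AdScriptFilter.filter(source, name));
 */
final class AdScriptFilter {

    // Matches URLs whose host starts with "ad." or "ads.",
    // e.g. http://ad.doubleclick.net or http://ads.morningstar.com.
    private static final Pattern BLOCKED =
            Pattern.compile("^https?://ads?\\.", Pattern.CASE_INSENSITIVE);

    private AdScriptFilter() {
    }

    /** Returns true when the script's source name points at a blocked host. */
    static boolean isBlocked(String sourceName) {
        return sourceName != null && BLOCKED.matcher(sourceName).find();
    }

    /** Returns an empty script for blocked sources, otherwise the code unchanged. */
    static String filter(String sourceCode, String sourceName) {
        return isBlocked(sourceName) ? "" : sourceCode;
    }
}
```

Inverting the check (return the source only when the name matches, and an empty string otherwise) would give the allow-list behaviour asked about in the question. Note that this sketch only looks at the start of the source name, so inline scripts, whose names are not plain URLs, pass through untouched.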