I have some experience with Python, but I have never used try/except to catch errors, for lack of formal training.
I am extracting some articles from Wikipedia. I have a list of titles, a few of which do not correspond to any page or search result. I would like the page-retrieval step to simply skip those few titles and keep the script running. Reproducible code follows.
import wikipedia
# This one works.
links = ["CPython"]
test = [wikipedia.page(link, auto_suggest=False) for link in links]
test = [testitem.content for testitem in test]
print(test)
#The sequence breaks down if there is no wikipedia page.
links = ["CPython","no page"]
test = [wikipedia.page(link, auto_suggest=False) for link in links]
test = [testitem.content for testitem in test]
print(test)
The library I am calling uses a method like the one below. Normally editing a library would be very bad practice, but since this is only a one-off data extraction, I am willing to change my local copy of the library to make it work. Edit: I have now included the full function.
def page(title=None, pageid=None, auto_suggest=True, redirect=True, preload=False):
    '''
    Get a WikipediaPage object for the page with title `title` or the pageid
    `pageid` (mutually exclusive).

    Keyword arguments:
    * title - the title of the page to load
    * pageid - the numeric pageid of the page to load
    * auto_suggest - let Wikipedia find a valid page title for the query
    * redirect - allow redirection without raising RedirectError
    * preload - load content, summary, images, references, and links during initialization
    '''
    if title is not None:
        if auto_suggest:
            results, suggestion = search(title, results=1, suggestion=True)
            try:
                title = suggestion or results[0]
            except IndexError:
                # if there is no suggestion or search results, the page doesn't exist
                raise PageError(title)
        return WikipediaPage(title, redirect=redirect, preload=preload)
    elif pageid is not None:
        return WikipediaPage(pageid=pageid, preload=preload)
    else:
        raise ValueError("Either a title or a pageid must be specified")
What can I do to retrieve only the pages that do not raise an error? Perhaps there is a way to filter out every item in the list that produces this error, or any error at all. Returning "NA" or something similar for non-existent pages would be fine, and silently skipping them would also be fine. Thanks!
Best answer: If a page does not exist, the function wikipedia.page raises wikipedia.exceptions.PageError. That is the error you want to catch.
import wikipedia

links = ["CPython", "no page"]
test = []
for link in links:
    try:
        # try to load the wikipedia page
        page = wikipedia.page(link, auto_suggest=False)
        test.append(page)
    except wikipedia.exceptions.PageError:
        # if a PageError was raised, ignore it and continue to the next link
        continue
You have to surround the call to wikipedia.page with a try block, so I'm afraid you can't use a list comprehension directly.
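One way around that limitation is to move the try/except into a small wrapper function, which a list comprehension can then call. The sketch below is illustrative: the helper `fetch_or_default` and the stand-in `fake_page` are hypothetical names, not part of the wikipedia library. In the real script you would pass `wikipedia.page` as `fetch` and `wikipedia.exceptions.PageError` as `exc`:

```python
def fetch_or_default(fetch, title, exc, default="NA"):
    """Call fetch(title); if it raises `exc`, return `default` instead.

    Hypothetical helper: with the wikipedia library, pass wikipedia.page
    as `fetch` and wikipedia.exceptions.PageError as `exc`.
    """
    try:
        return fetch(title)
    except exc:
        return default

# Stand-in for wikipedia.page so the sketch runs without network access.
def fake_page(title):
    pages = {"CPython": "CPython is the reference implementation of Python."}
    return pages[title]  # raises KeyError for unknown titles

links = ["CPython", "no page"]
contents = [fetch_or_default(fake_page, link, KeyError) for link in links]
# contents == ["CPython is the reference implementation of Python.", "NA"]
```

This keeps the comprehension style from the question while giving non-existent pages the "NA" placeholder the asker said would be acceptable.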