How do I convert from a Tree type to a String type in Python with nltk?

August 3, 2019 · 313 views

for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print subtree3.leaves()

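With this code I can extract the leaves of the tree. For one example they are:
[('talk', 'VBG'), ('often', 'RB')]
That is exactly right. Now I want to convert these Tree elements to a string or a list for further processing. How can I do that?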

What I tried

for subtree3 in tree.subtrees():
  if subtree3.label() == 'CLAUSE':
    print(subtree3)
    print subtree3.leaves()
    fo.write(subtree3.leaves())
fo.close()

But it throws an error:

Traceback (most recent call last):
  File "C:\Python27\Association_verb_adverb.py", line 35, in <module>
    fo.write(subtree3.leaves())
TypeError: expected a character buffer object

I just want to store the leaves in a text file.

Best answer

It depends on your NLTK and Python versions. I assume you are referring to the Tree class in the nltk.tree module. If so, read on.

In your code:

> subtree3.leaves() returns a list of tuples,
> fo is a Python file object, and fo.write only accepts a string (str) as its argument
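
The type mismatch can be reproduced without NLTK. The snippet below is a minimal sketch, assuming Python 3 (the traceback above comes from Python 2.7, where the message wording differs). The `leaves` list is a hand-written stand-in for what `subtree3.leaves()` returns, and `io.StringIO` stands in for the file object `fo`.

```python
import io

# Stand-in for subtree3.leaves(): a list of (word, tag) tuples.
leaves = [("talk", "VBG"), ("often", "RB")]

# In-memory text file, used here in place of the real file object fo.
buf = io.StringIO()

try:
    buf.write(leaves)  # passing the list itself is rejected
    rejected = False
except TypeError:
    rejected = True

# Converting the list to a string first is accepted.
buf.write(str(leaves))
written = buf.getvalue()
```

After this runs, `rejected` is true and `written` holds the text form of the list.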

You can write the leaves with fo.write(str(subtree3.leaves())), like this:

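# fo is the file object opened for writing earlier in the script
for subtree3 in tree.subtrees():
    if subtree3.label() == 'CLAUSE':
        print(subtree3)
        print(subtree3.leaves())
        # leaves() returns a list of tuples; write() needs a str
        fo.write(str(subtree3.leaves()))
fo.flush()
fo.close()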

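And don't forget to flush the buffer: close() does this for you, so an explicit flush() matters mainly if the file stays open.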
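
If the file is meant for further processing, a plainer text format than the `str()` of a list may be easier to read back. The following is a rough sketch under a few assumptions: Python 3, a hypothetical helper named `leaves_to_line`, a made-up `word/tag` layout with one clause per line, and hand-written lists standing in for the `leaves()` of each CLAUSE subtree (the second clause is invented purely for illustration).

```python
import os
import tempfile


def leaves_to_line(leaves):
    """Join (word, tag) pairs into a 'word/tag word/tag' string."""
    return " ".join("{}/{}".format(word, tag) for word, tag in leaves)


# Stand-ins for subtree3.leaves() of two CLAUSE subtrees.
clauses = [
    [("talk", "VBG"), ("often", "RB")],
    [("run", "VBG"), ("quickly", "RB")],  # invented example data
]

# Write one clause per line; the with-block closes (and flushes) the file.
path = os.path.join(tempfile.mkdtemp(), "clauses.txt")
with open(path, "w") as fo:
    for leaves in clauses:
        fo.write(leaves_to_line(leaves) + "\n")

# Read the file back to check what was stored.
with open(path) as fo:
    lines = fo.read().splitlines()
```

Using a `with` block means no explicit `flush()` or `close()` call is needed, and each line can later be split on spaces and `/` to recover the pairs.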