Python Example: Word Cloud of a Government Work Report
Problem Analysis
- Get an intuitive understanding of a policy document
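Example Walkthrough
- Basic approach
- Step 1: read the file, then segment the text into words and tidy them up
- Step 2: configure and output the word cloud
- Step 3: review the output and refine iteratively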
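For Step 3, one way to review the output before refining it is to count how often each segmented word occurs. The sketch below is a minimal illustration and not part of the original script: `top_words` is a hypothetical helper, and the sample token list stands in for the output of `jieba.lcut`. It skips single-character tokens on the assumption that they are mostly punctuation and function words.

```python
from collections import Counter

def top_words(tokens, n=10):
    """Return the n most frequent tokens that are longer than one character."""
    counts = Counter(tok for tok in tokens if len(tok.strip()) > 1)
    return counts.most_common(n)

# Hand-written sample standing in for jieba.lcut(t)
sample = ["发展", "的", "中国", "发展", ",", "人民", "中国", "发展"]
for word, freq in top_words(sample, n=3):
    print(word, freq)
```

Words that dominate this list without carrying much meaning are candidates for removal in the next iteration.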
Code:
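# GovRptWordCloudv1.py
import jieba
import wordcloud

# Read the whole report as UTF-8 text
with open("新时代中国特色社会主义.txt", "r", encoding="utf-8") as f:
    t = f.read()

# Segment the Chinese text and join the tokens with spaces
# so that WordCloud can tell the words apart
ls = jieba.lcut(t)
txt = " ".join(ls)

# A font with Chinese glyphs is required (msyh.ttc is Microsoft YaHei)
w = wordcloud.WordCloud(
    font_path="msyh.ttc",
    width=1000,
    height=700,
    background_color="white",
)
w.generate(txt)
w.to_file("grwordcloud.png")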
Result:
When creating the WordCloud object in this code, add max_words=15.
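The change can be sketched as follows. This is an illustrative rearrangement rather than the original script: the `options` dictionary and the `build_wordcloud` helper are hypothetical names, and `wordcloud` is imported inside the function so the settings can be inspected without the library installed. `max_words` is a documented `WordCloud` parameter that caps the number of words drawn (its default is 200).

```python
# Same settings as GovRptWordCloudv1.py, plus max_words
options = {
    "font_path": "msyh.ttc",
    "width": 1000,
    "height": 700,
    "background_color": "white",
    "max_words": 15,  # draw at most the 15 most frequent words
}

def build_wordcloud(opts=options):
    """Create a WordCloud from keyword options (hypothetical helper)."""
    import wordcloud  # third-party package, imported lazily
    return wordcloud.WordCloud(**opts)

# Usage, with txt being the space-joined jieba output:
# w = build_wordcloud()
# w.generate(txt)
# w.to_file("grwordcloud.png")
```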
Result:
A Word Cloud with a Shape
Code:
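# GovRptWordCloudv2.py
import jieba
import numpy as np
import wordcloud
from PIL import Image

# scipy.misc.imread was removed in SciPy 1.2; load the mask with Pillow instead
mask = np.array(Image.open("chinamap.jpg"))
excludes = set()  # placeholder for words to exclude; not used below

with open("新时代中国特色社会主义.txt", "r", encoding="utf-8") as f:
    t = f.read()

ls = jieba.lcut(t)
txt = " ".join(ls)

# Pure-white areas of the mask stay empty; words fill the rest.
# With a mask, the canvas takes the mask's size and width/height are ignored.
w = wordcloud.WordCloud(
    width=1000,
    height=700,
    background_color="white",
    font_path="msyh.ttc",
    mask=mask,
)
w.generate(txt)
w.to_file("grwordcloudm.png")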
Result:
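The `excludes` variable in GovRptWordCloudv2.py is defined but never used. One plausible way to use it when refining the result, offered here as an assumption rather than the author's method, is to drop unwanted words after segmentation and before joining. `filter_tokens` is a hypothetical helper, the excluded words are example values only, and the sample list stands in for the output of `jieba.lcut`.

```python
def filter_tokens(tokens, excludes):
    """Drop excluded words and whitespace-only tokens, then join with spaces."""
    kept = [tok for tok in tokens if tok.strip() and tok not in excludes]
    return " ".join(kept)

excludes = {"的", "和"}  # example values only
sample = ["中国", "的", "发展", "和", "人民", "\n"]
txt = filter_tokens(sample, excludes)
print(txt)  # 中国 发展 人民
```

`WordCloud` also accepts a `stopwords` argument, which may serve the same purpose without a separate filtering step.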