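Description
Many thanks to my teacher nick for his guidance.
Teacher's blog: https://home.cnblogs.com/u/nickchen121/
Project links
1. Gitee: https://gitee.com/pythonywy/html_to_md (because of Gitee's upload file size limit, the exe there is not the latest; the latest version is on GitHub)
2. GitHub: https://github.com/a568972484/html_to_md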
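Features
- Feature 1: batch-crawl the dictionary of all posts from the cnblogs (博客园) home page and save it as a JSON file, and convert all of the posts to MD files
- Feature 2: enter the URL of a specific post to convert its content to MD and save it
- Feature 3: crawl the posts under a given subdirectory (category) of a blog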
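As a rough illustration of the conversion and saving steps listed above, here is a minimal sketch using only the Python standard library. It is not the code in 'core_code.py'; the names `SimpleHtmlToMd`, `html_to_md`, `save_posts`, the `index.json` file name, and the `post_N.md` naming scheme are all hypothetical, and only a small subset of HTML tags is handled.

```python
import json
import tempfile
from html.parser import HTMLParser
from pathlib import Path


class SimpleHtmlToMd(HTMLParser):
    """Convert a small HTML subset (headings, paragraphs, list items,
    bold, italics, inline code, links) to Markdown. Hypothetical helper."""

    INLINE = {"strong": "**", "b": "**", "em": "*", "i": "*", "code": "`"}
    HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

    def __init__(self):
        super().__init__()
        self.parts = []        # Markdown fragments, joined at the end
        self.href_stack = []   # link targets of the currently open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.parts.append("\n\n" + "#" * int(tag[1]) + " ")
        elif tag == "p":
            self.parts.append("\n\n")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag == "a":
            self.href_stack.append(dict(attrs).get("href") or "")
            self.parts.append("[")
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])

    def handle_endtag(self, tag):
        if tag == "a" and self.href_stack:
            self.parts.append("](" + self.href_stack.pop() + ")")
        elif tag in self.INLINE:
            self.parts.append(self.INLINE[tag])

    def handle_data(self, data):
        self.parts.append(data)


def html_to_md(html):
    """Return a Markdown rendering of an HTML fragment."""
    parser = SimpleHtmlToMd()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def save_posts(posts, out_dir):
    """Write each post as an MD file plus a JSON index of title -> file name.

    `posts` is assumed to map a post title to its HTML body.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    index = {}
    for n, (title, html) in enumerate(posts.items(), start=1):
        name = f"post_{n}.md"
        (out / name).write_text(html_to_md(html), encoding="utf-8")
        index[title] = name
    (out / "index.json").write_text(
        json.dumps(index, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return index


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(save_posts({"Hello": "<h2>Hello</h2><p>A <b>bold</b> word.</p>"}, tmp))
```

The sketch numbers the output files instead of naming them after post titles, because titles can contain characters that are not valid in file names; the JSON index keeps the mapping back to each title.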
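Because blogs differ from one another, the program is not equally robust on every blog; you may need to make appropriate modifications for a particular blog before the program works on it.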
The program deliberately does not use multiprocessing or multithreading, so that it does not add to the load on cnblogs.
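A minimal sketch of what crawling one page at a time can look like is shown below. The function names, the `User-Agent` value, and the one-second pause between requests are assumptions made for illustration; the README does not say whether the actual program waits between requests.

```python
import time
import urllib.request


def fetch(url, timeout=10):
    """Download one page and return its text (assumes the page is UTF-8)."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl_sequentially(urls, delay=1.0, fetch_page=fetch, sleep=time.sleep):
    """Fetch URLs strictly one after another, pausing between requests.

    `fetch_page` and `sleep` are injectable so the loop can be tried
    without touching the network.
    """
    pages = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # assumed pause; keeps the request rate low
        pages[url] = fetch_page(url)
    return pages
```

Because only one request is ever in flight, the total run time grows linearly with the number of posts, which is the trade-off for keeping the load on the site low.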
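Please do not use the crawled content for commercial purposes.
The original intent is to help bloggers download the posts they have already published to their local machine for easy editing.
Update log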
2019.7.20
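Added a feature
Feature description: crawl the posts under a given subdirectory (category) of a blog
Upgraded to version 5.0: added a graphical interface packaged as an exe program and improved robustness.
Just download the exe and run it.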
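Tips:
The program may be blocked by rogue antivirus software; if that happens, please restore it yourself.
It is absolutely virus-free; no malicious content has been added.
When you run Feature 1 or Feature 3, the program may appear to freeze if the blog has many posts. I do not yet understand the program well enough to have found a fix, so please bear with me and do not close the program; the data will appear automatically when it finishes.
The modules used here are all self-taught, so my understanding of some of them may be incomplete; please bear with me. If you need the extraction password for the source code archive, just message me privately.
The core code is in 'core_code.py', and it is fully commented.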
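One common way to keep a window responsive during a long job, offered here only as a possible direction for the freezing described in the tips, is to run the job in a single background thread and hand the result back through a queue. The sketch below is not taken from this project, the helper name `run_in_background` is hypothetical, and which GUI toolkit the exe uses is not stated, so the GUI side is left out. A single worker still issues its requests one at a time, so it would not raise the load on cnblogs.

```python
import queue
import threading


def run_in_background(task, *args):
    """Start `task(*args)` in one worker thread.

    Returns the thread and a queue that will receive either
    ("ok", result) or ("error", exception) when the task ends.
    """
    results = queue.Queue()

    def worker():
        try:
            results.put(("ok", task(*args)))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread, results
```

A GUI would then poll the queue with `results.get_nowait()` from a periodic timer callback, catching `queue.Empty` while the job is still running, instead of blocking its event loop on the crawl.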
To emphasize again:
This program is intended only as a learning aid.
Gitee name: YWY
Gitee link: https://gitee.com/pythonywy
github_id:a568972484
github_url:https://github.com/a568972484
Author's blog: 小小咸鱼ywy
Blog link: https://www.cnblogs.com/pythonywy
I hope to hear about your experience with the program so that I can make further improvements. Thank you.
Description
Features
- Feature 1: batch-crawl the dictionary of all posts from the cnblogs home page and save it as a JSON file, and convert all of the posts to MD files
- Feature 2: enter the URL of a specific post to convert its content to MD and save it
Because blogs differ from one another, the program is not equally robust on every blog; you may need to make appropriate changes for a particular blog before using it.
The program does not use multiprocessing or multithreading, to avoid adding to the load on cnblogs.
Please do not use the crawled content for commercial purposes.
The original intent is to help bloggers download their published posts to their local machine for easy editing.
Run ‘run.py’ to use the program.
Update log
2019.7.20
Added a feature
Feature description: crawl the posts under a given subdirectory (category) of a blog
Version 5.0: added a graphical interface packaged as an exe program and improved robustness.
Just download the exe and run it.
Tips:
The program may be blocked by rogue antivirus software; if so, please restore it yourself.
It is absolutely virus-free; no malicious content has been added.
Feature 1 and Feature 3 may appear to freeze when the blog has many posts. I do not yet understand the program well enough to have found a fix, so please bear with me and do not close the program; the data will appear automatically when it finishes.
The modules are all self-taught, so my understanding of some of them may be incomplete; please bear with me. If you need the extraction password for the source code archive, message me privately.
The core code is in ‘core_code.py’, and it is fully commented.
To emphasize again: this program is intended only as a learning aid.
Gitee name: YWY
Gitee link: https://gitee.com/pythonywy
Github_id: a568972484
github_url:https://github.com/a568972484
Author’s blog: 小小咸鱼ywy (little salted fish ywy)
Blog link: https://www.cnblogs.com/pythonywy
I hope to hear about your experience with the program so that I can make further improvements. Thanks.