2022-09-19
[Original] [Crawler Series] A quick look at fetching Zhihu's hottest topics and their descriptions
I've recently been looking into web scraping and running some small experiments with common tools. The code here is quite basic, so experienced readers can skip ahead.
Notes
This post only implements one small, basic feature: fetching Zhihu's daily/monthly hottest questions. Looking at the page source, each question entry looks like this:

<a class="question_link" href="/question/30359991/answer/401771701" target="_blank" data-id="4326359" data-za-element-name="Title">心算可以算出羽毛球的落点吗?</a>

A quick analysis shows that we only need to grab all the <a> tags with class="question_link".
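The idea above can be checked offline against a tiny HTML fragment before hitting the live site. The markup below is a made-up sample in the same shape as the page source quoted earlier:

```python
from bs4 import BeautifulSoup

# A made-up fragment shaped like the Zhihu listing markup shown above.
html = '''
<div>
  <a class="question_link" href="/question/30359991/answer/401771701"
     target="_blank" data-id="4326359">心算可以算出羽毛球的落点吗?</a>
  <a class="other_link" href="/question/1">an unrelated link</a>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')
# class_ with a single class name matches any tag carrying that class.
titles = [a.text for a in soup.find_all('a', class_='question_link')]
print(titles)  # ['心算可以算出羽毛球的落点吗?']
```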
Code

# coding:utf-8
#!/usr/bin/python
# @Time     : 18-5-29 下午2:12
# @Author   : Hao Chuang
# @Wechat   : nianhuaiju
# @File     : zhihu-hot.py
# @Software : PyCharm Community Edition

import requests
from bs4 import BeautifulSoup

# NOTE: the original URLs were garbled when this post was copied around;
# the Zhihu "explore" addresses below are placeholders, not necessarily
# the exact ones the author used.
url = 'https://www.zhihu.com/explore'                    # hot / recommended
url_day = 'https://www.zhihu.com/explore#daily-hot'      # daily hottest
url_month = 'https://www.zhihu.com/explore#monthly-hot'  # monthly hottest

headers = {'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                         '(KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36'}

def printstar(num):
    print('*' * num)

def url_set(url_):
    # Fetch the page and parse it into a soup object.
    soup = BeautifulSoup(requests.get(url_, headers=headers).text, 'html.parser')
    return soup

def show_hot(url_):
    # The original post repeated this body in three near-identical
    # functions; it is factored out here without changing behavior.
    soup = url_set(url_)
    # question titles
    for link in soup.find_all('a', class_='question_link'):
        print(link.text)
    printstar(50)
    # vote counts
    for hotnum in soup.find_all('a', class_='zm-item-vote-count js-expand js-vote-count'):
        print(hotnum.text)
    printstar(50)
    # authors
    for author in soup.find_all('a', class_='author-link'):
        print(author.text)
    printstar(50)
    # user type / user description (the original used the same selector for both)
    for usertype in soup.find_all('span', class_='badge-summary'):
        print(usertype.text)
    printstar(50)
    # hot topic summaries
    for contextdesc in soup.find_all('div', class_='zh-summary summary clearfix'):
        print(contextdesc.text)
    printstar(50)
    # comment counts
    for commentnum in soup.find_all('a', class_='meta-item toggle-comment js-toggleCommentBox'):
        print(commentnum.text)
    printstar(50)

def get_url_hot():
    show_hot(url)

def get_url_day_hot():
    show_hot(url_day)

def get_url_month_hot():
    show_hot(url_month)

if __name__ == '__main__':
    printstar(100)
    get_url_hot()
    printstar(100)
    get_url_day_hot()
    printstar(100)
    get_url_month_hot()
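One caveat worth noting (my own aside, not from the original post): passing a multi-class string such as 'zm-item-vote-count js-expand js-vote-count' to class_ only matches elements whose class attribute is exactly that string, in that order. soup.select() with a CSS selector matches regardless of class order and is often more robust:

```python
from bs4 import BeautifulSoup

# Same classes as the crawler targets, but in a different order.
html = '<a class="js-expand zm-item-vote-count js-vote-count">123</a>'
soup = BeautifulSoup(html, 'html.parser')

# Exact-string class matching fails here because the order differs.
exact = soup.find_all('a', class_='zm-item-vote-count js-expand js-vote-count')
print(len(exact))  # 0

# A CSS selector matches regardless of class order.
css = soup.select('a.zm-item-vote-count.js-expand.js-vote-count')
print(css[0].text)  # 123
```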
Run
I use PyCharm, so I just press Ctrl+Shift+F10 to run it. Of course, you can also run it from the command line: python zhihu.py
The results are as follows (output screenshot omitted):
This is just a fairly simple example; there is still room to optimize it, and I'll keep working on it later.
Give someone roses, and the fragrance lingers on your hand.
We once longed for the waves of fate, only to discover in the end that life's most beautiful scenery is inner calm and composure... We once craved the world's approval, only to learn in the end that the world is our own and has nothing to do with anyone else. - Yang Jiang