spider163's Introduction

spider163

MIT License pyversions pyversions Build Status

The easiest-to-use NetEase Cloud Music crawler on GitHub

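Installation

  • Step 1: Set the SPIDER163_PATH environment variable; it defaults to $HOME/spider163.
  • Step 2: Copy the default configuration file spider163.conf into SPIDER163_PATH and configure the database.
  • Step 3: pip install spider163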
  • spider163 --help

Historical documentation

Usage guide

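$ spider163 initdb
$ # Create the database tables automatically from the database settings in the config file; use resetdb to delete all data
$ spider163 resetdb
$ # Rebuild the database tables
$ spider163 updatedb
$ # Reset expired data (based on its timestamp) so that it is crawled again
$ spider163 classify
$ # List the known music genres
$ spider163 playlist
$ # Downloads all recommended playlists (1000+) by default; you can also select a page (-p=1) or a genre (--classify=小语种, i.e. minority languages; the default is all genres)
$ spider163 mp3 --playlist=2033391777
$ # Downloads every licensed song in the given playlist by default
$ spider163 music
$ # Downloads song data for 10 playlists by default; set the loop count (-c=2) to download the songs of 10 * c playlists
$ spider163 comment
$ # By default, picks one not-yet-downloaded song from the database at random and downloads its comments; use -c to set the number of songs and -s to force a specific song ID
$ # spider163 comment -c 10 | spider163 comment -s 209115
$ spider163 lyric --count=10
$ # Fetch the lyrics of 10 songs; use --song to fetch one specific song by ID
$ spider163 search -q="林依晨"
$ # Search (work in progress; only song search is supported for now)
$ spider163 get -s 209115
$ # Show a song's basic information, lyrics, and hot comments
$ spider163 get --playlist 922064582
$ # Show a playlist's basic information, songs, and so on
$ spider163 doc --playlist 922064582
$ # Export playlist/song information to a Word document
$ spider163 top50 --playlist 922064582 --username=xxx --password=xxx
$ # Create a TOP 50 playlist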

TODO

Follow the WeChat official account: 程天写代码

guojingcoooool

spider163's People

Contributors

chengyumeng, dependabot[bot], iawia002

spider163's Issues

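MP3 download improvement suggestions

The MP3 download stops as soon as an error occurs.
Possible causes of the error: a song in the playlist has to be purchased individually (even members have to buy it), or it is unlicensed (shown greyed out in the UI).
So I suggest:
1. Skip the song that failed and continue with the next one.
2. Show how many songs the playlist contains before downloading. After the download finishes, report how many succeeded and how many failed.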
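
The two suggestions above could be sketched roughly as follows. This is a minimal illustration, not spider163's actual download code: `download_all`, `fetch`, and the `(title, url)` pairs are hypothetical names, and `fetch` stands in for whatever call performs the HTTP download.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def download_all(
    songs: Iterable[Tuple[str, Optional[str]]],
    fetch: Callable[[str], bytes],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try every song and record failures instead of aborting the run."""
    song_list = list(songs)
    print(f"Playlist contains {len(song_list)} songs")

    succeeded: List[str] = []
    failed: List[Tuple[str, str]] = []
    for title, url in song_list:
        try:
            if not url:
                # Assumption: paid-only or unlicensed tracks come back without a URL.
                raise ValueError("no download URL available")
            fetch(url)
            succeeded.append(title)
        except Exception as exc:
            # Remember the failure and move on to the next song.
            failed.append((title, str(exc)))

    print(f"Downloaded {len(succeeded)}, failed {len(failed)}")
    return succeeded, failed
```

Catching the error per song keeps one unavailable track from ending the whole playlist download, and the returned lists make the final success/failure summary straightforward.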

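Stopped working today?

Version 2.7.5 worked fine right after it was released, but today I get the message below. Is this a problem with my configuration, or has NetEase upgraded its anti-crawling measures?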

[li@localhost ~]$ sudo -i
[sudo] li 的密码:
[root@localhost ~]# spider163 mp3 --playlist=2048302032 --path ./mp3/
正在下载歌曲 卷珠帘-霍尊.mp3
正在下载歌曲 父亲的草原母亲的河-云飞.mp3
执行抓取任务遭遇配置异常: HTTPConnectionPool(host='m10.music.126.net', port=80): Max retries exceeded with url: /20180331093453/859760a1fc5dd33453fa3f5667f8c8fa/ymusic/de9e/8956/813e/e59af448d057c9d8cbacf29fa257bd78.mp3 (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x4ac7810>: Failed to establish a new connection: [Errno 111] Connection refused',))
[root@localhost ~]#

MariaDB is not supported: init fails with the latest MariaDB image

自动生成数据库表出现问题: (_mysql_exceptions.ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near 'over VARCHAR(255) DEFAULT 'N', \n\tPRIMARY KEY (id)\n)' at line 8") [SQL: u"\nCREATE TABLE playlist163 (\n\tid INTEGER NOT NULL AUTO_INCREMENT, \n\ttitle VARCHAR(5000) DEFAULT 'System Title', \n\tlink VARCHAR(255) DEFAULT 'No Link', \n\tcnt INTEGER DEFAULT '-1', \n\tdsc VARCHAR(255) DEFAULT 'No Description', \n\tcreate_time TIMESTAMP NULL DEFAULT now(), \n\tover VARCHAR(255) DEFAULT 'N', \n\tPRIMARY KEY (id)\n)\n\n"]

It works with MySQL; please add support for MariaDB.
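
The error message points at the `over` column. One likely explanation, offered here as an assumption rather than a confirmed diagnosis, is that `OVER` is a reserved word in recent MariaDB releases (it arrived with window functions), so the unquoted column name is rejected. Below is a minimal sketch of generating the same DDL with backtick-quoted identifiers. `quote_ident` and `create_table_sql` are hypothetical helpers, and the column list is copied from the error message above.

```python
from typing import List, Tuple


def quote_ident(name: str) -> str:
    """Backtick-quote a MySQL/MariaDB identifier, doubling embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_table_sql(table: str, columns: List[Tuple[str, str]], primary_key: str) -> str:
    """Build a CREATE TABLE statement in which every identifier is quoted."""
    body = [f"{quote_ident(col)} {spec}" for col, spec in columns]
    body.append(f"PRIMARY KEY ({quote_ident(primary_key)})")
    return f"CREATE TABLE {quote_ident(table)} (\n  " + ",\n  ".join(body) + "\n)"


# Column definitions as they appear in the reported SQL.
PLAYLIST_COLUMNS = [
    ("id", "INTEGER NOT NULL AUTO_INCREMENT"),
    ("title", "VARCHAR(5000) DEFAULT 'System Title'"),
    ("link", "VARCHAR(255) DEFAULT 'No Link'"),
    ("cnt", "INTEGER DEFAULT '-1'"),
    ("dsc", "VARCHAR(255) DEFAULT 'No Description'"),
    ("create_time", "TIMESTAMP NULL DEFAULT now()"),
    ("over", "VARCHAR(255) DEFAULT 'N'"),  # quoted so it cannot clash with a keyword
]

print(create_table_sql("playlist163", PLAYLIST_COLUMNS, "id"))
```

Quoting every identifier, rather than only the offending one, avoids having to track which words each server version reserves.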

Newbie question: why does the log keep showing 901 when crawling a playlist?

Hello!
I would like to ask why, when I run this command
python capture.py --module=music --config=spider163.conf --source=cmd --playlist=720308660
the log keeps showing error 901?
INFO:root:Error 901 : http://music.163.com/playlist?id=720308660

Thanks

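Can it crawl my own listening history?

I find NetEase Cloud Music's annual report poor and not what I need. I would like to get my monthly, quarterly, yearly, and all-time listening history so that I can build reports around whatever interests me.
Features I hope can be added: fetching a user's listening history as raw data, including the time each song was played, the artist, album, lyrics, genre, and so on.

Advanced data: a summary similar to NetEase Cloud Music's, with a song ranking, an album ranking (every song in the album should have been played at least once; advanced: a ranking of the songs within the album), an artist ranking (advanced: a ranking of that artist's songs), a genre ranking (advanced: a ranking of songs within the same genre), a ranking of continuous single-track repeats (uninterrupted, with start and end times), and a lyrics ranking (advanced: showing the song each lyric belongs to).
Many thanks~ (^__^) Hehe...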

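Account login question

Hi, I see that settings in the code has an account and password configured, but they are not actually used when crawling comments. So they really are unused, right? The POST is simply submitted according to the encryption scheme?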

Cannot install spider163

spider163 cannot be installed. The terminal shows that spider163 2.4.11 through 2.7.6 depends on pprint==0.1. This is on Python 3.9.5.

Configuration exception when crawling comments

正在执行抓取歌曲 186016 热门评论计划
执行抓取任务遭遇配置异常: (_mysql_exceptions.ProgrammingError) (1146, "Table 'spider.comment163' doesn't exist") [SQL: u'DELETE FROM comment163 WHERE comment163.song_id = %s'] [parameters: (186016,)]

I don't know whether this can be fixed manually by creating the table in MySQL by hand. If it can, what are the table's column names and data types?

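Hello, I have deployed it locally, but running spider163 --help keeps failing with the error below, and it is the same no matter how many times I change the configuration file.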

C:\WINDOWS\system32>spider163 --help
Traceback (most recent call last):
  File "H:\python\Python27\Scripts\spider163-script.py", line 11, in <module>
    load_entry_point('spider163==2.5.4', 'console_scripts', 'spider163')()
  File "h:\python\python27\lib\site-packages\pkg_resources\__init__.py", line 565, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "h:\python\python27\lib\site-packages\pkg_resources\__init__.py", line 2631, in load_entry_point
    return ep.load()
  File "h:\python\python27\lib\site-packages\pkg_resources\__init__.py", line 2291, in load
    return self.resolve()
  File "h:\python\python27\lib\site-packages\pkg_resources\__init__.py", line 2297, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "h:\python\python27\lib\site-packages\spider163\bin\cli.py", line 9, in <module>
    from spider163.utils import pysql
  File "h:\python\python27\lib\site-packages\spider163\utils\pysql.py", line 10, in <module>
    from spider163 import settings
  File "h:\python\python27\lib\site-packages\spider163\settings.py", line 5, in <module>
    from spider163.utils import config
  File "h:\python\python27\lib\site-packages\spider163\utils\config.py", line 11, in <module>
    PATH = os.environ.get("HOME") + "/spider163"
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'

》》》》》 spider163.conf
[core]
db=mysql://root:985211yyg@localhost/Local?charset=utf8mb4
port=3306

》》》》》SPIDER163_PATH
H:\python\workspace\song163
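
The traceback shows `config.py` concatenating `os.environ.get("HOME")` with a string. On Windows `HOME` is often unset, so the lookup returns `None` and the `+` raises the `TypeError`, apparently regardless of the SPIDER163_PATH value the reporter set. A minimal sketch of a more defensive lookup follows. `resolve_spider163_path` is a hypothetical helper, not part of spider163, and the fallback order is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_spider163_path(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the spider163 working directory without assuming HOME is set."""
    env = os.environ if environ is None else environ

    # Prefer the explicit setting described in the installation steps.
    explicit = env.get("SPIDER163_PATH")
    if explicit:
        return explicit

    # HOME is typical on Unix, USERPROFILE on Windows; expanduser is the last resort.
    home = env.get("HOME") or env.get("USERPROFILE") or os.path.expanduser("~")
    return os.path.join(home, "spider163")


print(resolve_spider163_path())
```

Until such a fallback exists in the package, setting a `HOME` environment variable on Windows before running the command may work around the error; this is a guess based on the traceback alone.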

Configuration exception while running the crawl task: 'comments'

OS: CentOS 7.5
Environment: Python 2.7.5
Version: spider163 2.7.6

spider163 get -s 1311319824 --path /download/music --debug

2019-01-03 15:41:29,520 (DEBUG) cement.core.foundation : laying cement for the 'Spider163' application
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'pre_setup'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'post_setup'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'pre_run'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'post_run'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'pre_argument_parsing'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'post_argument_parsing'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'pre_close'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'post_close'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'signal'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'pre_render'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : defining hook 'post_render'
2019-01-03 15:41:29,520 (DEBUG) cement.core.hook : registering hook 'add_handler_override_options' from cement.core.foundation into hooks['post_setup']
2019-01-03 15:41:29,521 (DEBUG) cement.core.hook : registering hook 'handler_override' from cement.core.foundation into hooks['post_argument_parsing']
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'extension' (IExtension)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'log' (ILog)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'config' (IConfig)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'mail' (IMail)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'plugin' (IPlugin)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'output' (IOutput)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'argument' (IArgument)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'controller' (IController)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : defining handler type 'cache' (ICache)
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : registering handler '<class 'cement.core.extension.CementExtensionHandler'>' into handlers['extension']['cement']
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.VersionController'>' into handlers['controller']['base']
2019-01-03 15:41:29,521 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.DatabaseController'>' into handlers['controller']['database']
2019-01-03 15:41:29,522 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.SpiderController'>' into handlers['controller']['spider']
2019-01-03 15:41:29,522 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.QueryController'>' into handlers['controller']['query']
2019-01-03 15:41:29,522 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.WebController'>' into handlers['controller']['web']
2019-01-03 15:41:29,522 (DEBUG) cement.core.handler : registering handler '<class 'spider163.bin.cli.AuthController'>' into handlers['controller']['auth']
2019-01-03 15:41:29,522 (DEBUG) cement.core.foundation : now setting up the 'Spider163' application
2019-01-03 15:41:29,522 (DEBUG) cement.core.foundation : setting up Spider163.extension handler
2019-01-03 15:41:29,522 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_dummy' framework extension
2019-01-03 15:41:29,523 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_dummy.DummyOutputHandler'>' into handlers['output']['dummy']
2019-01-03 15:41:29,523 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_dummy.DummyMailHandler'>' into handlers['mail']['dummy']
2019-01-03 15:41:29,523 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_smtp' framework extension
2019-01-03 15:41:29,526 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_smtp.SMTPMailHandler'>' into handlers['mail']['smtp']
2019-01-03 15:41:29,526 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_plugin' framework extension
2019-01-03 15:41:29,527 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_plugin.CementPluginHandler'>' into handlers['plugin']['cement']
2019-01-03 15:41:29,527 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_configparser' framework extension
2019-01-03 15:41:29,528 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_configparser.ConfigParserConfigHandler'>' into handlers['config']['configparser']
2019-01-03 15:41:29,528 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_logging' framework extension
2019-01-03 15:41:29,528 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_logging.LoggingLogHandler'>' into handlers['log']['logging']
2019-01-03 15:41:29,528 (DEBUG) cement.core.extension : loading the 'cement.ext.ext_argparse' framework extension
2019-01-03 15:41:29,530 (DEBUG) cement.core.handler : registering handler '<class 'cement.ext.ext_argparse.ArgparseArgumentHandler'>' into handlers['argument']['argparse']
2019-01-03 15:41:29,530 (DEBUG) cement.core.foundation : adding signal handler <function cement_signal_handler at 0x7f233f892140> for signal 15
2019-01-03 15:41:29,530 (DEBUG) cement.core.foundation : adding signal handler <function cement_signal_handler at 0x7f233f892140> for signal 2
2019-01-03 15:41:29,530 (DEBUG) cement.core.foundation : adding signal handler <function cement_signal_handler at 0x7f233f892140> for signal 1
2019-01-03 15:41:29,530 (DEBUG) cement.core.foundation : setting up Spider163.config handler
2019-01-03 15:41:29,530 (DEBUG) cement.core.config : config file '/etc/Spider163/Spider163.conf' does not exist, skipping...
2019-01-03 15:41:29,531 (DEBUG) cement.core.config : config file '/root/.Spider163.conf' does not exist, skipping...
2019-01-03 15:41:29,531 (DEBUG) cement.core.config : config file '/root/.Spider163/config' does not exist, skipping...
2019-01-03 15:41:29,531 (DEBUG) cement.core.foundation : setting up Spider163.mail handler
2019-01-03 15:41:29,531 (DEBUG) cement.core.handler : merging config defaults from '<cement.ext.ext_dummy.DummyMailHandler object at 0x7f2338835a10>' into section 'mail.dummy'
2019-01-03 15:41:29,531 (DEBUG) cement.core.foundation : no cache handler defined, skipping.
2019-01-03 15:41:29,531 (DEBUG) cement.core.foundation : setting up Spider163.log handler
2019-01-03 15:41:29,531 (DEBUG) cement.core.handler : merging config defaults from '<cement.ext.ext_logging.LoggingLogHandler object at 0x7f2338835bd0>' into section 'log.logging'
2019-01-03 15:41:29,532 (DEBUG) cement.ext.ext_logging : logging initialized for 'Spider163' using LoggingLogHandler
2019-01-03 15:41:29,532 (DEBUG) cement.core.foundation : setting up Spider163.plugin handler
2019-01-03 15:41:29,532 (DEBUG) cement.ext.ext_plugin : plugin config dir /etc/Spider163/plugins.d does not exist.
2019-01-03 15:41:29,532 (DEBUG) cement.ext.ext_plugin : plugin config dir /root/.Spider163/plugins.d does not exist.
2019-01-03 15:41:29,532 (DEBUG) cement.core.foundation : setting up Spider163.arg handler
2019-01-03 15:41:29,533 (DEBUG) cement.core.foundation : setting up Spider163.output handler
2019-01-03 15:41:29,533 (DEBUG) cement.core.foundation : setting up application controllers
2019-01-03 15:41:29,534 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f2338846310>' into section 'controller.base'
2019-01-03 15:41:29,534 (DEBUG) cement.core.hook : running hook 'post_setup' (<function add_handler_override_options at 0x7f233f892050>) from cement.core.foundation
2019-01-03 15:41:29,534 (DEBUG) cement.core.foundation : running pre_run hook
2019-01-03 15:41:29,534 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.VersionController object at 0x7f2338846310>
2019-01-03 15:41:29,534 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.WebController object at 0x7f23388463d0>' into section 'controller.web'
2019-01-03 15:41:29,534 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.WebController object at 0x7f23388463d0>
2019-01-03 15:41:29,534 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.DatabaseController object at 0x7f2338846450>' into section 'controller.database'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.SpiderController object at 0x7f2338846450>' into section 'controller.spider'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.AuthController object at 0x7f2338846450>' into section 'controller.auth'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f2338846450>' into section 'controller.base'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.QueryController object at 0x7f2338846450>' into section 'controller.query'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.DatabaseController object at 0x7f2338846450>' into section 'controller.database'
2019-01-03 15:41:29,535 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.DatabaseController object at 0x7f2338846450>
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.WebController object at 0x7f2338846490>' into section 'controller.web'
2019-01-03 15:41:29,535 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.SpiderController object at 0x7f2338846490>' into section 'controller.spider'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.AuthController object at 0x7f2338846490>' into section 'controller.auth'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f2338846490>' into section 'controller.base'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.QueryController object at 0x7f2338846490>' into section 'controller.query'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.SpiderController object at 0x7f2338846490>' into section 'controller.spider'
2019-01-03 15:41:29,536 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.SpiderController object at 0x7f2338846490>
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.WebController object at 0x7f2338846550>' into section 'controller.web'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.DatabaseController object at 0x7f2338846550>' into section 'controller.database'
2019-01-03 15:41:29,536 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.AuthController object at 0x7f2338846550>' into section 'controller.auth'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f2338846550>' into section 'controller.base'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.QueryController object at 0x7f2338846550>' into section 'controller.query'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.AuthController object at 0x7f2338846550>' into section 'controller.auth'
2019-01-03 15:41:29,537 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.AuthController object at 0x7f2338846550>
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.WebController object at 0x7f23388465d0>' into section 'controller.web'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.DatabaseController object at 0x7f23388465d0>' into section 'controller.database'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.SpiderController object at 0x7f23388465d0>' into section 'controller.spider'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f23388465d0>' into section 'controller.base'
2019-01-03 15:41:29,537 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.QueryController object at 0x7f23388465d0>' into section 'controller.query'
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.QueryController object at 0x7f23388465d0>' into section 'controller.query'
2019-01-03 15:41:29,538 (DEBUG) cement.core.controller : collecting arguments/commands for <spider163.bin.cli.QueryController object at 0x7f23388465d0>
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.WebController object at 0x7f2338846650>' into section 'controller.web'
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.DatabaseController object at 0x7f2338846650>' into section 'controller.database'
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.SpiderController object at 0x7f2338846650>' into section 'controller.spider'
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.AuthController object at 0x7f2338846650>' into section 'controller.auth'
2019-01-03 15:41:29,538 (DEBUG) cement.core.handler : merging config defaults from '<spider163.bin.cli.VersionController object at 0x7f2338846650>' into section 'controller.base'
2019-01-03 15:41:29,540 (DEBUG) cement.core.hook : running hook 'post_argument_parsing' (<function handler_override at 0x7f233f8920c8>) from cement.core.foundation
执行抓取任务遭遇配置异常: 'comments'
2019-01-03 15:41:31,600 (DEBUG) cement.core.foundation : closing the Spider163 application

[2019-01-03 07:49:19.775868] ERROR: : 解析歌曲评论的时候出现问题:'comments' 歌曲ID:1311319824 页码:1

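Configuration exception while running the crawl task

When crawling my own playlist I ran into this problem, and I don't know what it means. Every other playlist works; only this one fails.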
spider163 get --playlist 127473345
执行抓取任务遭遇配置异常: 'NoneType' object has no attribute 'encode'

pip install failed on termux(Android)

Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/data/data/com.termux/files/usr/tmp/pip-build-mmjqcjfd/MySQL-python/setup.py", line 13, in <module>
        from setup_posix import get_config
      File "/data/data/com.termux/files/usr/tmp/pip-build-mmjqcjfd/MySQL-python/setup_posix.py", line 2, in <module>
        from ConfigParser import SafeConfigParser
    ModuleNotFoundError: No module named 'ConfigParser'

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /data/data/com.termux/files/usr/tmp/pip-build-mmjqcjfd/MySQL-python/

How do I use the login part of the top50 command? Was playlist misspelled as palylist?

1 执行抓取任务遭遇配置异常: HTTPConnectionPool(host='music.163.com', port=80): Max retries exceeded with url: /weapi/login/cellphone (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7ffadedb1dd0>, 'Connection to music.163.com timed out. (connect timeout=10)'))

2 Spider163: error: unrecognized arguments: --palylist=389354428

3 执行抓取任务遭遇配置异常: 'NoneType' object has no attribute '__getitem__' Is this a database problem?

Configuration exception: HTTPConnectionPool(host='music.163.com', port=80): Read timed out.

Running spider163 mp3 --playlist=2236351380 eventually fails with:

执行抓取任务遭遇配置异常: HTTPConnectionPool(host='music.163.com', port=80): Read timed out. (read timeout=10)

curl http://music.163.com:80 -v

* Rebuilt URL to: http://music.163.com:80/
*   Trying 103.65.41.126...
* Connected to music.163.com (103.65.41.126) port 80 (#0)
> GET / HTTP/1.1
> Host: music.163.com
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 302 Found
< Server: nginx
< Date: Mon, 27 Aug 2018 06:53:22 GMT
< Content-Length: 0
< Connection: keep-alive
< Cache-Control: no-store
< Pragrma: no-cache
< Expires: Thu, 01 Jan 1970 00:00:00 GMT
< Cache-Control: no-cache
< Location: https://music.163.com/
< X-Via: MusicEdgeServer
< X-From-Src: 183.205.107.38
<
* Connection #0 to host music.163.com left intact

The network looks fine. How can I fix this?
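
A read timeout can be transient even when a plain `curl` succeeds, so one common mitigation is to retry with exponential backoff. The sketch below is generic and not taken from spider163: `with_retries` is a hypothetical helper, and `action` stands for any callable that performs the request. It catches `OSError`, which also covers the exceptions raised by the `requests` library, since `requests.exceptions.RequestException` derives from `IOError`.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call action(); on a network-style error wait 1x, 2x, 4x... base_delay and retry."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error: Optional[BaseException] = None
    for attempt in range(attempts):
        try:
            return action()
        except OSError as exc:
            last_error = exc
            if attempt < attempts - 1:
                # Double the wait after every failed attempt.
                sleep(base_delay * (2 ** attempt))

    assert last_error is not None
    raise last_error
```

A call might look like `with_retries(lambda: requests.get(url, timeout=10))`, assuming `requests` is installed. The `sleep` parameter exists only so the delay can be replaced in tests.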

Configuration exception while running the crawl task

执行抓取任务遭遇配置异常:a bytes-like object is required, not 'str'

It used to work, but when I used it recently it suddenly stopped working. Why is that?
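
The message quoted above is the standard Python 3 `TypeError` raised when a `str` is passed where bytes are expected. The report does not show which call in spider163 triggers it, so the sketch below only reproduces the error with `base64.b64encode` and shows the usual fix of encoding the text first. `to_bytes` is a hypothetical helper.

```python
import base64
from typing import Union


def to_bytes(value: Union[str, bytes], encoding: str = "utf-8") -> bytes:
    """Encode text to bytes; pass bytes through unchanged."""
    return value.encode(encoding) if isinstance(value, str) else value


try:
    # On Python 3 this raises: a bytes-like object is required, not 'str'
    base64.b64encode("spider163")
except TypeError as exc:
    print("Python 3 rejects str here:", exc)

# Encoding the text first satisfies the bytes-only API.
encoded = base64.b64encode(to_bytes("spider163"))
print(encoded.decode("ascii"))
```

Code written for Python 2, where `str` was already a byte string, commonly hits this error when run under Python 3.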

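Anti-crawling question

Hello,
I previously wrote a similar crawler for 163 music comments, but when I ran it on a server my IP was banned after a short while. I then switched to proxies, but they were too slow.

I read your code but could not find where anti-crawling countermeasures are handled. How do you deal with the anti-crawling problem?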

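Many dependencies, but no freeze file

I suggest adding a freeze file so that the dependencies are resolved automatically.
Otherwise a large number of libraries have to be installed by hand, and that causes problems of its own.
Take the Crypto library, for example:
pip install Crypto is wrong;
pip install pycrypto is the correct one.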

spider163 mp3 --playlist fails to download MP3s

The output is as follows:
[root@mmmm spider163]# spider163 mp3 --playlist 409933862
正在下载歌曲 爱情废柴-周杰伦.mp3
执行抓取任务遭遇配置异常: Invalid URL 'None': No schema supplied. Perhaps you meant http://None?
I googled it but could not figure out how to fix it.
What is the cause?

About fetching comments

Can only the first 10 comments be fetched? Is it not possible to get all of them?

Problem initializing the database

image
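Hello, at first it was a MySQL login problem, which I have solved, but now this problem appears and I cannot resolve it no matter what I try. I hope you can take a look and tell me how to fix it. Thanks~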
