Comments (9)
root 1311 0.1 2.8 188344 57460 pts/8 S 13:04 0:00 /usr/bin/python3 /usr/local/bin/celery beat -A tasks.workers -l info
root 1379 0.0 0.0 14220 1084 pts/8 S+ 13:12 0:00 grep --color=auto celery
[1]- Exit 1 nohup celery -A tasks.workers -Q login_queue,user_crawler,fans_followers,search_crawler,home_crawler worker -l info -c 1
Why did this command exit?
from weibospider.
root@iZ2zeftexcphcu8dj9if0mZ:/home/admin/project/weibospider# jobs -l
[2]- 1311 Running nohup celery beat -A tasks.workers -l info &
[3]+ 1398 Running nohup celery -A tasks.workers -Q login_queue,user_crawler,fans_followers,search_crawler,home_crawler worker -l info -c 1 &
This is what it looks like when things are normal. When the crawl goes wrong, the only thing left in jobs is nohup celery -A tasks.workers -Q.
Why is that?
from weibospider.
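1. Check whether you can still run a search with your account. Weibo's account bans are complicated, and it may block only one feature of your account.
2. If you restart the celery worker, it will pick up the tasks left unfinished from the previous run. If you want it to start from the current moment with the tasks you specify, you need to clear redis db5 and db6.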
from weibospider.
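I tried it and the account is not banned. When enable changed from 1 to 0 earlier, does that mean I was banned for a few hours? Now that I have updated the account back to enabled, this command keeps exiting during crawling and the tasks stop: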
celery -A tasks.workers -Q login_queue,user_crawler,fans_followers,search_crawler,home_crawler worker -l info -c 1
from weibospider.
1. Print response.status_code and response.text in basic.py and check whether the response is normal.
2. Flush redis db1 (cookies), db5 and db6, then start the worker and the related task scheduler again.
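The flush in step 2 could be sketched roughly as follows with redis-py. The database numbers 1, 5 and 6 come from the reply above; the helper name `flush_dbs`, the `localhost:6379` connection and the absence of a password are assumptions for illustration, so adjust them to match your own spider config.

```python
from typing import Callable, Iterable

# Database numbers named in the reply: db1 holds cookies, db5 and db6 are the
# ones to clear so that leftover tasks are not resumed.
DBS_TO_CLEAR = (1, 5, 6)


def flush_dbs(make_client: Callable[[int], object],
              dbs: Iterable[int] = DBS_TO_CLEAR) -> list:
    """Run FLUSHDB on each listed database and return the numbers cleared."""
    cleared = []
    for db in dbs:
        client = make_client(db)
        client.flushdb()  # removes every key in this one database only
        cleared.append(db)
    return cleared


if __name__ == "__main__":
    import redis  # redis-py; imported here so the helper can be tested without it

    def connect(db: int):
        # Assumed connection details; add password=... if your redis requires auth.
        return redis.Redis(host="localhost", port=6379, db=db)

    print("cleared:", flush_dbs(connect))
```

Stop the worker and beat before running it, since the flush deletes the stored cookies as well, and start them again afterwards.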
from weibospider.
2018-03-28 14:11:50 - crawler - INFO - the crawling url is http://weibo.com/p/1005052244164900/follow?page=1#Pl_Official_HisRelation__60
[2018-03-28 14:11:50,662: INFO/ForkPoolWorker-1] the crawling url is http://weibo.com/p/1005052244164900/follow?page=1#Pl_Official_HisRelation__60
2018-03-28 14:11:50 - crawler - ERROR - failed to crawl http://weibo.com/p/1005052244164900/follow?page=1#Pl_Official_HisRelation__60,here are details:MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error., stack is File "/home/admin/project/weibospider/decorators/decorator.py", line 14, in time_limit
return func(*args, **kargs)
[2018-03-28 14:11:50,664: ERROR/ForkPoolWorker-1] failed to crawl http://weibo.com/p/1005052244164900/follow?page=1#Pl_Official_HisRelation__60,here are details:MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error., stack is File "/home/admin/project/weibospider/decorators/decorator.py", line 14, in time_limit
return func(*args, **kargs)
[2018-03-28 14:11:50,667: WARNING/ForkPoolWorker-1] /root/anaconda3/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1062, "Duplicate entry '6490414635' for key 'uid'")
result = self._query(query)
[2018-03-28 14:11:50,667: WARNING/ForkPoolWorker-1] /root/anaconda3/lib/python3.6/site-packages/pymysql/cursors.py:166: Warning: (1062, "Duplicate entry '3764351355' for key 'uid'")
result = self._query(query)
2018-03-28 14:11:50 - crawler - INFO - the crawling url is http://weibo.com/p/1005053105868177/info?mod=pedit_more
[2018-03-28 14:11:50,682: INFO/ForkPoolWorker-1] the crawling url is http://weibo.com/p/1005053105868177/info?mod=pedit_more
2018-03-28 14:11:50 - crawler - ERROR - failed to crawl http://weibo.com/p/1005053105868177/info?mod=pedit_more,here are details:MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error., stack is File "/home/admin/project/weibospider/decorators/decorator.py", line 14, in time_limit
return func(*args, **kargs)
[2018-03-28 14:11:50,683: ERROR/ForkPoolWorker-1] failed to crawl http://weibo.com/p/1005053105868177/info?mod=pedit_more,here are details:MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error., stack is File "/home/admin/project/weibospider/decorators/decorator.py", line 14, in time_limit
return func(*args, **kargs)
[2018-03-28 14:11:50,684: ERROR/ForkPoolWorker-1] list index out of range
[2018-03-28 14:11:50,684: ERROR/ForkPoolWorker-1] list index out of range
[2018-03-28 14:11:50,685: ERROR/ForkPoolWorker-1] list index out of range
[2018-03-28 14:11:51,745: WARNING/MainProcess] Restoring 4 unacknowledged message(s)
The cause seems to be redis, but I'm not familiar with redis, so I don't know how to fix it. The above is the log.
from weibospider.
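I looked into it. The cause seems to be that redis snapshotting was forcibly shut off, which left redis unable to persist to disk. I found some solutions online: setting stop-writes-on-bgsave-error to no avoids this problem.
I'll keep testing to see whether it still stops.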
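A minimal sketch of applying that setting at runtime through redis-py, assuming a local instance without a password. The function name is hypothetical; `config_set`, `config_get` and `info` are regular redis-py client methods. The sketch also reads `rdb_last_bgsave_status`, because this setting only lets writes continue and does not make the failing snapshot succeed.

```python
def allow_writes_despite_bgsave_error(client) -> dict:
    """Set stop-writes-on-bgsave-error to 'no' and report the persistence state."""
    # Runtime change only; it is lost on restart unless it is also put in redis.conf.
    client.config_set("stop-writes-on-bgsave-error", "no")
    setting = client.config_get("stop-writes-on-bgsave-error")
    persistence = client.info("persistence")
    return {
        "stop-writes-on-bgsave-error": setting.get("stop-writes-on-bgsave-error"),
        # "ok" or "err": whether the most recent background save succeeded
        "rdb_last_bgsave_status": persistence.get("rdb_last_bgsave_status"),
    }


if __name__ == "__main__":
    import redis  # redis-py

    # Assumed connection details for a local, password-less redis.
    print(allow_writes_despite_bgsave_error(redis.Redis(host="localhost", port=6379)))
```

If `rdb_last_bgsave_status` still reports `err`, the redis server log should say why the snapshot is failing (disk space, permissions, or memory are common suspects).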
from weibospider.
OK, thanks for the feedback.
from weibospider.
MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.
This problem can also be caused by memory usage being too high. That was the cause on my VPS.
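One rough way to check the memory explanation is to compare how much memory redis uses with what the host has, since a background save forks the server process and the fork can fail when memory is short. The sketch below is only a heuristic under assumptions: the helper name is hypothetical, it relies on the `used_memory` and `total_system_memory` fields of `INFO memory` (older servers may not report the latter), and it does not decide on its own whether a fork will fail.

```python
def memory_pressure(info: dict) -> float:
    """Return used_memory / total_system_memory from an INFO memory mapping."""
    total = info.get("total_system_memory", 0)
    if not total:
        raise ValueError("total_system_memory missing from INFO output")
    return info["used_memory"] / total


if __name__ == "__main__":
    import redis  # redis-py

    # Assumed connection details for a local, password-less redis.
    client = redis.Redis(host="localhost", port=6379)
    ratio = memory_pressure(client.info("memory"))
    print(f"redis uses {ratio:.0%} of system memory")
```

A high ratio on a small VPS is a hint to look in the redis server log for a fork or allocation error around the time of the MISCONF message.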
from weibospider.