execution of asyncio.ensure_future(coro()) ignored on close_spider() pipelines call (scrapy issue, 4 comments, closed)

abebus commented on June 15, 2024

execution of asyncio.ensure_future(coro()) ignored on close_spider() pipelines call

Comments (4)

abebus commented on June 15, 2024

Thanks, I didn't know the crawler can automatically await async functions connected via signals. The following code works as expected:

import asyncio
import logging

import aiohttp
from scrapy import signals


class AsynctestPipeline:
    async def ainit(self):
        # Connected to spider_opened; the crawler awaits this coroutine.
        logging.critical('async init')
        self.client = aiohttp.ClientSession()
        self.something = await self.client.get('https://scrapy.org/')
        logging.critical('async initialised')

    async def adel(self):
        # Connected to spider_closed; closes the session before shutdown.
        logging.critical('async closing resources')
        await self.client.close()
        logging.critical('async resources closed')

    @classmethod
    def from_crawler(cls, crawler):
        p = cls()
        # Use signals for async setup/teardown instead of open_spider()/close_spider().
        crawler.signals.connect(p.ainit, signal=signals.spider_opened)
        crawler.signals.connect(p.adel, signal=signals.spider_closed)
        return p

    async def process_item(self, item, spider):
        logging.critical('executing async task')
        await asyncio.sleep(10)
        logging.critical('async task done')
        logging.critical(self.something)
        return item
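
For anyone trying the pipeline above: asyncio-based libraries such as aiohttp generally need Scrapy to run on the asyncio reactor. A minimal settings.py sketch follows. The module path `myproject.pipelines` and the priority `300` are hypothetical placeholders, and newer Scrapy versions may already select this reactor by default.

```python
# settings.py (sketch; adjust names to your own project)

# Run Twisted on top of the asyncio event loop so that asyncio libraries work.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Enable the pipeline. "myproject.pipelines" is a hypothetical module path.
ITEM_PIPELINES = {
    "myproject.pipelines.AsynctestPipeline": 300,
}
```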

Gallaecio commented on June 15, 2024

Only process_item has coroutine support. Try using the spider_closed signal instead.

abebus commented on June 15, 2024

But why does open_spider work, then?

Gallaecio commented on June 15, 2024

My guess: both work, but the spider closes before the future scheduled in close_spider gets executed. Do you see any messages in the standard output about unawaited futures? If you add a long sleep in a spider_closed signal handler, the close_spider one might work. In any case, this is not intended to work in open_spider either, so I strongly suggest not doing it there; there is also a spider_opened signal.
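
The guess above can be illustrated with plain asyncio, without Scrapy. This is only a sketch of the general effect, assuming a stand-in `close_resources` coroutine and two hypothetical shutdown functions. Scrapy's actual reactor shutdown differs in detail, but the principle is the same: a coroutine scheduled with `asyncio.ensure_future()` and never awaited may not finish before the loop shuts down.

```python
import asyncio


async def close_resources(log):
    # Stand-in for real cleanup such as `await session.close()`.
    log.append("closing")
    await asyncio.sleep(0.1)
    log.append("closed")


async def fire_and_forget_shutdown(log):
    # Mimics a synchronous close_spider(): the coroutine is scheduled,
    # but nothing waits for it before shutdown continues.
    asyncio.ensure_future(close_resources(log))


async def awaited_shutdown(log):
    # Mimics a coroutine signal handler that the caller awaits.
    await close_resources(log)


def run(shutdown):
    log = []
    # asyncio.run() cancels tasks still pending when the main coroutine returns.
    asyncio.run(shutdown(log))
    return log


if __name__ == "__main__":
    print(run(fire_and_forget_shutdown))  # "closed" is never logged
    print(run(awaited_shutdown))          # cleanup runs to completion
```

In the fire-and-forget case the cleanup is cancelled partway through, which is why connecting an async handler to spider_closed is the safer choice.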
