
I am new to Python and trying to learn the asyncio module. I am frustrated by getting return values from async tasks. There is a post here that talks about this topic, but it can't tell which value is returned by which task (assuming one web page responds faster than another).
The code below tries to fetch three web pages concurrently instead of one by one.

    import asyncio
    import aiohttp

    async def fetch(url):
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as resp:
                assert resp.status == 200
                return await resp.text()

    def compile_all(urls):
        tasks = []
        for url in urls:
            tasks.append(asyncio.ensure_future(fetch(url)))
        return tasks

    urls = ['https://python.org', 'https://google.com', 'https://amazon.com']
    tasks = compile_all(urls)
    loop = asyncio.get_event_loop()
    a, b, c = loop.run_until_complete(asyncio.gather(*tasks))
    loop.close()

    print(a)
    print(b)
    print(c)

First, it hits a RuntimeError, although it did print some HTML documents: RuntimeError: Event loop is closed.

Second, does this really guarantee that a, b, c will correspond to urls[0], urls[1], urls[2] respectively? (I assume that async task execution order does not guarantee that.)

Third, is there a better approach, or should I use a Queue in this case? If yes, how?

Any help will be greatly appreciated.

  • Using await will pause the execution of the function until it returns something, but will not block other async functions from running. If you are new to Python, I would not recommend starting with asyncio for concurrent work, as it is really complex and best suited to more experienced developers. Commented Mar 19, 2021 at 15:35
  • Also, your question is weirdly asked. What do you mean by "Python web page"? Commented Mar 19, 2021 at 15:36
  • I have to use asyncio because it is required for my job, and that is why I need to learn it. I have reworded my question, sorry about that. Commented Mar 19, 2021 at 15:45
  • I understand. But what do you mean "Python webpage"? Commented Mar 19, 2021 at 15:45
  • I cannot reproduce the RuntimeError you observe. And yes, the results will correspond to the sequence of URLs, regardless of the order in which they complete. Commented Mar 19, 2021 at 21:01

1 Answer


The order of the results will correspond to the order of the urls. Take a look at the docs for asyncio.gather:

If all awaitables are completed successfully, the result is an aggregate list of returned values. The order of result values corresponds to the order of awaitables in aws.
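As a small illustration of that guarantee, here is a self-contained sketch that uses asyncio.sleep to simulate responses of different speeds:

    import asyncio

    async def delayed(value, delay):
        # Simulate a request that takes `delay` seconds
        await asyncio.sleep(delay)
        return value

    async def main():
        # 'a' finishes last, but is still first in the result list,
        # because gather preserves the order of the awaitables passed in
        results = await asyncio.gather(
            delayed('a', 0.3), delayed('b', 0.2), delayed('c', 0.1))
        print(results)  # ['a', 'b', 'c']

    asyncio.run(main())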

To process tasks as they complete, you can use asyncio.as_completed. This post has more information on how it can be used.
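For example, here is a rough sketch of that approach. It adapts the fetch coroutine from the question to return the URL alongside the body (since as_completed yields results in completion order rather than input order, the result itself has to say where it came from) and shares one ClientSession across requests:

    import asyncio
    import aiohttp

    async def fetch(session, url):
        # Return the URL together with the body so each result
        # identifies which site it came from
        async with session.get(url) as resp:
            resp.raise_for_status()
            return url, await resp.text()

    async def main(urls):
        async with aiohttp.ClientSession() as session:
            coros = [fetch(session, url) for url in urls]
            # as_completed yields awaitables in the order they finish,
            # so fast sites are handled without waiting for slow ones
            for fut in asyncio.as_completed(coros):
                url, html = await fut
                print(url, len(html))

    urls = ['https://python.org', 'https://google.com', 'https://amazon.com']
    asyncio.run(main(urls))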


Comments

Thank you for the answer. This confirms that gather() only returns a list of results after all tasks are completed... However, that goes against my intent of using asyncio to fetch web pages (or do I/O tasks) concurrently. My intent is to avoid being blocked by any slow-responding web site. In the end, it is the same as fetching them one by one... Is there a better way?
@Spark You could set a timeout in the aiohttp ClientSession to abort slow web sites (see the sketch after these comments).
Thank you. Setting a timeout means I have to give up on slow web sites even though they may only be occasionally slow. My thought was that the event loop could return the result from whichever task completes first, via a mechanism like a first-in-first-out buffer or Queue, so that asynchronous execution of multiple tasks really becomes a pipeline spitting out results...
@Spark It's not the same. If you're fetching three sites which take 1, 2 and 3 seconds respectively, fetching them one by one would take six seconds. Fetching them with gather() takes just 3 seconds, so it's clearly better. If you want to react to the first result as soon as it arrives, you can use asyncio.as_completed.
@user4815162342 Thank you so much, as_completed works exactly as I intended when reaching for asyncio, even though it yields awaitables rather than plain results. I got three HTML documents, with Google coming back first (as expected)... About the RuntimeError, I can't reproduce it every time either, but I found this explained. BTW, I am using Python 3.8.
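A rough sketch of the timeout suggestion above, for reference. aiohttp accepts a ClientTimeout on the session; the 10-second total used here is only an illustrative value:

    import asyncio
    import aiohttp

    async def fetch(session, url):
        try:
            async with session.get(url) as resp:
                resp.raise_for_status()
                return url, await resp.text()
        except asyncio.TimeoutError:
            # Give up on a site that exceeds the session-wide timeout
            return url, None

    async def main(urls):
        timeout = aiohttp.ClientTimeout(total=10)  # illustrative value
        async with aiohttp.ClientSession(timeout=timeout) as session:
            return await asyncio.gather(*(fetch(session, url) for url in urls))

    urls = ['https://python.org', 'https://google.com', 'https://amazon.com']
    results = asyncio.run(main(urls))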
