2

I'm trying to wrap my head around asyncio and aiohttp and for the first time in years programming makes me feel utterly stupid and incapable. Which is kind of beautiful, in a weirdo Zen way. But alas, there's work to get done.

I've got an existing class that can do numerous wondrous things on the web, like signing up to a web site, getting data, the works. And now I need like, 100 or 1000 of these little worker bees to sign up. Code looks roughly like this:

class Worker(object):
    def signup(self, ...):
        ...
        data = self.make_request(url, data)
        self.user_id = data.get("user_id")
        return self

    def make_request(self, url, data):
        response = requests.post(url, data=data)
        return response.json()

workers = [Worker().signup() for n in range(100)]

As you can see, we're using the requests module to make a POST request. However this is blocking, so we'll have to wait for worker N to finish signing up before we start signing up worker N+1. Fortunately, the original author of the Worker class (that sounds charmingly Marxist) in her infinite wisdom wrapped every HTTP call in the self.make_request method, so making the whole Worker non blocking should just be a matter of swapping out the requests library for a non-blocking one aaaaand bob's your uncle, right? This is how far I got:

class AyncWorker(Worker):
    @asyncio.coroutine
    def make_request(self, url, data):
        response = yield from aiohttp.request('post', url, data=data)
        return (yield from response.json())

coroutines = [Worker().signup() for n in range(100)]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait(coroutines))
loop.close()

But this will raise an AttributeError: 'generator' object has no attribute 'get' in the signup method where I do self.user_id = data.get("user_id"). And beyond that, I still don't have the workers in a neat dictionary. I'm aware that I'm most likely completely misunderstanding how asyncio works - but I already spent a day reading through various docs, mind-shattering tutorials by David Beazly, and masses of toy examples that are simply enough that I understand them and too simple to apply to this situation. How should I structure my worker and my async loop to sign up 100 workers in parallel and eventually get a list of all workers after they signed up?

1
  • 1
    Once you touch asyncio coroutines, you MUST work with them in coroutine-way. It means that your signup method must be a coroutine that does something like data = yield from self.make_request(url, data). Everything you return in asyncio coroutine must be consumed only with yield from. Commented Oct 28, 2014 at 11:40

1 Answer 1

3

Once you use the yield (or yield from) in a function, this function becomes a coroutine. It means that you can't get a result by just calling it: you will get a generator object. You must at least do this:

@asyncio.coroutine
def some_coroutine(*args):
    #...
    #...
    yield from tasty.asyncio.function()
    return result

def coroutine_user():
    # data = some_coroutine() will give you a generator object instead of result
    data = yield from some_coroutine()
    return data # data here is a plain result: you can call your .get or whatever

Guess what happens when you call coroutine_user():

>>> coroutine_user()
<generator object coroutine_user at 0x7fe13b8a47e0>

Lack of async.coroutine decorator doesn't help at all: coroutines are contagious! To get a result in a function, you must use yield from. It turns your function into another coroutine!

Though things aren't always that bad (usually you can manually iterate a generator object without relying on yield from), asyncio will specifically stop you from doing it: it breaks some internals (you can do it only from Future or asyncio.coroutine). So just use concurrent.futures or something similar unless you're going to turn all your code into coroutines. As some alternative, isolate all users of aiohttp.request from usual methods and work with both coroutine-based async workers and synchronous plain old code. Diving into asyncio and actually refactoring all your code is an option too, obviously: you basically need to put yield from before every call to any infected with asyncio method.

Sign up to request clarification or add additional context in comments.

1 Comment

Just to clarify: you have to use yield from with asyncio coroutines because that's the only way to return control to the event loop and allow the coroutine to actually run. If you were allowed to just iterate over the asyncio.Future, you'd never give control back the event loop, the coroutine would never actually run, and you'd end up trying to access the result of the Future before it's actually available. concurrent.futures.Future gets its concurrency from multiple threads or processes, not non-blocking I/O and an event loop, so it doesn't have this restriction.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.