
I have code that posts an HTTP request sequentially for each tuple in a list and appends the response to another list.

import requests

url = 'https://www....'
input_list = [(1, 2), (6, 4), (7, 8)]
output_list = []

# POST the second element of each tuple and pair the response body
# with the first element
for i in input_list:
    resp = (requests.post(url, data=i[1]).text, i[0])
    output_list.append(resp)

print(output_list)

Can someone please point me in the right direction for making these HTTP requests in parallel?

  • You can do it in two ways: with multiprocessing (all at once) or asynchronously (started one by one, but overlapping while they execute). You first have to decide what the most time-consuming part is. If most of the time is spent waiting after you make the request, I can help you write a solution with async programming (a sketch follows these comments); if the processing in the loop itself takes the most time, I can help you with multiprocessing. But first you have to decide which concurrency approach fits your problem best. Commented Jun 6, 2021 at 6:52
  • @SAK, I'd say there isn't a big wait time, so async programming will be a good fit for my problem. Commented Jun 6, 2021 at 6:56
  • If there is no wait time after sending the request, async programming will actually make it slower. For example, if a request takes 500 ms to get a response, the event loop can run the next iteration while it waits; but if a request completes in around 5 ms, async only adds overhead and slows things down. Commented Jun 6, 2021 at 7:01
  • @SAK, each request is taking ~60 ms. Which approach do you think will be efficient? Commented Jun 6, 2021 at 7:04
  • 60 ms is far too little for multiprocessing and not a great fit for async either, but if you still want to save time, setting up a pool executor is the easiest and most standard way of doing this in Python. I'll post an answer with that solution shortly. Commented Jun 6, 2021 at 7:06
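
For reference, since requests itself can't be awaited, the async route discussed above would need an async-capable HTTP client. A minimal sketch of that approach, assuming aiohttp (not used in the original code) is an acceptable substitute:

import asyncio
import aiohttp

url = 'https://www....'
input_list = [(1, 2), (6, 4), (7, 8)]

async def post_one(session, item):
    # POST the second element of the tuple, pair the response body with the first
    async with session.post(url, data=item[1]) as resp:
        return (await resp.text(), item[0])

async def main():
    async with aiohttp.ClientSession() as session:
        # start all requests at once; gather returns results in input order
        return await asyncio.gather(*(post_one(session, i) for i in input_list))

output_list = asyncio.run(main())
print(output_list)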

1 Answer


Since the requests library doesn't support asyncio natively, I'd use multiprocessing.pool.ThreadPool, assuming most of the time is spent waiting for I/O. Otherwise it might be beneficial to use multiprocessing.Pool (a sketch of that swap follows the code below).

from multiprocessing.pool import ThreadPool
import requests

url = 'https://www....'
input_list = [(1, 2), (6, 4), (7, 8)]

def get_url(i):
    # POST the second element of the tuple, pair the response body with the first
    return (requests.post(url, data=i[1]).text, i[0])

with ThreadPool(10) as pool:  # up to ten requests in flight in parallel
    output_list = list(pool.map(get_url, input_list))
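
The fallback to multiprocessing.Pool mentioned above, for CPU-bound work, is an almost drop-in swap. A minimal sketch, assuming get_url stays at module level so it can be pickled:

from multiprocessing import Pool

if __name__ == '__main__':  # required on platforms that spawn workers rather than fork
    with Pool(10) as pool:  # ten worker processes
        output_list = pool.map(get_url, input_list)

Either pool returns results in the same order as input_list, so the pairing with i[0] is preserved.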

1 Comment

concurrent.futures.ThreadPoolExecutor now defaults to min(32, os.cpu_count() + 4) workers, so you don't need to pick a worker count yourself; moreover, asyncio.to_thread looks much neater.
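
To illustrate this comment: the same job with concurrent.futures.ThreadPoolExecutor, whose default worker count has been min(32, os.cpu_count() + 4) since Python 3.8, reusing get_url and input_list from the answer:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor() as pool:  # default worker count, no tuning needed
    output_list = list(pool.map(get_url, input_list))  # Executor.map returns an iterator

And the asyncio.to_thread variant (Python 3.9+), which runs each call in that same default thread pool:

import asyncio

async def main():
    # each call runs in the default thread pool; gather keeps input order
    return await asyncio.gather(*(asyncio.to_thread(get_url, i) for i in input_list))

output_list = asyncio.run(main())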
