I'm trying to request some data from an API using Python.

Here's the code I am using:

import concurrent.futures
import requests

def my_function(url):
    response = requests.get(url)
    with open("data.csv", "a+") as f:
        f.write(response.text + '\n')

with concurrent.futures.ThreadPoolExecutor() as executer:
    for i in range(9120000000, 9130000000):
        try:
            url = f"https://mysite.test/api/v1/{str(i)}"
            executer.submit(my_function, url)
        except Exception:
            print("There was an error!")

Now here are my questions:

1- Is the code I'm currently using going to open 10,000,000 threads at the same time? Is that even possible?
2- If so, is this considered a DDoS attack on that server?
3- Is there a way to split that range into smaller ranges, let's say, ranges of 200?
4- Is there a better way to do this?

This script is going to be deployed on Heroku servers. According to the Heroku documentation, a free dyno supports no more than 256 threads.

Any help and/or suggestions would be greatly appreciated!

  • 1) Take a look at max_workers 2) Probably would be unwise to hit a site 10,000,000 times at once, especially from a single IP Address 3) Same as 1 4) Same as 1 Commented Jan 11, 2021 at 20:03
  • @goalie1998 You're right, it's unwise! I haven't done it yet, that's why I'm asking. By the way, the script is going to change IP addresses; I didn't include that part here for simplicity's sake. It seems I can define the number of workers. Thanks for your help, I appreciate it! Commented Jan 11, 2021 at 20:13

1 Answer

1- Is the code I'm currently using going to open 10,000,000 threads at the same time? Is that even possible?

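No. The pool only ever runs as many threads as the max_workers parameter allows. Since you don't pass it when you create your ThreadPoolExecutor, it takes the default value, which depends on your version of Python: for example, it is min(32, os.cpu_count() + 4) in Python 3.8 and os.cpu_count() * 5 in Python 3.5 through 3.7. You can read more about it here: ThreadPoolExecutor docs. In short, it will be far fewer than 10,000,000 concurrent threads. Note, though, that the loop still calls submit() 10,000,000 times, and tasks that are not yet running wait in the executor's internal queue, so memory use grows with the backlog.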
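To make the max_workers point concrete, here is a minimal sketch that submits many tasks to a small pool and counts how many distinct threads actually ran them. The helper names `record_thread` and `count_threads_used` are made up for this illustration, and the short sleep only stands in for a slow network call.

```python
import concurrent.futures
import threading
import time

def record_thread(_):
    # Stand-in for a slow network request (hypothetical workload).
    time.sleep(0.01)
    return threading.current_thread().name

def count_threads_used(n_tasks, max_workers):
    # Submit n_tasks jobs, then count the distinct worker threads that ran them.
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        names = set(executor.map(record_thread, range(n_tasks)))
    return len(names)

if __name__ == "__main__":
    # 200 tasks, but never more than 8 threads alive in the pool.
    print(count_threads_used(200, 8))
```

However many tasks are submitted, the count never exceeds max_workers; the remaining tasks simply wait their turn.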

2- If so, is this considered a DDoS attack on that server?

It depends on the server. If the server can handle the number of concurrent requests that the ThreadPoolExecutor will send, then no.

3- Is there a way to split that range into smaller ranges, let's say, ranges of 200?

You could add time.sleep(seconds_to_sleep) to your script to slow it down if you're worried about overloading the server. Called inside the loop that submits the work, it pauses the script between submissions.
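One way to combine the 200-item ranges from the question with the sleep suggested here is sketched below. This is only an illustration under assumptions: `batched_ranges` and `run_in_batches` are hypothetical helpers, `fetch` stands for whatever function performs a single request, and the default batch size, pause and worker count are arbitrary.

```python
import concurrent.futures
import time

def batched_ranges(start, stop, size):
    # Yield consecutive sub-ranges of at most `size` items covering [start, stop).
    for lo in range(start, stop, size):
        yield range(lo, min(lo + size, stop))

def run_in_batches(fetch, start, stop, size=200, pause=1.0, max_workers=16):
    # Process one batch at a time, sleeping between batches to limit the load.
    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        for batch in batched_ranges(start, stop, size):
            # map() yields results in input order; extend() waits for the whole batch.
            results.extend(executor.map(fetch, batch))
            time.sleep(pause)
    return results

if __name__ == "__main__":
    # Dummy fetch so the sketch runs without touching any server.
    print(run_in_batches(lambda i: i * 2, 0, 5, size=2, pause=0))
```

Because each batch finishes before the next one is submitted, at most `size` tasks are ever queued, which also avoids building a backlog of millions of pending tasks. For a range as large as the one in the question, it would probably make sense to write each batch to disk instead of accumulating everything in `results`.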

4- Is there a better way to do this?

It's hard to say given the limited context of what this is trying to accomplish. It may be better if the API allowed batch requests, where you specify a range of numbers and get back the information for that whole range. That would be more efficient, but it assumes you have the power to change the API you're calling.
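Independently of the API, one possible improvement to the client code (a sketch, not something the answer above prescribes) is to keep the futures and read their results in the main thread. An exception raised inside a worker only surfaces when future.result() is called, so a try/except around submit() will not see it. The helper `fetch_all` and its `fetch` argument are hypothetical names; in practice `fetch` might wrap requests.get(url).text.

```python
import concurrent.futures

def fetch_all(fetch, ids, max_workers=16):
    # Run fetch(i) for every id and return (results, errors), both keyed by id.
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(fetch, i): i for i in ids}
        for future in concurrent.futures.as_completed(futures):
            i = futures[future]
            try:
                # result() re-raises any exception raised inside the worker.
                results[i] = future.result()
            except Exception as exc:
                errors[i] = exc
    return results, errors

if __name__ == "__main__":
    def flaky(i):
        # Dummy workload that fails for one id, to show the error path.
        if i == 3:
            raise ValueError("boom")
        return i * 10

    results, errors = fetch_all(flaky, range(5))
    print(results, errors)
```

With this shape the main thread can also be the only one that writes to data.csv, so worker threads never append to the same file concurrently.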

3 Comments

Thanks for the help. I actually didn't know I could sleep between threads!
Even if the server has fewer threads than the "attacking" script, it is still not a DDoS attack, simply because the first D in the name stands for Distributed, which means coming from many places. The correct term would be a DoS attack, since only the OP is running the script.
@Countour-Integral Well, this script isn't written to "attack"! That's why I'm asking before running it. You're right, I should have used the correct terminology; yes, it would be a DoS attack.