I have 3 python scripts
- grab.py: Takes as input an Instagram account name and outputs a text file containing all of its followers.
- scrape.py: Takes as input the output from grab.py and outputs details of each account (follower count, post count etc) in csv form
- analyze.py: A basic machine learning model that uses results of scrape.py to perform an analysis on the accounts.
The 3 scripts work as expected individually. The next step is to create an API endpoint which will take an account name as a request parameter, and then trigger the above 3 scripts for the received account. The final analysis results will be stored in a database.
The endpoint also needs to have a queueing mechanism to store account names received. The queue will be polled, and if account names are available, they will be processed sequentially.
My API development experience is limited so I am not sure of the best approach to tackle this problem. My questions are:
- Should my API endpoint be written in Python? If yes, is the Flask framework a viable option? If not, what are the other options I have?
- Is there some sort of a pipeline that I can use to seamlessly integrate the 3 scripts together?
- Is the idea of maintaining the queue in memory and polling it using a separate thread running an infinite while loop a good idea? Is there a better way to accomplish this?