
I would like to know if there is any way to write a custom script that uses Django imports and runs in parallel from a Python shell. For example, imagine I could write this in the shell:

@make_it_parallel
def my_custom_task():
    from mydjangomodule.models import myclass
    # do something

for user_range in range(0, 1000, 10):
    my_custom_task.delay(user_range)

The main question is "how to easily add parallelism to a script" instead of having to push something to production or set up a complete set of tools for a script that only needs to run once.

P.S.: If there is another tool besides IPython/Celery that can do the job, I would be interested to hear about it.

  • No, this is in a Django shell, run with ./manage.py shell Commented Dec 10, 2015 at 23:11
  • Do you need to run it from within your application? If not, can't you just create a custom Django command? Commented Dec 11, 2015 at 6:42
  • Please have a look at docs.djangoproject.com/en/1.9/howto/custom-management-commands . At least all Django modules and your own models will be available in these commands. Commented Dec 11, 2015 at 6:58
  • This is what I want to avoid: having to push to production. I want to run a custom script in the shell, but be able to run it in parallel with ease Commented Dec 11, 2015 at 14:49

1 Answer


I ended up parallelizing Django scripts with IPython's ipyparallel package. Here is how to do it!

First, install ipyparallel: pip install ipyparallel -U

In the default IPython profile (or any profile of your preference), we need to add these startup lines to load Django:

import os
# Tell Django which settings module to use before initializing it
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'MyProject.settings')
import django
django.setup()

Save this to a path like: ~/.ipython/profile_default/startup/00-load-django.py

This will load Django on each engine you start, so IPython can parallelize your functions.

Now, let's start the engines that will parallelize your Django scripts coded on the fly. Make sure you are in the main folder of your Django project (where the manage.py file is), then run: ipcluster start -n X, where X is the number of engines you want (IMHO, the number of cores on the current computer + 1).

Please let the ipcluster become fully operational before entering IPython.
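If you would rather have the script wait than eyeball the cluster yourself, you can poll the client's list of registered engines (`Client.ids` in ipyparallel). This is a sketch; `wait_for_engines` is a hypothetical helper of my own, not part of ipyparallel:

```python
import time


def wait_for_engines(client, n, timeout=60, poll=1.0):
    """Block until `client` reports at least n registered engines."""
    deadline = time.time() + timeout
    while len(client.ids) < n:
        if time.time() > deadline:
            raise TimeoutError("engines did not register in time")
        time.sleep(poll)


# With a real cluster you would call, for example:
#   import ipyparallel as ipp
#   rc = ipp.Client()
#   wait_for_engines(rc, 4)
```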

Now, let's parallelize that Django script. In IPython:

import ipyparallel as ipp

rc = ipp.Client()  # connect to the running ipcluster engines
lview = rc.load_balanced_view()  # hand each task to whichever engine is free

@lview.parallel()
def show_polls(user_range):
    # import inside the function so it resolves on the remote engine
    from poll.models import Poll
    return list(Poll.objects.filter(user_id__gte=user_range, user_id__lt=user_range + 100))

for res in show_polls.map(range(0, 1000, 100)):
    print(res)

And there we go: a Django script, parallelized! Notice that I convert the QuerySet into a list; that's because anything returned needs to be picklable.
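To see why the list() conversion matters: results travel from the engines back to your shell via pickle. Here is a minimal, Django-free illustration of what round-trips cleanly and what doesn't (the sample values are made up for the example):

```python
import pickle

# A plain list of simple values round-trips through pickle fine,
# which is why returning list(queryset) from an engine works:
rows = [(1, "first poll"), (2, "second poll")]
assert pickle.loads(pickle.dumps(rows)) == rows

# Lazily-evaluated objects such as generators cannot be pickled;
# returning one from an engine would fail:
try:
    pickle.dumps(x for x in range(3))
except TypeError as exc:
    print("not picklable:", exc)
```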
