Run a function on different nodes of a Slurm cluster for different parameters

Question

How do I call the function inner from outer, such that each call to inner runs on a different node? That is, for ij = 1, it runs on node 1 using all of its 16 cores, for ij = 2, it runs on node 2 using all of its 16 cores, and so on?

using Distributed
addprocs(32)
    
println("Number of processes: ", nprocs())
println("Number of workers: ", nworkers())


@everywhere function inner(a,ij)
   sleep(5);
   println("Inside inner")

   return a*ij;
end

function outer(a,N)
   tt0 = time()
   g(x) = ij -> inner(x, ij);
   arrsum = sum(pmap(g(a), (1:N)));
   tt1 = time()
   println("outer time = $(tt1-tt0)")

   return arrsum
end

println("outer = ",outer(1,5))

I am using this Slurm submission script:

#!/bin/bash

#SBATCH -J m_node
#SBATCH -t 0-04:00:00
#SBATCH --nodes 2
#SBATCH --ntasks-per-node 1
#SBATCH --cpus-per-task=16

srun /home/userdir/julia-1.10.4/bin/julia  /home/userdir/Work/julia_mnode.jl

This does not give me the desired behaviour. Instead, each call to inner uses only 1 core, and the calls are spread across the 2 nodes.
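
One way to check where the workers actually run is to ask each one for its hostname. The following is a minimal sketch, assuming two local workers (`addprocs(2)`) as a stand-in for the real allocation; on the cluster, the worker count and the way workers are launched would differ.

```julia
using Distributed

# Assumption: 2 local workers stand in for the cluster allocation.
addprocs(2)

# Ask every worker for the hostname of the machine it runs on.
hosts = [remotecall_fetch(gethostname, w) for w in workers()]

for (w, h) in zip(workers(), hosts)
    println("worker ", w, " runs on ", h)
end
```

If all workers report the same hostname, they were all started on one machine, no matter how many nodes the job was given.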

Przemyslaw Szufel · Accepted Answer · 2024-08-16 15:55:03Z

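pmap takes a function as its first argument and calls it once for each element of the collection, so the function you pass must accept ij directly.

Hence your code should be: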

function outer(a,N)
   tt0 = time()
   # g captures a and takes ij directly, so pmap evaluates inner(a, ij) for each ij
   g = ij -> inner(a, ij);
   arrsum = sum(pmap(g, 1:N));
   tt1 = time()
   println("outer time = $(tt1-tt0)")
   return arrsum
end
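
As a minimal, self-contained sketch of passing a one-argument closure to `pmap`, something like the following could be run on a single machine. It assumes two local workers added with `addprocs(2)` rather than a Slurm allocation, and a simplified `inner` without the `sleep` call.

```julia
using Distributed

# Assumption: two local workers are enough to illustrate the pattern.
addprocs(2)

# Define inner on every process so the workers can call it.
@everywhere inner(a, ij) = a * ij

function outer(a, N)
    # The closure captures a; pmap calls it once per element of 1:N.
    results = pmap(ij -> inner(a, ij), 1:N)
    return sum(results)
end

total = outer(1, 5)
println("outer = ", total)
```

This only illustrates how `pmap` applies the closure to each element. It does not by itself control which node each call runs on.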
