
I am trying to understand how a Coarray Fortran DLL can be called from Python. Consider the following sample Fortran module file, example_mod.f90, which is to be called from Python later:

module example_mod
    use iso_c_binding
    implicit none
#ifdef COARRAY_ENABLED
    integer :: co_int[*]
#endif
    interface
    module subroutine sqr_2d_arr(nd, val, comm) BIND(C, NAME='sqr_2d_arr')
        !DEC$ ATTRIBUTES DLLEXPORT :: sqr_2d_arr
        integer, intent(in)     :: nd
        integer, intent(inout)  :: val(nd, nd), comm
    end subroutine sqr_2d_arr
    end interface
contains
end module example_mod

The subroutine's implementation is given in the submodule file example_mod@sub_smod.f90:

submodule (example_mod) sub_smod
    implicit none
contains
    module procedure sqr_2d_arr

        use mpi
        integer :: rank, size, ierr

        integer :: i, j

        call MPI_Comm_size(comm, size, ierr)
        call MPI_Comm_rank(comm, rank, ierr)
        write(*,"(*(g0,:,' '))") "Hello from Fortran MPI! I am process", rank, "of", size, ', comm:', comm

        write(*,"(*(g0,:,' '))") "Hello from Fortran COARRAY! I am image ", this_image(), " out of", num_images(), "images."
        sync all

        do j = 1, nd
            do i = 1, nd
                val(i, j) = (val(i, j) + val(j, i)) ** 2
            enddo
        enddo

    end procedure sqr_2d_arr
end submodule sub_smod

The subroutine also contains calls to the MPI library for the sake of comparison with Coarray. I compile the code with mpiifort using the following flags:

mpiifort /Qcoarray=distributed /Od /debug:full /fpp -c example_mod.f90
mpiifort /Qcoarray=distributed /Od /debug:full /fpp -c example_mod@sub_smod.f90
mpiifort /Qcoarray=distributed /Od /debug:full /fpp /dll /libs:dll /threads example_mod.obj example_mod@sub_smod.obj

Now, I have the following Python 2 script, which calls the DLL generated above:

#!/usr/bin/env python

from __future__ import print_function
from mpi4py import MPI


comm = MPI.COMM_WORLD
fcomm = MPI.COMM_WORLD.py2f()
print("Hello from Python! I'm rank %d from %d running in total..." % (comm.rank, comm.size))

comm.Barrier()   # wait for everybody to synchronize _here_

######################

import ctypes as ct
import numpy as np

# import the dll
fortlib = ct.CDLL('example_mod.dll')

# set up the data
N = 2
nd = ct.pointer( ct.c_int(N) )          # pointer to N (Fortran takes arguments by reference)
pyarr = np.arange(0, N, dtype=int) * 5  # first column, N elements long
for i in range(1, N):                   # concatenate columns until it is N x N
    pyarr = np.c_[pyarr, np.arange(0, N, dtype=int) * 5]

# call the function by passing the ctypes pointer using the numpy function:
fcomm_pt = ct.pointer( ct.c_int(fcomm) )
_ = fortlib.sqr_2d_arr(nd, np.ctypeslib.as_ctypes(pyarr),fcomm_pt)

print(pyarr)
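
To make the calling convention in the script easier to check without the DLL, here is a minimal sketch that replaces the Fortran routine with a pure-Python ctypes callback. The names SQR_PROTO, _sqr_2d_arr_standin and call_sqr are hypothetical and exist only for this illustration. The stand-in does no MPI or coarray work; it only mimics how the three by-reference integer arguments are received and how the column-major loop walks a row-major buffer.

```python
import ctypes as ct

# Prototype mirroring the BIND(C) interface: no dummy argument has VALUE,
# so each one arrives as a pointer to a C int.
SQR_PROTO = ct.CFUNCTYPE(None,
                         ct.POINTER(ct.c_int),   # nd
                         ct.POINTER(ct.c_int),   # val(nd, nd)
                         ct.POINTER(ct.c_int))   # comm


def _sqr_2d_arr_standin(nd_ptr, val_ptr, comm_ptr):
    """Hypothetical stand-in for the Fortran routine (assumption: same loop, no MPI)."""
    n = nd_ptr[0]
    # Fortran val(i, j) is column-major: 0-based offset is i + j * n.
    for j in range(n):
        for i in range(n):
            val_ptr[i + j * n] = (val_ptr[i + j * n] + val_ptr[j + i * n]) ** 2


# Callable with the same argument layout as fortlib.sqr_2d_arr in the script.
sqr_2d_arr = SQR_PROTO(_sqr_2d_arr_standin)


def call_sqr(rows, fcomm=0):
    """Flatten a row-major nested list, call the stand-in, and rebuild the rows."""
    n = len(rows)
    flat = [x for row in rows for x in row]      # row-major, like a C-ordered array
    buf = (ct.c_int * (n * n))(*flat)
    sqr_2d_arr(ct.byref(ct.c_int(n)), buf, ct.byref(ct.c_int(fcomm)))
    return [list(buf[r * n:(r + 1) * n]) for r in range(n)]


print(call_sqr([[0, 0], [5, 5]]))
```

For N = 2 the script builds the array [[0, 0], [5, 5]], so the call above passes the same input. Because the update happens in place while the loop reads val(j, i), the result is not symmetric.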

Running this script with the following command:

mpiexec -np 4 python main.py

yields this output:

Hello from Fortran MPI! I am process 1 of 4 , comm: 1140850688
Hello from Fortran MPI! I am process 3 of 4 , comm: 1140850688
Hello from Fortran COARRAY! I am image  1  out of 0 images.
Hello from Fortran MPI! I am process 0 of 4 , comm: 1140850688
Hello from Fortran COARRAY! I am image  1  out of 0 images.
Hello from Fortran MPI! I am process 2 of 4 , comm: 1140850688
Hello from Fortran COARRAY! I am image  1  out of 0 images.
Hello from Fortran COARRAY! I am image  1  out of 0 images.
Hello from Python! I'm rank 3 from 4 running in total...
[[  0  25]
 [900 100]]
Hello from Python! I'm rank 0 from 4 running in total...
[[  0  25]
 [900 100]]
Hello from Python! I'm rank 1 from 4 running in total...
[[  0  25]
 [900 100]]
Hello from Python! I'm rank 2 from 4 running in total...
[[  0  25]
 [900 100]]
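
As a quick sanity check on the comm value printed above, the following sketch decodes the integer handle. It assumes an MPICH-derived MPI (Intel MPI is one), where MPI_COMM_WORLD is, as far as I know, defined as the handle 0x44000000; the constant name ASSUMED_MPICH_COMM_WORLD and the helper describe_handle are hypothetical.

```python
# Assumption: MPICH-derived implementations define MPI_COMM_WORLD as 0x44000000.
ASSUMED_MPICH_COMM_WORLD = 0x44000000


def describe_handle(handle):
    """Format an integer communicator handle as 8-digit hexadecimal."""
    return "0x%08X" % handle


printed_comm = 1140850688          # value printed by the Fortran side
print(describe_handle(printed_comm))
print(printed_comm == ASSUMED_MPICH_COMM_WORLD)
```

If the assumption holds, the handle produced by py2f() reached Fortran unchanged, which is consistent with the correct MPI ranks in the output.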

The computation performed by this code is not relevant to the discussion here. However, I cannot understand why the MPI ranks are output correctly, while the Coarray num_images() is zero on every process and this_image() is 1 on every process. As a broader question, what is the best strategy for writing a Coarray Fortran application that can be called from other languages such as Python?

  • I understand that you get an incorrect result when using multiprocessing. I have heard that sharing a DLL between processes does not work very well with Python multiprocessing. A workaround is to create a copy of the DLL for each process. Commented Feb 16, 2019 at 8:45
