
I like to do this:

program main
  implicit none
  integer l
  integer, allocatable, dimension(:) :: array

  allocate(array(10))

  array = 0

  !$omp parallel do private(array)
  do l = 1, 10
    array(l) = l
  enddo
  !$omp end parallel do

  print *, array

  deallocate(array)

end

But I am running into error messages:

*** glibc detected *** ./a.out: munmap_chunk(): invalid pointer: 0x00007fff25d05a40 ***

This seems to be a bug in ifort according to some discussions on the Intel forums, but it should be resolved in the version I am using (11.1.073, Linux). This is a massively downscaled version of my code! Unfortunately, I cannot use static arrays as a workaround.

If I put the print inside the loop, I get a different error:

*** glibc detected *** ./a.out: double free or corruption (out): 0x00002b22a0c016f0 ***

  • Your code snippet is too incomplete and inconsistent with your description (only j and l are private according to the code snippet) to be useful. Chop your actual code down further to a small example that still exhibits the problem, and then show that, ideally in its entirety. While you're at it, make sure your threads have sufficient stack space and pop a DEFAULT(NONE) clause onto the OMP directive (see the sketch after these comments). Along the way, chances are you'll figure out what's wrong... Commented Feb 19, 2013 at 11:41
  • @IanH I reformulated the question with better code. I also found a hint as to where the error may lie. Commented Feb 19, 2013 at 13:48
  • What version of ifort are you using? Your code runs here with 13.1.0. Commented Feb 19, 2013 at 19:52
  • @IanH We're on 11.1.073. Commented Feb 20, 2013 at 7:41
  • Does that version support OpenMP 3.0 (necessary for allocatables)? Regardless, might be time for an update. Commented Feb 20, 2013 at 8:07
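A minimal sketch of the stack-space and DEFAULT(NONE) suggestions from the comments above, applied to the code in the question. OMP_STACKSIZE is the standard OpenMP environment variable for the worker threads' stack size, KMP_STACKSIZE is the ifort-specific equivalent, and 512M is just a placeholder value:

  ! default(none) forces every variable to be scoped explicitly,
  ! so the compiler complains about anything left unclassified
  !$omp parallel do default(none) private(l) shared(array)
  do l = 1, 10
    array(l) = l
  enddo
  !$omp end parallel do

and, before running:

  export OMP_STACKSIZE=512M
  export KMP_STACKSIZE=512M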

3 Answers


I didn't get the errors you're getting, but you have an issue with privatizing array in your OpenMP directive.

[mjswartz@666-lgn testfiles]$ vi array.f90  
[mjswartz@666-lgn testfiles]$ ifort -o array array.f90 -openmp  
[mjswartz@666-lgn testfiles]$ ./array 
           0           0           0           0           0           0
           0           0           0           0  
[mjswartz@666-lgn testfiles]$ vi array.f90  
[mjswartz@666-lgn testfiles]$ ifort -o array array.f90 -openmp  
[mjswartz@666-lgn testfiles]$ ./array 
           1           2           3           4           5           6
           7           8           9          10

The first run is with private(array) in the directive; the second run, shown below, is without it.

  program main
  implicit none

  integer l
  integer, allocatable, dimension(:) :: array

  allocate(array(10))

  !$omp parallel do 
  do l = 1, 10
    array(l) = l
  enddo

  print*, array

  deallocate(array)

  end program main
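For what it's worth, here is what private(array) should do on a compiler that implements OpenMP 3.0's allocatable support correctly (which your 11.1 ifort apparently does not), and why the first run above prints zeros:

  array = 0

  !$omp parallel do private(array)
  do l = 1, 10
    array(l) = l    ! each thread writes to its own private copy
  enddo
  !$omp end parallel do

  print *, array    ! the private copies are discarded at the end of
                    ! the region, so the original zeros are printed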

6 Comments

Which system are you using?
Linux Mint, Intel ifort version 11.1.073
I'm running on a supercomputer with all of its compilers properly set up. It might be worth it (short term and long term) for you to try to get access to a university (or government) supercomputer so you can get real multithreading instead of hyper-threading on a local PC.
Actually, I am at a university. We have two options here: we use all our office PCs and manage jobs with a queue system, and we have a cluster with 96 cores. Anyway, I need to get the program to work before I can submit jobs! The compiler version is the same on all of them.
Why not update to OpenMP 4.0? Or, since you're running Linux, use gfortran with -fopenmp, which worked in all my local testing.

I just ran your code with ifort and OpenMP and it spewed zeros; I had to quit the execution manually. What output do you expect? I'm not a big fan of unnecessarily allocating arrays dynamically: you know what size your matrices will be, so make the sizes parameters and declare the arrays statically. I'll mess with some things and edit this response in a few.

OK, so here are my edits:

  program main
  implicit none

  integer :: l, j
  integer, parameter :: lmax = 15e3
  integer, parameter :: jmax = 25
  integer, parameter :: nk = 300
  complex*16, dimension(9*nk) :: x0, xin, xout
  complex*16, dimension(lmax) :: e_pump, e_probe
  complex*16 :: e_pumphlp, e_probehlp
  character*25 :: problemtype
  real*8 :: m

  ! OpenMP variables
  integer :: myid, nthreads, omp_get_num_threads, omp_get_thread_num

  x0 = 0.0d0

  problemtype = 'type1'
  if (problemtype .ne. 'type1') then
     write(*,*) 'Problem type not specified. Quitting'
     stop
  else
     ! Spawn a parallel region; myid must be private so that each
     ! thread sees its own thread number (a shared myid is a data race)
     !$omp parallel private(myid)
        myid = omp_get_thread_num()
        if (myid .eq. 0) then
           nthreads = omp_get_num_threads()
           write(*,*) 'Starting program with', nthreads, 'threads'
        endif

        !$omp do private(j,l,m,e_pumphlp,e_probehlp,e_pump,e_probe)
        do j = 1, jmax - 1
           do l = 1, lmax

              call electricfield(0.0d0, 0.0d0, e_pumphlp, &
                                 e_probehlp, 0.0d0)
              !   print *, e_pumphlp, e_probehlp

              e_pump(l) = e_pumphlp
              e_probe(l) = e_probehlp
              print *, e_pump(l), e_probe(l)

           end do
        end do
     !$omp end parallel
  end if

  end program main

Notice I removed your use of a module, since it was unnecessary: you have a module containing a single subroutine, so just make it an external subroutine. Also, I changed your matrices to be statically allocated. Case statements are a fancy and expensive version of if statements, and you were evaluating the case 15e3*25 times rather than once, so I moved that check outside the loops. I restructured the OpenMP directives, but the behaviour is equivalent. I also added some output so that you know what OpenMP is actually doing.

Here is the new subroutine:

  subroutine electricfield(t, tdelay, e_pump, e_probe, phase)
  implicit none

  real*8, intent(in) :: t, tdelay
  complex*16, intent(out) :: e_pump, e_probe
  real*8, optional, intent (in) :: phase

  e_pump = 0.0d0
  e_probe = 0.0d0

  return

  end subroutine electricfield

I just removed the module shell around it and changed some of your variable names. Fortran is not case-sensitive, so don't torture yourself by writing in caps and having to keep it up throughout.

I compiled this with

ifort -o diffeq diffeq.f90 electricfield.f90 -openmp

and ran with

./diffeq > output

to catch the program vomiting 0's and to see how many threads I was using:

(0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000)
Starting program with 32 threads
(0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000) (0.000000000000000E+000,0.000000000000000E+000)

Hope this helps!

5 Comments

Hi. I just reformulated the question with better code. Thanks for your reply. Unfortunately, I cannot use static arrays.
What's prohibiting you from doing so?
Initially, I mentioned that I use this code for a Runge-Kutta method. I need to read in, e.g., the number of time points from a file. That means I don't know in advance how big the array is, so it needs to be allocatable.
Ahh, makes sense. Then ignore my static allocation. Let me test out your new question with ifort and see if I run into the same problem.
There are good reasons to regard external subroutines as two decades past obsolete, bar some relatively special cases which don't apply here. The implicit and unqualified recommendation to convert a module procedure to an external procedure is dangerous.

It would appear that you are running into a compiler bug associated with the implementation of OpenMP 3.0.

If you can't update your compiler, then you will need to change your approach. There are a few options. For example, you could make the allocatable arrays shared, increase their rank by one, and have one thread allocate them such that the extent of the additional dimension is the number of threads in the team. All subsequent references to those arrays then need to have the subscript for that additional dimension be the OpenMP thread number (+ 1, depending on what you've used for the lower bound).
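A minimal sketch of that first option, assuming each thread just fills its own column of a shared scratch array (the names scratch, tid and nthreads are placeholders):

  program scratch_demo
  implicit none
  integer :: l, tid, nthreads
  integer :: omp_get_num_threads, omp_get_thread_num
  integer, allocatable, dimension(:,:) :: scratch

  !$omp parallel private(tid, l)
     !$omp single
        nthreads = omp_get_num_threads()
        allocate(scratch(10, nthreads))   ! shared; one column per team member
     !$omp end single                     ! implicit barrier: allocated before use

     tid = omp_get_thread_num() + 1       ! + 1 because the lower bound is 1
     do l = 1, 10
        scratch(l, tid) = l * tid         ! every reference carries the extra subscript
     end do
  !$omp end parallel

  print *, scratch(:, 1)                  ! shared, so still accessible here

  deallocate(scratch)
  end program scratch_demo

Putting the thread subscript last keeps each thread's column contiguous in Fortran's column-major storage, which also avoids false sharing between threads.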

Explicit allocation of the private allocatable arrays inside the parallel construct (only) may also be an option.
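A minimal sketch of that second option; the key point is that the allocation and the matching deallocation both happen inside the construct, so the private array is unallocated both on entry to and on exit from the region:

  integer :: l
  integer, allocatable, dimension(:) :: array   ! deliberately NOT allocated out here

  !$omp parallel private(array, l)
     allocate(array(10))          ! each thread allocates its own copy
     do l = 1, 10
        array(l) = l
     end do
     ! ... use the thread's copy of array here ...
     deallocate(array)            ! deallocate before the region ends
  !$omp end parallel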

3 Comments

Ok thanks. I'll try allocating inside the parallel region. I think this option is less work than the other.
I'm running into some problems when trying to allocate within the parallel region. I cannot use the array outside the parallel statement. Is it possible to allocate the array inside the parallel region and use it outside?
Both of those options require allocation inside the parallel region (in the shared-array case, that's when the number of team members is known). If the array is shared, you can access it after the parallel construct finishes. If the array is private, you cannot: the array outside the parallel construct is different from the array inside the construct (e.g., for the example source currently in your question, the print statement should then only print zeros).
