2

The scenario goes as such...

I have two functions Function A() and Function B(). The Function A() has a for-loop which calls Function B() in every iteration. Function B() has a for-loop which iterates around 10 million times.

Function A()
{
    for (i = 0; i < 10; i++)
        Function B();
}

Function B()
{
    for (i = 0; i < 10000000; i++)
        certain_operations()
}

Now, the problem I am facing is that, A()'s for-loop's first iteration takes 1 minute to execute, second iteration takes 2 mins, third iteration takes 4 mins and so on... Even though the number of iterations is the same, the subsequent loop execution in A() increases. I have generated the following logs.

Cluster :  1
 INFO 2014-09-17 16:36:20,496 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 16:37:11,762 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 16:37:11,762 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 1 - Critical Operations : 160000

Cluster :  2
 INFO 2014-09-17 16:37:11,767 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 16:38:02,365 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 16:38:02,365 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 2 - Critical Operations : 40000

Cluster :  3
 INFO 2014-09-17 16:38:02,366 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 16:41:35,675 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 16:41:35,675 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 3 - Critical Operations : 80000

Cluster :  4
 INFO 2014-09-17 16:41:35,676 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 16:50:17,875 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 16:50:17,875 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 4 - Critical Operations : 120000

Cluster :  5
 INFO 2014-09-17 16:50:17,886 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 17:02:42,819 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 17:02:42,819 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 5 - Critical Operations : 160000

Cluster :  6
 INFO 2014-09-17 17:02:42,820 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 17:07:01,916 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 17:07:01,916 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 6 - Critical Operations : 40000

Cluster :  7
 INFO 2014-09-17 17:07:01,917 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 17:16:21,122 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 17:16:21,122 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 7 - Critical Operations : 100000

Cluster :  8
 INFO 2014-09-17 17:16:21,123 [pool-8-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-17 17:22:40,371 [pool-8-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-17 17:22:40,371 [pool-8-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 8 - Critical Operations : 60000

HERE, THE TIME BETWEEN Print 1 AND Print 2 IS THE EXECUTION TIME OF Function A()'s ONE ITERATION.

Any idea why this problem is occurring ? Suggestions regarding how to solve this issue will be greatly appreciated. Thanks a lot.

This is my code in case there are anomalies you can get...

Function A()
{
    for(Data d : packList)              
            List<Data> region  = getRegionData(d, dataList, axisEN, eN, cluster++);         

            //dataList -> 10 million records
            //d -> single instance of dataList
}

public List<Data> getRegionData(Data data, List<Data> listData, int axisEN, int eN, int cluster)    //Function B()
    {   
        List<Data> region = new ArrayList<Data>();
        Data temp = new Data();
        int distance;

        Logger.sysLog(LogValues.info, this.getClass().getName(), "Print 1");
        for(Data d : listData)
        {
            temp.setOffnet(java.lang.Math.abs(data.getOffnet() - d.getOffnet()));
            temp.setOnnet(java.lang.Math.abs(data.getOnnet() - d.getOnnet()));
            temp.setInet(java.lang.Math.abs(data.getInet() - d.getInet()));

            distance = temp.getOffnet() + temp.getOnnet() + temp.getInet();

            if(distance <= eN)
                if(temp.getOffnet() <= axisEN && temp.getOnnet() <= axisEN && temp.getInet() <= axisEN)
                {                   
                    d.setId(listData.indexOf(d));
                    d.setCluster(cluster);
                    d.setDistance(distance);
                    d.setCount(d.getCount() + 1);
                    region.add(d);  
                }
        }



        Logger.sysLog(LogValues.info, this.getClass().getName(), "Print 2");
        Logger.sysLog(LogValues.info, this.getClass().getName(), "Cluster " + cluster + " : " + region.size());
        return region;
    }
}

. .


EDIT 1 : Removing the listData.indexOf(d)


 Cluster :  1
 INFO 2014-09-18 14:58:28,306 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 14:59:04,233 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 14:59:04,233 [pool-1-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 1  - Critical Operations : 120000
 Cluster :  2
 INFO 2014-09-18 14:59:04,236 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:01:32,566 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:01:32,566 [pool-2-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 2  - Critical Operations : 120000
 Cluster :  3
 INFO 2014-09-18 15:01:32,567 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:09:15,873 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:09:15,873 [pool-3-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 3  - Critical Operations : 200000
 Cluster :  4
 INFO 2014-09-18 15:09:15,874 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:13:30,019 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:13:30,019 [pool-4-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 4  - Critical Operations : 80000
 Cluster :  5
 INFO 2014-09-18 15:13:30,019 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:24:10,602 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:24:10,602 [pool-5-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 5  - Critical Operations : 160000
 Cluster :  6
 INFO 2014-09-18 15:24:10,603 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:29:47,132 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:29:47,132 [pool-6-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 6  - Critical Operations : 60000
 Cluster :  7
 INFO 2014-09-18 15:29:47,133 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 1
 INFO 2014-09-18 15:38:49,182 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Print 2
 INFO 2014-09-18 15:38:49,182 [pool-7-thread-1][] com.bng.dbscan.util.GenerateCluster - Cluster 6  - Critical Operations : 60000

PrintGCDetails for three iterations

After Iter1
Heap
 PSYoungGen      total 18432K, used 952K [0x9ee80000, 0xa0300000, 0xb3540000)
  eden space 15872K, 6% used [0x9ee80000,0x9ef6e248,0x9fe00000)
  from space 2560K, 0% used [0xa0080000,0xa0080000,0xa0300000)
  to   space 2560K, 0% used [0x9fe00000,0x9fe00000,0xa0080000)
 ParOldGen       total 41728K, used 0K [0x76140000, 0x78a00000, 0x9ee80000)
  object space 41728K, 0% used [0x76140000,0x76140000,0x78a00000)
 PSPermGen       total 16384K, used 1771K [0x72140000, 0x73140000, 0x76140000)
  object space 16384K, 10% used [0x72140000,0x722faf00,0x73140000)


After Iter2
Heap
 PSYoungGen      total 18432K, used 952K [0x9ef00000, 0xa0380000, 0xb35c0000)
  eden space 15872K, 6% used [0x9ef00000,0x9efee260,0x9fe80000)
  from space 2560K, 0% used [0xa0100000,0xa0100000,0xa0380000)
  to   space 2560K, 0% used [0x9fe80000,0x9fe80000,0xa0100000)
 ParOldGen       total 41728K, used 0K [0x761c0000, 0x78a80000, 0x9ef00000)
  object space 41728K, 0% used [0x761c0000,0x761c0000,0x78a80000)
 PSPermGen       total 16384K, used 1771K [0x721c0000, 0x731c0000, 0x761c0000)
  object space 16384K, 10% used [0x721c0000,0x7237af00,0x731c0000)


After Iter3
Heap
 PSYoungGen      total 18432K, used 952K [0x9ee80000, 0xa0300000, 0xb3540000)
  eden space 15872K, 6% used [0x9ee80000,0x9ef6e248,0x9fe00000)
  from space 2560K, 0% used [0xa0080000,0xa0080000,0xa0300000)
  to   space 2560K, 0% used [0x9fe00000,0x9fe00000,0xa0080000)
 ParOldGen       total 41728K, used 0K [0x76140000, 0x78a00000, 0x9ee80000)
  object space 41728K, 0% used [0x76140000,0x76140000,0x78a00000)
 PSPermGen       total 16384K, used 1771K [0x72140000, 0x73140000, 0x76140000)
  object space 16384K, 10% used [0x72140000,0x722faf00,0x73140000)
6
  • Make sure to post valid code; everywhere. Commented Sep 18, 2014 at 5:43
  • I have done that... Please look down.. :) Commented Sep 18, 2014 at 5:44
  • May be due to garbage collection? Or the data size varies. Please check size of listData and packList at every iteration. Commented Sep 18, 2014 at 5:46
  • listData is 10 million in every iteration. The critical operations (within the if clause in Function B()) has been mentioned in the logs. Commented Sep 18, 2014 at 5:51
  • Random change, try adding final identifier to functions and Data class declaration. See if that makes any difference. Commented Sep 18, 2014 at 5:57

2 Answers 2

2

The clue is listData.indexOf(d) that makes the algorithm complexity quadratic.
Replace indexOf with a simple loop counter and enjoy the performance difference.

Sign up to request clarification or add additional context in comments.

4 Comments

I tried your suggestion. I commented the indexOf() statement and tried running. Its still having the issue... Have shared the logs in EDIT 1 ...
@StackAddict Try adding -XX:+PrintGCDetails JVM option to see what's happening in the heap.
I performed PrintGCDetails while the code was running at three different times. Have included the results under EDIT 1
@StackAddict I meant adding JVM option to the actual application run. Looks like you've run a separate JVM 3 times.
0

just my 2 cents:

These operations could cause the performance issue...

    List<Data> region = new ArrayList<Data>();
    Data temp = new Data();
    ...
    region.add(..);

I would try these steps:

  • if possible (!), try to alloc region list one time
  • try to alloc region list like this: new ArrayList<Data>(initSize)
  • Data temp is not realy necessary, avoid new try to use primitivs like float, int etc..
  • try to mess execution time without Logger

1 Comment

Thanks for your comments... Actually, I have written a log for critical operations, which is the region size. That is, the number of times the control has entered the inner if() clause. If its zero then, the execution time for the iteration is less than a second. So, the statements in if() clause are affecting the performance. Moreover, I need the region data for processing. Hence, I cannot compromise with region.add() unless there is an alternative

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.