0

I have a two ArrayList given below as sample. AccVO: compareKey,Amount fields

List 1:

AccVO[001,500]                                                   
AccVO[002,600]                                                   
AccVO[003,800]                                                   

List2:

AccVO[001,100]                                                                   
AccVO[001,100]                                                                   
AccVO[001,300] 
AccVO[003,300]  
AccVO[003,300]  
AccVO[003,200]
AccVO[005,300]  
AccVO[005,300] 

I have sorted the two lists. I have to compare the two lists with the compare key and fetch the records of List2 to insert into database.

Sample Code:

for(AccVO accvo1 : List1){
    for(AccVO accvo2 : List2){
    if(accvo1.getCmpkey().equals(accvo2.getCmpkey())){     
            //insert the recs into the table            
           }        
     }
}

Since my list size will be larger i.e handling millions of records, i need some optimistic logic in looping the records.

Thanking you in advance Prasanna

4
  • 1
    Maps aren't a option for both lists? Commented Dec 8, 2009 at 8:14
  • Not everyone is able to understand what "lakhs" really is? Its better to consent with the standard language, unless there is no other better substitute. But then a brief definition is necessary. Commented Dec 8, 2009 at 8:20
  • yeah, but i was asking about the code formatting - your edit removed it. Commented Dec 8, 2009 at 8:24
  • In English, "lakhs" sounds like "lax" which means "extremely negligent". Be careful :) Commented Dec 8, 2009 at 19:13

5 Answers 5

5

Because your lists are sorted, you can use an index into both arrays and increment only the smaller key each time:

int i = 0,j = 0;

while (i < List1.size() && j < List2.size()){
  int x = List1.get(i).getCmpKey().CompareTo(List2.get(j).getCmpKey();
  if (x == 0){ 
    //add record
    j++;
  }else if(x < 0){
    i++;
  }else{
    j++;
  }
}
Sign up to request clarification or add additional context in comments.

6 Comments

+1 - given that the inputs are already sorted, this is O(N), compared with O(NlogN) for binary search and O(N) with O(N) extra space for HashSets. The only case where the other approaches might be better is when one of the lists is tiny compared with the other one.
I think you forgot an extra check in the loop for j < list2.size(). I'd prefer a simple solution that uses some more memory over one that's "complicated" enough for such bugs to slip in...
@Stephen - Wouldn't it be O(N+M), O(NlogM), O(N) and O(N+M) respectively?
@Wouter: Everything should be made as simple as possible, but not simpler
you should add j++; in the add record part
|
3

If your equals (and hashCode) implementation are based on getCmpkey() you can use Sets.

set1 = new HashSet(list1)
set2 = new HashSet(list2)
set2.retainAll(set1);
for(AccVO a : set2) //insert

This will have O(1) for individual removes (O(n) for n elements in set1).

4 Comments

This will use a bit of extra memory for the hashsets. But unless this is really performance critical code, I prefer this solution over the one from steadfastbuck because it is a lot easier to read (and less tricky, and shorter).
Actually, the bit of extra memory will be a lot more. A HashSet is a fancy HashMap wrapper. So n entries will have n Entry objects, arrays for buckets etc. That's a lot of work for the garbage collector.
Dos HashSet work for the second list? We have multiple records for the same compareKey value. But I don't understand this AccVO object (array?) anyway.
@Andreas_D I suppose this is a glitch in the question. The database will not be happy if you tried to insert a non-unique compound key anyway.
0

I would suggest using a HashMap for the second list.

Comments

0

If the type of the Lists is not specified you should use Iterators to traverse them. This will give you guaranteed O(n) (linear) performance even if you use a List implementation that's not backed by an Array (assuming it you can iterate in O(n) time).

For example, if you happened to be given a LinkedList as your list class, the following implementation is still O(n) but an implementation that used get() to index into the list would be O(n^2). So if you list sizes were each 1 million, this implementation would be 1 million times faster than an indexing implementation.

Iterator i1 = list1.iterator();
Iterator i2 = list2.iterator();
if (i1.hasNext() && i2.hasNext() {
   AccVO v1 = i1.next();
   AccVO v2 = i2.next();
   while (true) {
      int i = v1.getCmpKey().compareTo(v2.getCmpKey());
      if (i == 0) {

         // insert record into DB here

         if (i2.hasNext()) {
            v2 = i2.next()
         } else {
            break;
         } 
      } else if (i < 0) {
         if (i1.hasNext()) {
            v1 = i1.next();
         } else {
            break;   
         }
      } else {
         if (i2.hasNext()) {
            v2 = i2.next();
         } else {
            break;
         }
      }
   }
}

Comments

0

I think I'd do something along the lines of

Set<Integer> goodKeys = new HashSet<Integer>(); 
for (AccVO a : List1) 
    goodKeys.add(a.getCmpkey());
for (AccVO a : List2)
    if (goodKeys.contains(a.getCmpkey()))
        // insert the recs into the table

I could then hang on to the list of good keys, if desired, extract the first chunk into getKeys(List<AccVO> list), extract the remainder into insertRecordsForKeys(Set<Integer> keys), etc. Easier to test, easier to debug, easier to reuse, easier to work with.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.