1

I was wondering if I can tune the following Scala code :

def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
           var listNoDuplicates: List[(Class1, Class2)] = Nil
           for (outerIndex <- 0 until listOfTuple.size) {
             if (outerIndex != listOfTuple.size - 1)
               for (innerIndex <- outerIndex + 1 until listOfTuple.size) {
                 if (listOfTuple(i)._1.flag.equals(listOfTuple(j)._1.flag))
                   listNoDuplicates = listOfTuple(i) :: listNoDuplicates
               }
           }
           listNoDuplicates
         }
8
  • What does tune means ? Commented Aug 4, 2011 at 9:27
  • I mean to get the shape of functional style . I need to remove the var and use val instead Commented Aug 4, 2011 at 9:28
  • OK. Could you also explain what is precisely the expected behavior of the method ? Commented Aug 4, 2011 at 9:31
  • I have a list of tuples . I need to remove the duplicates but based on the condition of a flag that existed on both class , i.e. class1 ,Class2 Commented Aug 4, 2011 at 9:33
  • In case of several tuples with same flag, which one should you keep ? Commented Aug 4, 2011 at 9:35

3 Answers 3

5

Usually if you have someting looking like:

var accumulator: A = new A
for( b <- collection ) {
    accumulator = update(accumulator, b)
}
val result = accumulator

can be converted in something like:

val result = collection.foldLeft( new A ){ (acc,b) => update( acc, b ) }

So here we can first use a map to force the unicity of flags. Supposing the flag has a type F:

val result = listOfTuples.foldLeft( Map[F,(ClassA,ClassB)] ){
            ( map, tuple ) => map + ( tuple._1.flag -> tuple )
          }

Then the remaining tuples can be extracted from the map and converted to a list:

val uniqList = map.values.toList

It will keep the last tuple encoutered, if you want to keep the first one, replace foldLeft by foldRight, and invert the argument of the lambda.

Example:

case class ClassA( flag: Int )
case class ClassB( value: Int )

val listOfTuples = 
  List( (ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)), (ClassA(1),ClassB(-1)) )

val result = listOfTuples.foldRight( Map[Int,(ClassA,ClassB)]() ) {
  ( tuple, map ) => map + ( tuple._1.flag -> tuple )
}

val uniqList = result.values.toList

//uniqList: List((ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)))

Edit: If you need to retain the order of the initial list, use instead:

val uniqList = listOfTuples.filter( result.values.toSet )
Sign up to request clarification or add additional context in comments.

7 Comments

This is amazing approach . I was trying it but as u can see my method takes List[(Class1,Class2)] . By applying ur approach I couldn't get the expected result !
@Echo: I've added a compilable example. It works like a charm and the complexity is O(n) thanks to the map.
Warning: Map is not supposed to keep insertion order. Consequently, you do not ensure that the result will be in the same order than the initial list!
@Nicolas: You are right. I edited the answer to keep the insertion order. Still O(n) I think.
With your edition, it does not remove strict duplicates anymore if they are strictly equals ex: List((1, 2), (2, 3), (1, 2)). Btw, the complexity depends of the complexity of map insert (O(n)) so... your method is O(n^2)
|
2

This compiles, but as I can't test it it's hard to say if it does "The Right Thing" (tm):

def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = 
  (for {outerIndex <- 0 until listOfTuple.size
     if outerIndex != listOfTuple.size - 1
     innerIndex <- outerIndex + 1 until listOfTuple.size
     if listOfTuple(i)._1.flag == listOfTuple(j)._1.flag
  } yield listOfTuple(i)).reverse.toList

Note that you can use == instead of equals (use eq if you need reference equality).

BTW: https://codereview.stackexchange.com/ is better suited for this type of question.

Comments

1

Do not use index with lists (like listOfTuple(i)). Index on lists have very lousy performance. So, some ways...

The easiest:

def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
  SortedSet(listOfTuple: _*)(Ordering by (_._1.flag)).toList

This will preserve the last element of the list. If you want it to preserve the first element, pass listOfTuple.reverse instead. Because of the sorting, performance is, at best, O(nlogn). So, here's a faster way, using a mutable HashSet:

def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
  // Produce a hash map to find the duplicates
  import scala.collection.mutable.HashSet
  val seen = HashSet[Flag]()

  // now fold
  listOfTuple.foldLeft(Nil: List[(Class1,Class2)]) {
    case (acc, el) =>
      val result = if (seen(el._1.flag)) acc else el :: acc
      seen += el._1.flag
      result
  }.reverse
}

One can avoid using a mutable HashSet in two ways:

  1. Make seen a var, so that it can be updated.
  2. Pass the set along with the list being created in the fold. The case then becomes:

    case ((seen, acc), el) =>
    

1 Comment

Thx Daniel for ur assistance .

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.