flatmapping a nested Map in scala

Question

Suppose I have val someMap = Map[String -> Map[String -> String]] defined as such:

val someMap = 
Map( 
    ("a1" -> Map( ("b1" -> "c1"), ("b2" -> "c2") ) ),
    ("a2" -> Map( ("b3" -> "c3"), ("b4" -> "c4") ) ),
    ("a3" -> Map( ("b5" -> "c5"), ("b6" -> "c6") ) )   
)

and I would like to flatten it to something that looks like

List(   
    ("a1","b1","c1"),("a1","b2","c2"),
    ("a2","b3","c3"),("a2","b4","c4"),
    ("a3","b5","c5"),("a3","b6","c6")
)

What is the most efficient way of doing this? I was thinking about creating some helper function that processes each (a_i -> Map(String,String)) key value pair and return

def helper(key: String, values: Map[String -> String]): (String,String,String) 
= {val sublist = values.map(x => (key,x._1,x._2))
return sublist
}

then flatmap this function over someMap. But this seems somewhat unnecessary to my novice scala eyes, so I was wondering if there was a more efficient way to parse this Map.

You mean something like someMap.flatMap { case (a, m) => m.map { case (b, c) => (a, b, c) } }? — Mateusz Kubuszok
– Mateusz Kubuszok, Commented Oct 22, 2021 at 20:42

Guru Stron · Accepted Answer · 2021-10-22 20:56:02Z

4

No need to create helper function just write nested lambda:

val result = someMap.flatMap { case (k, v) => v.map { case (k1, v1) => (k, k1, v1) } }

Or

val y = someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2)))

edited Oct 22, 2021 at 20:56

answered Oct 22, 2021 at 20:46

Guru Stron

150k11 gold badges186 silver badges232 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Luis Miguel Mejía Suárez Over a year ago

Maybe flatMap instead of flatten?

Guru Stron Over a year ago

@LuisMiguelMejíaSuárez why?

Mateusz Kubuszok Over a year ago

Because it works for wrong reasons: def flatten[B](implicit asIterable: A => scala.collection.IterableOnce[B]): CC[B] - you passed lambda as explicitly passed implicit which should be used to turn "nested map" into "flattened map" and accidentally it compiled and returned IterableOnce (instead of Map or List). In general case you cannot trust this, and this is a hack that abuses the interface.

Aivean · Accepted Answer · 2021-10-22 21:34:23Z

1

Since you're asking about efficiency, the most efficient yet functional approach I can think of is using foldLeft and foldRight.

You need foldRight since :: constructs the immutable list in reverse.

someMap.foldRight(List.empty[(String, String, String)]) { case ((a, m), acc)  =>
  m.foldRight(acc) {
    case ((b, c), acc) => (a, b, c) :: acc
  }
}

Here, assuming Map.iterator.reverse is implemented efficiently, no intermediate collections are created.

Alternatively, you can use foldLeft and then reverse the result:

someMap.foldLeft(List.empty[(String, String, String)]) { case (acc, (a, m))  =>
  m.foldLeft(acc) {
    case (acc, (b, c)) => (a, b, c) :: acc
  }
}.reverse

This way a single intermediate List is created, but you don't rely on the implementation of the reversed iterator (foldLeft uses forward iterator).

Note: one liners, such as someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2))) are less efficient, as, in addition to the temporary buffer to hold intermediate results of flatMap, they create and discard additional intermediate collections for each inner map.

UPD

Since there seems to be some confusion, I'll clarify what I mean. Here is an implementation of map, flatMap, foldLeft and foldRight from TraversibleLike:

  def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    def builder = { // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
      val b = bf(repr)
      b.sizeHint(this)
      b
    }
    val b = builder
    for (x <- this) b += f(x)
    b.result
  }

  def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
    def builder = bf(repr) // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
    val b = builder
    for (x <- this) b ++= f(x).seq
    b.result
  }

  def foldLeft[B](z: B)(op: (B, A) => B): B = {
    var result = z
    this foreach (x => result = op(result, x))
    result
  }

  def foldRight[B](z: B)(op: (A, B) => B): B =
    reversed.foldLeft(z)((x, y) => op(y, x))

It's clear that map and flatMap create intermediate buffer using corresponding builder, while foldLeft and foldRight reuse the same user-supplied accumulator object, and only use iterators.

edited Oct 22, 2021 at 21:34

answered Oct 22, 2021 at 21:10

Aivean

10.9k28 silver badges40 bronze badges

3 Comments

Aivean Over a year ago

@LuisMiguelMejíaSuárez Thank you for pointing out about List.empty. Although, here, since it's called only once, it doesn't matter. However, I disagree with you, regarding flatMap vs foldRight. foldRight uses iterators and doesn't create any intermediate collections at all, while flatMap creates buffer for its results, and also every inner map call creates its own buffer.

Luis Miguel Mejía Suárez Over a year ago

Check this: someMap.iterator.flatMap(x => x._2.iterator.map(y => (x._1, y._1, y._2))).toList - Althought you were right about the intermediate collections I misunderstood your point.

Aivean Over a year ago

@LuisMiguelMejíaSuárez, gotcha. toList here still uses an intermediate mutable buffer and foldRight will be more efficient, if iterator.reversed of the particular Map implementation is as efficient as forward iterator, but I agree, your approach would be effectively the same, if not better, than foldLeft + reverse.

Collectives™ on Stack Overflow

flatmapping a nested Map in scala

2 Answers 2

3 Comments

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

3 Comments

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related