
I have two lists as follows:

InputColumns:

List(col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13)

InputData:

List(
  Map(col2 -> dummy string, col7 -> 2016-01-01, col11 -> 2011-01-01),
  Map(col2 -> dummy string, col7 -> 2018-01-01, col11 -> 2018-01-01),
  Map(col2 -> dummy string, col7 -> 2018-04-01, col11 -> 2018-04-01),
  Map(col2 -> dummy string, col7 -> 2016-01-01, col11 -> 2016-01-01)
)

What I am trying to do is generate one string per Map by iterating through both lists. For each name in InputColumns, if the Map contains that colX key, use the value from the Map; otherwise use the value NULL.

So in the example above I would loop through 4 times, creating 4 strings that look like this:

(Null, dummy string, Null, Null, Null, Null, 2016-01-01, Null) ..etc..

I thought of starting as follows: loop through my list of input columns, and then loop through each key of my input data. But I feel I'm a fair way off.

inputColumns.foreach(column => {
    inputData.foreach{ case (k,v) =>
        // I get a constructor cannot be instantiated to expected type error
    }
})
  • What's in the variables col1, ..., col13? Commented Apr 20, 2018 at 10:09
  • Nothing, they are just the names of my table columns. I'm trying to build up a Spark SQL query, so that my insert can dictate where each value should be placed in the string I want to build. Commented Apr 20, 2018 at 10:13

2 Answers

1

Using null is generally discouraged in Scala, so I suggest mapping each row to a List[Option[String]] instead. This lets you safely apply functional calls to the transformed data.

So, supposing you have these initial values:

private val columns =
  List("col1", "col2", "col3", "col4", "col5", "col6", "col7", "col8", "col9", "col10", "col11", "col12", "col13")

private val input = List(
  Map("col2" -> "dummy string", "col7" -> "2016-01-01", "col11" -> "2011-01-01"),
  Map("col2" -> "dummy string", "col7" -> "2018-01-01", "col11" -> "2018-01-01"),
  Map("col2" -> "dummy string", "col7" -> "2018-04-01", "col11" -> "2018-04-01"),
  Map("col2" -> "dummy string", "col7" -> "2016-01-01", "col11" -> "2016-01-01")
)

We can transform them into a List[List[Option[String]]], where each sub-list corresponds to one of the original Maps:

val rows = input.map(originalMap =>
  columns.map(column => originalMap.get(column))
)

Each row looks like

List(None, Some(dummy string), None, None, None, None, Some(2016-01-01), None, None, None, Some(2011-01-01), None, None)
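
As a rough illustration of what the Option representation allows, here is a minimal sketch that works on one shortened row. The object name OptionRowsSketch and the three-column row are made up for the example; only flatten, zip and collect from the standard library are used.

```scala
object OptionRowsSketch {
  // Shortened to three columns for readability (assumption, not the full header).
  val columns: List[String] = List("col1", "col2", "col3")

  // One row in the List[Option[String]] shape produced by the mapping above.
  val row: List[Option[String]] = List(None, Some("dummy string"), None)

  // flatten drops the Nones and unwraps the Somes.
  val presentValues: List[String] = row.flatten

  // Pair each column name with its cell and keep only the populated ones.
  val populated: List[(String, String)] =
    columns.zip(row).collect { case (name, Some(value)) => name -> value }

  // Names of the columns that have no value in this row.
  val missing: List[String] =
    columns.zip(row).collect { case (name, None) => name }

  def main(args: Array[String]): Unit = {
    println(presentValues)
    println(populated)
    println(missing)
  }
}
```

None of these calls can hit a NullPointerException, which is the main practical gain over a list of nulls.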

If you still want to use nulls:

val resultWithNulls = rows.map(row => row.map(_.orNull))

gives rows like:

List(null, "dummy string", null, null, null, null, "2016-01-01", null, null, null, "2011-01-01", null, null)

And if you want to transform the optional values into a CSV-like string, it remains simple:

val resultAsCsvString = rows.map(row => row.map(_.getOrElse("")).mkString(","))
// List(
//  ",dummy string,,,,,2016-01-01,,,,2011-01-01,,",
//  ",dummy string,,,,,2018-01-01,,,,2018-01-01,,",  ...
// )
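
One caveat with the plain mkString(",") approach is that a value containing a comma would shift the columns. A possible way around it, sketched here under the assumption that RFC 4180 style quoting is acceptable for your consumer, is to quote such fields. The names CsvQuotingSketch, csvField and toCsvLine are hypothetical helpers, not part of any library.

```scala
object CsvQuotingSketch {
  // Hypothetical helper: quote a field only when it contains a comma, a double quote,
  // or a line break, doubling any embedded double quotes (RFC 4180 convention).
  def csvField(value: String): String =
    if (value.exists(c => c == ',' || c == '"' || c == '\n' || c == '\r'))
      "\"" + value.replace("\"", "\"\"") + "\""
    else value

  // Missing cells become empty fields, as in the original one-liner.
  def toCsvLine(row: List[Option[String]]): String =
    row.map(_.fold("")(csvField)).mkString(",")

  def main(args: Array[String]): Unit = {
    println(toCsvLine(List(None, Some("dummy string"), Some("a,b"))))
  }
}
```

Values without special characters come out exactly as they do with the simpler version above.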

1

Just map the header over each map in the input data. If you want to plug in a default value for keys that are not in the map, use getOrElse. This code here:

val col1 = "col1"
val col2 = "col2"
val col3 = "col3"
val col4 = "col4"
val col5 = "col5"
val col6 = "col6"
val col7 = "col7"
val col8 = "col8"
val col9 = "col9"
val col10 = "col10"
val col11 = "col11"
val col12 = "col12"
val col13 = "col13"

val header = List(col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13)

val inputData = List(
  Map(col2 -> "dummy string", col7 -> "2016-01-01", col11 -> "2011-01-01"),
  Map(col2 -> "dummy string", col7 -> "2018-01-01", col11 -> "2018-01-01"),
  Map(col2 -> "dummy string", col7 -> "2018-04-01", col11 -> "2018-04-01"),
  Map(col2 -> "dummy string", col7 -> "2016-01-01", col11 -> "2016-01-01")
)

val rows = inputData.map { d =>
  header
    .map { h => d.getOrElse(h, "Null") }
    .mkString("(", ",", ")")
}

rows foreach println

generates the following output:

(Null,dummy string,Null,Null,Null,Null,2016-01-01,Null,Null,Null,2011-01-01,Null,Null)
(Null,dummy string,Null,Null,Null,Null,2018-01-01,Null,Null,Null,2018-01-01,Null,Null)
(Null,dummy string,Null,Null,Null,Null,2018-04-01,Null,Null,Null,2018-04-01,Null,Null)
(Null,dummy string,Null,Null,Null,Null,2016-01-01,Null,Null,Null,2016-01-01,Null,Null)
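
For comparison, the nested foreach from the question most likely fails because inputData is a List of Maps: each element handed to the inner foreach is a whole Map, and the tuple pattern case (k, v) cannot match a Map, hence the "constructor cannot be instantiated to expected type" error. A minimal sketch of the same loop-based idea that compiles, iterating rows on the outside and columns on the inside, could look like this. The object name LoopSketch and the shortened data are assumptions for the example.

```scala
import scala.collection.mutable.ListBuffer

object LoopSketch {
  // Shortened header and data, for illustration only.
  val inputColumns = List("col1", "col2", "col3")
  val inputData = List(
    Map("col2" -> "dummy string"),
    Map("col3" -> "2016-01-01")
  )

  // Outer loop: one Map per row. Inner loop: every expected column.
  // A (k, v) pattern would only be valid when iterating the entries of a single Map.
  def render(): List[String] = {
    val out = ListBuffer.empty[String]
    inputData.foreach { row =>
      val cells = ListBuffer.empty[String]
      inputColumns.foreach { column =>
        cells += row.getOrElse(column, "Null")
      }
      out += cells.mkString("(", ",", ")")
    }
    out.toList
  }

  def main(args: Array[String]): Unit = render().foreach(println)
}
```

The map-based version above does the same thing without the mutable buffers.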

I'm not sure what you want to do with those strings, though. It's generally advised to avoid stringly-typed data, that is, data serialized to strings, wherever you can.
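
To illustrate what a less stringly-typed variant might look like, here is a sketch that reads each Map into a small record and parses the date-like values with java.time.LocalDate. The Record type and the assumption that col7 and col11 hold ISO dates are mine, inferred only from the sample data.

```scala
import java.time.LocalDate

object TypedRowsSketch {
  // Hypothetical record type: the meaning of col2, col7 and col11 is assumed.
  final case class Record(
      col2: Option[String],
      col7: Option[LocalDate],
      col11: Option[LocalDate]
  )

  // LocalDate.parse expects ISO-8601 (yyyy-MM-dd) and throws
  // DateTimeParseException on malformed input.
  def fromMap(m: Map[String, String]): Record =
    Record(
      col2 = m.get("col2"),
      col7 = m.get("col7").map(s => LocalDate.parse(s)),
      col11 = m.get("col11").map(s => LocalDate.parse(s))
    )

  val record: Record =
    fromMap(Map("col2" -> "dummy string", "col7" -> "2016-01-01", "col11" -> "2011-01-01"))

  def main(args: Array[String]): Unit = println(record)
}
```

With this shape, rendering to a string becomes a final step at the boundary rather than the working representation.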
