I have a DataFrame with columns of different types, and I need to select specific columns from it. A hard-coded select statement looks like this:

val logRegrDF = myDF.select(myDF("LEBEL_COLUMN").as("label"),
col("FEATURE_COL1"), col("FEATURE_COL2"), col("FEATURE_COL3"), col("FEATURE_COL4"))

Here LEBEL_COLUMN and the FEATURE_COL columns are dynamic. I have an Array or Seq of the feature column names, like this:

val FEATURE_COL_ARR = Array("FEATURE_COL1","FEATURE_COL2","FEATURE_COL3","FEATURE_COL4")

I need to use this array of column names as the second part of the select statement. The first column in the select is always the single label column (LEBEL_COLUMN), and the rest is the dynamic list.

Can you please help me make this select statement work in Scala?

Note: the sample code below works, but it only selects the feature columns. I need to pass the column array as the second part of the select, after the label column:

val colNames = FEATURE_COL_ARR.map(name => col(name))
val logRegrDF = myDF.select(colNames: _*)  // works, but this is not what I need

I expected the code for the second part to look like this, but it does not work:

val logRegrDF = myDF.select(myDF("LEBEL_COLUMN").as("label"), colNames:_*)

2 Answers

2

If I understand your question correctly, this is what you are looking for:

// select(col: String, cols: String*): the label name first, then the feature names
result.select("LEBEL_COLUMN", FEATURE_COL_ARR: _*)
  .withColumnRenamed("LEBEL_COLUMN", "label")
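
For context, the string form works because Dataset has an overload select(col: String, cols: String*): the first name fills the fixed parameter and the array can be expanded into the varargs part. Here is a minimal sketch of that idea, assuming Spark SQL is on the classpath. The object and method names (SelectByName, labelThenFeatures) are hypothetical and only there for illustration:

```scala
import org.apache.spark.sql.DataFrame

object SelectByName {
  // Hypothetical helper, not part of Spark: selects the label column first,
  // then the dynamic list of feature columns, all referenced by name.
  def labelThenFeatures(df: DataFrame, labelCol: String, featureCols: Seq[String]): DataFrame =
    df.select(labelCol, featureCols: _*)      // select(col: String, cols: String*)
      .withColumnRenamed(labelCol, "label")   // rename afterwards, since plain names carry no alias
}
```

With the names from the question, the call would be SelectByName.labelThenFeatures(result, "LEBEL_COLUMN", FEATURE_COL_ARR).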

Hope this helps!

3 Comments

Hi @Shankar, thanks a lot. Your suggestion did not work as given, but it gave me the idea, and I solved the issue this way: val allColumnsArr = "LEBEL_COLUMN" +: FEATURE_COL_ARR; val colNames = allColumnsArr.map(name => col(name)); myDF.select(colNames: _*).withColumnRenamed("LEBEL_COLUMN", "label")
I updated the answer, please check. You don't need to build Column objects; select also accepts column names as strings.
Yes, this works: result.select("LEBEL_COLUMN", FEATURE_COL_ARR: _*).withColumnRenamed("LEBEL_COLUMN", "label")
1

Thanks a lot @Shankar.

Your suggestion did not work as given, but it gave me the idea, and I solved the issue this way:

val allColumnsArr = "LEBEL_COLUMN" +: FEATURE_COL_ARR
val colNames = allColumnsArr.map(name => col(name)) 
myDF.select(colNames:_*).withColumnRenamed("LEBEL_COLUMN", "label")

It also works this way, without creating Column objects:

result.select("LEBEL_COLUMN", FEATURE_COL_ARR: _*)
  .withColumnRenamed("LEBEL_COLUMN", "label")
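
If the alias should stay inside the select itself, as in the question, one option is to prepend the aliased label Column to the feature Columns and expand the combined sequence. As far as I can tell, the attempt in the question fails because select(cols: Column*) has a single varargs parameter, and Scala does not allow mixing an individual argument with a `: _*` expansion for that same parameter. A sketch under those assumptions (SelectWithAlias and labelThenFeatures are hypothetical names, not Spark API):

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.col

object SelectWithAlias {
  // Hypothetical helper, not part of Spark: builds one Seq[Column] with the
  // aliased label first, then expands the whole sequence into select(cols: Column*).
  def labelThenFeatures(df: DataFrame, labelCol: String, featureCols: Seq[String]): DataFrame = {
    val label: Column = df(labelCol).as("label")
    val features: Seq[Column] = featureCols.map(name => col(name))
    df.select(label +: features: _*)
  }
}
```

This keeps the rename and the selection in one step, so no withColumnRenamed call is needed afterwards.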

1 Comment

It also works this way: result.select("LEBEL_COLUMN", FEATURE_COL_ARR: _*).withColumnRenamed("LEBEL_COLUMN", "label")
