I am new to Spark SQL and DataFrames. I have a DataFrame to which I need to add a new column whose value depends on the values of other columns. The values come from a nested IF formula in Excel, which, translated into programmatic terms, looks something like this:
if (k == 'yes')
{
    if (!(i == ''))
    {
        if (diff(max_date, target_date) < 0)
        {
            if (j == '')
            {
                "pending" // the value of the column
            }
            else {
                "approved" // the value of the column
            }
        }
        else {
            "expired" // the value of the column
        }
    }
    else {
        "" // the value should be empty
    }
}
else {
    "" // the value should be empty
}
i, j, and k are three other columns in the DataFrame. I know that withColumn and when can be used to add a new column based on other columns, but I am not sure how to express the nested logic above with that approach.
What would be an easy and efficient way to implement this logic for the new column? Any help would be appreciated.
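To make the expected result unambiguous, here is the same logic flattened into a plain Python function. This is only a sketch of the rules, not Spark code: the names status and date_diff are hypothetical, and date_diff simply stands for whatever diff(max_date, target_date) returns. My guess is that each branch, in this order, would map to one when clause with a final otherwise, but that is an assumption on my part.

```python
def status(i, j, k, date_diff):
    """Plain-Python restatement of the nested IF above (not Spark code).

    date_diff is a placeholder for the result of diff(max_date, target_date);
    how that difference is actually computed is left open here.
    """
    # Both outer else-branches produce an empty value.
    if k != 'yes' or i == '':
        return ''
    # Else-branch of the diff(...) < 0 test.
    if date_diff >= 0:
        return 'expired'
    # From here on, date_diff is negative.
    if j == '':
        return 'pending'
    return 'approved'


if __name__ == "__main__":
    # A few sample rows: (i, j, k, date_diff)
    for row in [('a', '', 'yes', -3), ('a', 'b', 'yes', -3),
                ('a', 'b', 'yes', 2), ('', 'b', 'yes', -3), ('a', 'b', 'no', -3)]:
        print(row, repr(status(*row)))
```

The flattened order matters: the two empty-value cases are checked first, then "expired", so the last two branches only run when the date difference is negative.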
Thank you.