I have a struct with a large number of key-value pairs:

|-- struct_col: struct (nullable = false)
|    |-- key1: string (nullable = false)
|    |-- key2: string (nullable = false)
|    |-- key3: string (nullable = false)
|    |-- key4: string (nullable = false)
|    |-- key5: string (nullable = false)
|    |-- (... and so on ...)

I want to turn this into a long string of key-value pairs concatenated together like so:

key1=var1&key2=var2&key3=var3&key4=var4&...

So far I have tried this:

fn.concat_ws("&", *[f"struct_col.{col}" for col in df.select(fn.col("struct_col.*")).columns])

However, this only concatenates the values. I know to_json exists and could be used with a workflow like this one here, but I would like to use different separators for the key-value pairs and for the concatenated struct fields. I would also like to do this dynamically, since the struct fields may change.

What's the best way to do this?

1 Answer

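Add one more concat_ws inside the list comprehension, so that each field name (as a literal) is joined to its value with "=" before the pairs are joined with "&":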

F.concat_ws("&", *[F.concat_ws("=", F.lit(col), F.col(f"struct_col.{col}")) for col in df.select(F.col("struct_col.*")).columns])