6

I need to add a large number of columns (4000) to a data frame in PySpark. I am using the withColumn function, but I get an assertion error.

df3 = df2.withColumn("['ftr' + str(i) for i in range(0, 4000)]", [expr('ftr[' + str(x) + ']') for x in range(0, 4000)])

Error

Not sure what is wrong.

2 Answers

10

We can use .select() instead of .withColumn() and pass it a list of columns, which gives the same result as chaining multiple .withColumn() calls. The ["*"] entry also selects every existing column in the DataFrame.

import pyspark.sql.functions as F

df2:

+---+
|age|
+---+
| 10|
| 11|
| 13|
+---+

df3 = df2.select(["*"] + [F.lit(f"{x}").alias(f"ftr{x}") for x in range(0,10)])

Results in:

+---+----+----+----+----+----+----+----+----+----+----+
|age|ftr0|ftr1|ftr2|ftr3|ftr4|ftr5|ftr6|ftr7|ftr8|ftr9|
+---+----+----+----+----+----+----+----+----+----+----+
| 10|   0|   1|   2|   3|   4|   5|   6|   7|   8|   9|
| 11|   0|   1|   2|   3|   4|   5|   6|   7|   8|   9|
| 13|   0|   1|   2|   3|   4|   5|   6|   7|   8|   9|
+---+----+----+----+----+----+----+----+----+----+----+
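For the case in the question, where each new column is one element of an array column, the same single-projection idea can be sketched with selectExpr. This is a minimal sketch, assuming the array column is named ftr as the question's expr('ftr[...]') suggests; the helper names feature_exprs, expand_array and demo are hypothetical and only for illustration.

```python
def feature_exprs(array_col, n):
    """Build SQL expressions such as 'ftr[0] AS ftr0' (hypothetical helper)."""
    return [f"{array_col}[{i}] AS {array_col}{i}" for i in range(n)]


def expand_array(df, array_col, n):
    """Return df with n extra columns, one per array element, in one projection."""
    # "*" keeps every existing column; the rest are the extracted elements.
    return df.selectExpr("*", *feature_exprs(array_col, n))


def demo():
    """Small local run; needs PySpark installed. The sample data is made up."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df2 = spark.createDataFrame([(10, [0.1, 0.2, 0.3])], ["age", "ftr"])
    expand_array(df2, "ftr", 3).printSchema()
```

With the question's sizes this would be called as df3 = expand_array(df2, "ftr", 4000), which adds all 4000 columns in one select rather than 4000 separate withColumn calls.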

1 Comment

This is a much more efficient way to do it compared to calling withColumn in a loop!
2

Try to do something like this:

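from pyspark.sql.functions import lit

# Reassign df2 on every iteration so each new column is kept.
df2 = df3
for i in range(0, 4000):
    df2 = df2.withColumn(f"ftr{i}", lit(f"frt{i}"))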
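The loop only accumulates columns because each iteration reassigns the same variable. If every iteration instead starts from the original DataFrame (for example df3 = df2.withColumn(...)), only the last column survives. The sketch below makes the carry-forward explicit with functools.reduce; the helper name with_columns is hypothetical, and it assumes a dict that maps each new column name to a Column.

```python
from functools import reduce


def with_columns(df, columns):
    """Apply df.withColumn once per (name, column) pair, carrying the result forward.

    `columns` is a dict of new column name -> Column (hypothetical helper).
    """
    return reduce(
        lambda acc, item: acc.withColumn(item[0], item[1]),
        columns.items(),
        df,
    )


# Intended usage with PySpark (not executed here):
#   from pyspark.sql.functions import lit
#   df3 = with_columns(df2, {f"ftr{i}": lit(f"ftr{i}") for i in range(4000)})
```

On Spark 3.3 or later, DataFrame.withColumns accepts such a dict directly. For thousands of columns a single select is generally preferable, since each withColumn call adds another projection to the query plan.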

5 Comments

I don't think it will. It will just add one field, i.e. the last one: ftr3999: string (nullable = false)
@renjith Have you actually tried to run it? The solution will add all the columns. Note that inside the loop I am using df2 = df2.withColumn and not df3 = df2.withColumn
Yes, I ran it. The output of printSchema is this: root |-- hashval: string (nullable = true) |-- dec_spec_str: string (nullable = false) |-- dec_spec: array (nullable = true) | |-- element: double (containsNull = true) |-- ftr3999: string (nullable = false)
It works. Not sure why it did not work when I first tried it.
@renjith How did this loop work for you? It's not working for me either.
