I'm looking for a way to apply a user-defined function that takes a dictionary of arguments, rather than a tuple, when using pl.DataFrame.map_rows.

I tried something like:

df.map_rows(lambda x: udf({k:v for k, v in zip(df.columns, x)}))

but this raises RuntimeError: Already mutably borrowed.

The documentation says:

The frame-level map_rows cannot track column names (as the UDF is a black-box that may arbitrarily drop, rearrange, transform, or add new columns); if you want to apply a UDF such that column names are preserved, you should use the expression-level map_elements syntax instead.

But how does this prevent Polars from passing a dict instead of a tuple to the UDF, just as df.row(i, named=True) does? Why can't the row be passed with its column names?

I know I can iterate through df.rows(), do my user-defined processing, and then convert the result back to a pl.DataFrame, but I would prefer a way to do this without leaving the Polars API.

1 Answer

I don't know enough about the underlying Rust internals to explain the error, but capturing df.columns in a variable before calling map_rows seems to work:

cols = df.columns
df.map_rows(lambda x: udf({k:v for k, v in zip(cols, x)}))

You can also simplify the dictionary creation by using the dict() constructor:

cols = df.columns
df.map_rows(lambda x: udf(dict(zip(cols, x))))