I have a map, a dict that takes an int and maps it to a list of ints. I have a Polars DataFrame column in which I would like each int to be replaced by the corresponding vector from the map. Any int that is not in the map should be replaced by a list of zeros instead.
I have the following code, with some example data:
import polars as pl
data = {
    "user_id": [1, 2, 3],
    "book_ids": [[101, 102, 103], [104, 105], [106]]
}
# Create DataFrame
read_history_data = pl.DataFrame(data)
# Mapping dictionary
map = {
    101: [1, 2],
    102: [3, 4],
    103: [5, 6],
    104: [7, 8],
    105: [9, 10],
    106: [11, 12]
}
# Padding value and token length
padding_value = 0
token_length = 4
# Column name
column = "book_ids"
# Function to transform the DataFrame
def transform_read_history_data(read_history_data, map, padding_value, token_length, column):
    padded_list = [padding_value for i in range(token_length)]
    read_history_data = read_history_data.with_columns(
        pl.col(column)
        .list.eval(pl.element().replace(map, default=None))
        .list.eval(pl.element().fill_null(padded_list))
    )
    return read_history_data
# Run the function
transformed_data = transform_read_history_data(read_history_data, map, padding_value, token_length, column)
# Print the transformed DataFrame
print(transformed_data)
I get:
Traceback (most recent call last):
  File "<string>", line 37, in <module>
  File "<string>", line 29, in transform_read_history_data
  File "c:\...\.venv\Lib\site-packages\polars\dataframe\frame.py", line 9830, in with_columns
    return self.lazy().with_columns(*exprs, **named_exprs).collect(_eager=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:...\.venv\Lib\site-packages\polars\_utils\deprecation.py", line 93, in wrapper
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\...\.venv\Lib\site-packages\polars\lazyframe\frame.py", line 2224, in collect
    return wrap_df(ldf.collect(engine, callback))
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
polars.exceptions.ShapeError: argument 2 called 'new' for replace_strict have different lengths (6 != 3)
Use replace_strict if wanting to specify a default. replace with a default provided should be emitting a deprecation warning.
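To pin down the intended result independently of any Polars API, here is a minimal plain-Python sketch of the transformation described in the question. The helper name map_ids_to_vectors, the name mapping (used instead of map to avoid shadowing the built-in), and the unknown id 999 are assumptions added for illustration; they are not part of the original code.

```python
def map_ids_to_vectors(id_lists, mapping, padding_value=0, token_length=4):
    """Replace each id by its vector; ids missing from `mapping` get a padding vector."""
    padded = [padding_value] * token_length
    # Copy each vector so rows in the result never share list objects.
    return [
        [list(mapping.get(book_id, padded)) for book_id in ids]
        for ids in id_lists
    ]


# Mapping from the question (renamed to avoid shadowing the built-in `map`).
mapping = {
    101: [1, 2],
    102: [3, 4],
    103: [5, 6],
    104: [7, 8],
    105: [9, 10],
    106: [11, 12],
}

# Example data from the question, plus a hypothetical unknown id (999).
book_ids = [[101, 102, 103], [104, 105], [106, 999]]

result = map_ids_to_vectors(book_ids, mapping, padding_value=0, token_length=4)
for row in result:
    print(row)
```

Note that with token_length = 4 the padding vector is longer than the two-element vectors in the map, so the values from the question produce inner lists of unequal length. A function like this could probably be applied to the column with map_elements and an explicit return_dtype, though that is an untested assumption and is likely to be slower than a native expression.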