The documentation of fillna() states that the value parameter cannot be a list:
Value to use to fill holes (e.g. 0), alternately a
dict/Series/DataFrame of values specifying which value to use for each
index (for a Series) or column (for a DataFrame). Values not in the
dict/Series/DataFrame will not be filled. This value cannot be a list.
This is probably a limitation of the current implementation, and short of patching the source code you must resort to workarounds (as provided below).
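To see the restriction in practice, here is a minimal sketch. The example Series is made up for illustration and is not taken from the question. Passing a list as value makes fillna() raise instead of filling:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: an object Series mixing lists and NaN.
s = pd.Series([[1, 2, 3], np.nan, [1, 2, 3]])

try:
    s.fillna([0, 0, 0])
    outcome = "filled"
except TypeError as exc:
    # pandas rejects list values for the `value` parameter
    outcome = f"TypeError: {exc}"

print(outcome)
```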
However, if you are not planning to work with jagged arrays, you probably want to replace pd.Series() with pd.DataFrame(), e.g.:
import numpy as np
import pandas as pd
s = pd.DataFrame(
    [[1, 2, 3],
     [1, 2, 3],
     [np.nan],
     [1, 2, 3],
     [1, 2, 3],
     [np.nan]],
    dtype=pd.Int64Dtype())  # to mix integers with NaNs
s.fillna(0)
#    0  1  2
# 0  1  2  3
# 1  1  2  3
# 2  0  0  0
# 3  1  2  3
# 4  1  2  3
# 5  0  0  0
If you do need to use jagged arrays, you could use any of the workarounds proposed in the other answers, or you could make one of your own attempts work, e.g.:
ii = s.isna()
nn = ii.sum()
s[ii] = pd.Series([[0, 0, 0]] * nn).to_numpy()
# 0    [1, 2, 3]
# 1    [1, 2, 3]
# 2    [0, 0, 0]
# 3    [1, 2, 3]
# 4    [1, 2, 3]
# 5    [0, 0, 0]
# dtype: object
which basically uses NumPy masking to fill in the Series. The trick is to generate a compatible object for the assignment that works at the NumPy level.
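As a rough illustration of what "compatible object" means here (the names fill, as_2d, as_1d and manual are made up for this sketch): the right-hand side needs to be a 1-D object array with one list per masked row, which plain np.array() does not produce from nested lists.

```python
import numpy as np
import pandas as pd

fill = [0, 0, 0]
n = 2  # number of rows to fill

# np.array() interprets the nested lists as a 2-D numeric array
as_2d = np.array([fill] * n)
print(as_2d.shape)

# going through pd.Series keeps each list as a single element
as_1d = pd.Series([fill] * n).to_numpy()
print(as_1d.shape, as_1d.dtype)

# the same kind of array built with NumPy only:
# preallocate an object array, then assign element by element
manual = np.empty(n, dtype=object)
for i in range(n):
    manual[i] = list(fill)
print(manual.shape, manual.dtype)
```

Either as_1d or manual has one element per row to fill, which is the shape the masked assignment expects.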
If the input contains many NaNs, it is probably faster to work the other way around, starting from the fill values and copying over the valid entries with s.notna(), e.g.:
import pandas as pd
result = pd.Series([[0, 0, 0]] * len(s))
result[s.notna()] = s[s.notna()]
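One caveat with building the fill values as [[0, 0, 0]] * n is that every row refers to the same list object. This does not matter if the values are only read (or if a tuple is used, as in the benchmark), but it may be surprising if the lists are later mutated in place. A small sketch, with variable names chosen only for illustration:

```python
import pandas as pd

# list multiplication repeats references to one and the same list
shared = pd.Series([[0, 0, 0]] * 3)
shared.iloc[0].append(9)      # mutate "one" row in place
print(shared.tolist())        # every row shows the change

# a comprehension creates a separate list for each row
independent = pd.Series([[0, 0, 0] for _ in range(3)])
independent.iloc[0].append(9)
print(independent.tolist())   # only the first row changes
```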
Let's try to do some benchmarking, where:
replace_nan_isna() is from above
import pandas as pd
def replace_nan_isna(s, value, inplace=False):
    if not inplace:
        s = s.copy()
    ii = s.isna()
    nn = ii.sum()
    s[ii] = pd.Series([value] * nn).to_numpy()
    return s
replace_nan_notna() is also from above
import pandas as pd
def replace_nan_notna(s, value, inplace=False):
    if inplace:
        raise ValueError("In-place not supported!")
    result = pd.Series([value] * len(s))
    result[s.notna()] = s[s.notna()]
    return result
def replace_nan_reindex(s, value, inplace=False):
    if inplace:
        raise ValueError("In-place not supported!")
    # reindex() returns a new Series, so its result must be returned
    return s.dropna().reindex(s.index, fill_value=value)
import pandas as pd
def replace_nan_fillna(s, value, inplace=False):
    if not inplace:
        s = s.copy()
    # without inplace=True the filled result would be discarded
    s.fillna(pd.Series([value] * len(s), index=s.index), inplace=True)
    return s
with the following code:
import numpy as np
import pandas as pd
def gen_data(n=5, k=2, p=0.7, obj=(1, 2, 3)):
    return pd.Series(([obj] * int(p * n) + [np.nan] * (n - int(p * n))) * k)
funcs = replace_nan_isna, replace_nan_notna, replace_nan_reindex, replace_nan_fillna
value = (0, 0, 0)  # must be defined before it is used below

# inspect results
s = gen_data(5, 1)
for func in funcs:
    print(f'{func.__name__:>20s} {func(s, value)}')
print()

# generate benchmarks (%timeit requires IPython / Jupyter)
s = gen_data(100, 1000)
base = funcs[0](s, value)
for func in funcs:
    print(f'{func.__name__:>20s} {(func(s, value) == base).all()!s:>5}', end=' ')
    %timeit func(s, value)
# replace_nan_isna True 100 loops, best of 5: 16.5 ms per loop
# replace_nan_notna True 10 loops, best of 5: 46.5 ms per loop
# replace_nan_reindex True 100 loops, best of 5: 9.74 ms per loop
# replace_nan_fillna True 10 loops, best of 5: 36.4 ms per loop
indicating that reindex() may be the fastest approach, at least for this input.
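The %timeit lines above only run inside IPython or Jupyter. For a plain Python script, a rough equivalent can be sketched with the standard timeit module. The helper best_time() and the stand-in candidate count_nan() are hypothetical names introduced for this example; any of the replace_nan_*() functions could be passed instead.

```python
import timeit

import numpy as np
import pandas as pd


def best_time(func, *args, repeat=5, number=10):
    """Return the best per-call time in seconds over `repeat` runs."""
    times = timeit.repeat(lambda: func(*args), repeat=repeat, number=number)
    return min(times) / number


def count_nan(s, value):
    # stand-in candidate, only here to make the sketch runnable
    return int(s.isna().sum())


# same layout as gen_data(): 70% tuples, 30% NaN, repeated
s = pd.Series(([(1, 2, 3)] * 7 + [np.nan] * 3) * 100)
elapsed = best_time(count_nan, s, (0, 0, 0))
print(f'{count_nan.__name__:>20s} {elapsed * 1e3:.3f} ms per call')
```

Taking the minimum over several repeats mirrors the "best of 5" figure that %timeit reported above, although the absolute numbers will depend on the machine and the pandas version.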