I have a DataFrame `df` with two columns: Name and Number. I need to copy every row that has NaN in a cell to a new DataFrame.
import pandas as pd
path = 'Files/Directory.xlsx'
df = pd.read_excel(path)
I've tried so many different things, spent 3 days and still can't get it.
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alex", "Bob", "Jim", np.nan, np.nan],
        "Number": [1, 2, np.nan, 3, np.nan],
    }
)
df
| Name | Number |
|---|---|
| Alex | 1.0 |
| Bob | 2.0 |
| Jim | NaN |
| NaN | 3.0 |
| NaN | NaN |
It depends on whether you want the new DataFrame to hold rows with any NaN value or only rows where every value is NaN.
If any, the following should work:
df_nan = df.loc[df.isnull().any(axis=1)]
df_nan
| Name | Number |
|---|---|
| Jim | NaN |
| NaN | 3.0 |
| NaN | NaN |
If all, this should work:
df_nan = df.loc[df.isnull().all(axis=1)]
df_nan
| Name | Number |
|---|---|
| NaN | NaN |
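
For reference, here is a self-contained sketch of both filters on the sample frame above. The names `missing`, `rows_any_nan`, `rows_all_nan` and `rows_number_nan` are illustrative only, and the `.copy()` calls assume you want the new DataFrame to be independent of `df`. The last filter is an extra variant for the case where only one column matters.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alex", "Bob", "Jim", np.nan, np.nan],
        "Number": [1, 2, np.nan, 3, np.nan],
    }
)

# Boolean mask: True wherever a cell is missing
missing = df.isna()

# Rows with at least one missing cell
rows_any_nan = df.loc[missing.any(axis=1)].copy()

# Rows where every cell is missing
rows_all_nan = df.loc[missing.all(axis=1)].copy()

# Rows where one specific column is missing (here: Number)
rows_number_nan = df.loc[df["Number"].isna()].copy()

print(rows_any_nan)
print(rows_all_nan)
print(rows_number_nan)
```

`isna()` and `isnull()` are aliases in pandas, so either spelling gives the same mask.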
path = '/Files/Directory-All.xlsx'
df = pd.read_excel(path)

def remove_special_char_from_numbers(number):
    number_ = str(number["Number"])
    return re.sub(r'[^0-9]+', '', number_)

If I run:
df["Number"] = df.apply(lambda x: remove_special_char_from_numbers(x), axis=1)
before:
df_nan = df.loc[df.isnull().any(axis=1)]
it doesn't work. Then I get:

| | Name | Number |
|---|---|---|
| 0 | NaN | 7735633436 |
| 89 | NaN | |
| 94 | NaN | |
| 154 | NaN | |
| 194 | NaN | |
| 203 | NaN | |
| 209 | NaN | |
| 252 | NaN | |
| 261 | NaN | |
| 319 | NaN | |
| 365 | NaN | |
| 691 | NaN | |
| 714 | NaN | |
| 824 | NaN | |
| 869 | NaN | |
| 870 | NaN | |
| 871 | NaN | |
| 921 | NaN | |
| 954 | NaN | |
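
A likely cause, assuming the missing numbers are read in as NaN: `str(np.nan)` is the string `'nan'`, and the regex then strips every non-digit character, leaving the empty string `''`. An empty string is not null, so `isnull()` no longer flags anything in the `Number` column and only rows with a missing `Name` are returned. Below is a minimal sketch of one way around this, which cleans only the values that are present. The sample data and the helper name `clean_number` are made up for illustration and are not from the original file.

```python
import re

import numpy as np
import pandas as pd

# Made-up sample data standing in for the Excel file
df = pd.DataFrame(
    {
        "Name": ["Alex", np.nan, "Jim"],
        "Number": ["(555) 010-0199", "555 0100", np.nan],
    }
)


def clean_number(value):
    """Strip non-digit characters, but keep missing values as NaN."""
    if pd.isna(value):
        return np.nan
    digits = re.sub(r"[^0-9]+", "", str(value))
    # A value that contains no digits at all is treated as missing too
    return digits if digits else np.nan


# Apply to the single column instead of row by row
df["Number"] = df["Number"].apply(clean_number)

# NaN survives the cleaning step, so the filter sees both columns again
df_nan = df.loc[df.isnull().any(axis=1)]
print(df_nan)
```

Another option, if the cleaned numbers are not needed for the filter itself, is simply to build `df_nan` before running the cleaning step.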