Pandas read_csv() can not read the string "null"

Question

If I have this CSV:

"col1"
"hi"

it is read correctly using this code:

import pandas
df = pandas.read_csv("test.csv")
print(list(df["col1"]))

and prints:

['hi']

But if I change the string "hi" to "null" in the CSV, it fails!

It now prints:

[nan]

My actual CSV is quite large, and it happens to contain the string "null" as a field value somewhere, which apparently cannot be read correctly.

Any workarounds?

@Wartin I don't have time to answer properly right now - look at pandas.pydata.org/pandas-docs/stable/reference/api/… - and then search that page for na_values
– Jon Clements, Commented Sep 25, 2021 at 11:03

balderman · Accepted Answer · 2021-09-25 11:21:17Z

Update

Using keep_default_na=False in read_csv() is the right way to go: it stops pandas from treating strings such as "null" as missing values.
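
A minimal sketch of that approach, assuming the same one-column CSV as in the question. The CSV is held in an in-memory string (csv_text is a name chosen here for illustration) so the example runs without a file on disk:

```python
import io

import pandas as pd

# Same content as test.csv in the question, kept in memory for the demo.
csv_text = '"col1"\n"null"\n'

# Default behaviour: "null" is on pandas' built-in list of NA strings.
df_default = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False turns that built-in list off, so "null" stays a string.
df_literal = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(list(df_default["col1"]))  # [nan]
print(list(df_literal["col1"]))  # ['null']
```

With a real file, the only change needed is passing keep_default_na=False to the existing pandas.read_csv("test.csv") call.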

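Clumsy solution below

Using replace can do the job for you. Note that the code below replaces every NaN value across the whole DataFrame, including fields that were genuinely empty in the CSV.

You can restrict the replacement to specific columns by using

df[['col1']] = df[['col1']].fillna('null')

Full example replacing across the whole DataFrame: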

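import pandas as pd
import numpy as np

# With default settings, the string "null" is parsed as NaN.
df = pd.read_csv("test.csv")
print('before:')
print(list(df["col1"]))

# Turn every NaN in the DataFrame back into the string 'null'.
# (regex=True is not needed for a plain value replacement.)
df = df.replace(np.nan, 'null')
print('after:')
print(list(df["col1"]))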

Output:

before:
[nan]
after:
['null']
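
For comparison, here is a small sketch of the column-restricted variant mentioned above. The two-column CSV and the col2 column are made up for illustration; the point is that only col1 is touched, while a genuinely empty field in col2 stays NaN:

```python
import io

import pandas as pd

# Hypothetical data: col1 contains the string "null", col2 has an empty field.
csv_text = "col1,col2\nnull,1\nhi,\n"

df = pd.read_csv(io.StringIO(csv_text))

# Restore 'null' in col1 only; other columns keep their NaN values.
df[["col1"]] = df[["col1"]].fillna("null")

print(list(df["col1"]))            # ['null', 'hi']
print(df["col2"].isna().tolist())  # [False, True]
```

This still cannot tell an empty col1 field apart from a literal "null", which is why it remains the clumsier option.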

edited Sep 25, 2021 at 11:21

answered Sep 25, 2021 at 11:10


1 Comment

Wartin Over a year ago

Smart! keep_default_na=False tested and worked, thank you.
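
One side effect worth knowing: with keep_default_na=False alone, empty fields are also read as empty strings rather than NaN. A possible sketch, assuming you still want empty fields treated as missing, combines it with the na_values parameter pointed to in the comment on the question. The sample data and column names are made up for illustration:

```python
import io

import pandas as pd

# Hypothetical data: a literal "null" in col1 and an empty field in col2.
csv_text = "col1,col2\nnull,\nhi,x\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,  # drop the built-in NA strings ("null", "NA", ...)
    na_values=[""],         # but still treat empty fields as missing
)

print(list(df["col1"]))            # ['null', 'hi']
print(df["col2"].isna().tolist())  # [True, False]
```

Any other marker that should count as missing in a particular file can be added to the na_values list in the same way.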
