If I have this CSV:

"col1"
"hi"

it is read correctly by this code:

import pandas
df = pandas.read_csv("test.csv")
print(list(df["col1"]))

which prints:

['hi']

But if I change the string "hi" to "null" in the CSV, the value is no longer read as a string.

It now prints:

[nan]

My actual CSV is quite large, and it happens to contain the string "null" as a field value somewhere, so that value does not seem to be read correctly.

Are there any workarounds?

  • What do you expect to get instead of [nan]? Commented Sep 25, 2021 at 11:01
  • I would expect to get ["null"]. Commented Sep 25, 2021 at 11:02
  • @Wartin I don't have time to answer properly right now - look at pandas.pydata.org/pandas-docs/stable/reference/api/… - and then search that page for na_values Commented Sep 25, 2021 at 11:03

1 Answer

Update

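Passing keep_default_na=False to read_csv (see here) is the right way to go. By default, pandas treats a set of strings such as "null" as missing values when parsing; this option turns that behavior off.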
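As a minimal sketch of that approach (the inline CSV text, the second column, and the variable names are assumptions made for illustration, not part of the original question), the following compares the default parsing with keep_default_na=False. It also shows one possible way to keep empty fields as NaN by listing only the empty string in na_values.

```python
import io

import pandas as pd

# Hypothetical sample data: col1 holds the literal string "null",
# col2 has one genuinely empty field.
csv_text = 'col1,col2\nhi,x\nnull,\n'

# Default behaviour: "null" is in pandas' default NA list, so it becomes NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False: no string is treated as missing,
# so "null" stays a string and the empty field becomes ''.
literal_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

# Optional variant: only the empty string counts as missing.
empty_only_df = pd.read_csv(
    io.StringIO(csv_text), keep_default_na=False, na_values=[""]
)

default_col1 = list(default_df["col1"])
literal_col1 = list(literal_df["col1"])
literal_col2 = list(literal_df["col2"])
empty_only_col1 = list(empty_only_df["col1"])
empty_only_col2_missing = list(empty_only_df["col2"].isna())

print(default_col1)
print(literal_col1)
print(empty_only_col1, empty_only_col2_missing)
```

The na_values=[""] variant is worth considering when the file also contains truly empty fields, since keep_default_na=False on its own reads those as empty strings rather than NaN.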

Clumsy Solution below

Using replace can do the job. Note that the full example below replaces every NaN value in the DataFrame, including fields that were genuinely empty in the CSV.

To replace NaN only in specific columns, use fillna on those columns instead:

df[['col1']] = df[['col1']].fillna('null')

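import pandas as pd
import numpy as np

# "null" is parsed as NaN by default
df = pd.read_csv("test.csv")
print('before:')
print(list(df["col1"]))

# Replace every NaN in the DataFrame with the string 'null'
# (no regex is needed, since NaN is not a string pattern)
df = df.replace(np.nan, 'null')
print('after:')
print(list(df["col1"]))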

Output:

before:
[nan]
after:
['null']
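One caveat of filling NaN after the fact can be sketched as follows. The inline CSV and the column names are made up for illustration. Once the file has been parsed with the defaults, a literal "null" and a genuinely empty field are both NaN, so fillna cannot tell them apart.

```python
import io

import pandas as pd

# Hypothetical data: row 2 has the literal string "null" in col1,
# row 3 has an empty col1 field.
csv_text = 'col1,col2\nhi,x\nnull,y\n,z\n'

df = pd.read_csv(io.StringIO(csv_text))

# After default parsing, both the "null" string and the empty field are NaN.
missing_before = list(df["col1"].isna())

# Filling NaN in one column turns both of them into the string 'null'.
df["col1"] = df["col1"].fillna("null")
filled_values = list(df["col1"])

print(missing_before)
print(filled_values)
```

If the distinction between an empty field and the text "null" matters, controlling the parsing at read time is likely the safer route.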

1 Comment

Smart! I tested keep_default_na=False and it worked, thank you.
