For this example, I am using version 1.0.1 of pandas.

I have a DataFrame with mixed types and some missing values:

import pandas as pd

df = pd.DataFrame(
    [
        [1, 2.0, '2020-01-01', 'A String']
    ], columns=['int', 'float', 'datetime', 'str']
)
# Add a second row that is missing in every column
df.loc[1] = [pd.NA, pd.NA, pd.NA, pd.NA]
df['datetime'] = pd.to_datetime(df['datetime'])
print(df)
    int  float   datetime       str
0     1    2.0 2020-01-01  A String
1  <NA>    NaN        NaT       NaN

Let's print the types of the DataFrame to make sure they are what I expect:

print(df.dtypes)
int                 object
float              float64
datetime    datetime64[ns]
str                 object
dtype: object

Now, I want to write this DataFrame to a CSV file:

df.to_csv('test.csv', index=False)

Looking at the output CSV, every missing value is written as an empty field. I guess that this is fine for string columns, but it's not ideal for int, float or datetime columns.

How can I get column-specific representations of the missing values?

EDIT: It is indeed possible to automatically fill missing values using the na_rep argument: df.to_csv('test.csv', na_rep='NA'). However, it does not allow column-specific representations.
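To make the limitation concrete, here is a minimal sketch that writes a small two-column frame to an in-memory buffer instead of a file. The frame and the variable names are made up for illustration; only to_csv and its na_rep argument come from the question.

```python
import io

import pandas as pd

# Small illustrative frame: one float column and one string column,
# each with a missing value in the second row.
df = pd.DataFrame({"float": [2.0, None], "str": ["A String", None]})

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
df.to_csv(buffer, index=False, na_rep="NA")
csv_text = buffer.getvalue()

# The same marker appears in both columns of the second data row.
print(csv_text)
```

Both missing cells come out as NA, whatever the column type, which is why na_rep alone does not answer the question.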

SOLUTION: I guess the best solution so far is to call fillna with a dict before writing to CSV:

df.fillna(
    {'int': '<NA>', 'float': 'NaN', 'datetime': 'NaT'}
).to_csv('test.csv', index=False)

3 Answers

1

No CSV standard specifies how missing values should be represented. There are a few conventions, but ultimately it depends on the program that will read the CSV afterwards.

Therefore, you should use the pandas fillna function to supply the representation you want for each data type before exporting.
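One way to sketch the "one representation per data type" idea is to pick the marker from each column's dtype. The helpers placeholder_for and with_placeholders below are hypothetical names, and the marker strings are only example choices. Note that this sketch uses astype(object) followed by where rather than fillna, so that the marker strings are stored as plain strings whatever the original dtype of the column is. It also assumes the integer column uses the nullable Int64 dtype, because dtype-based selection does not recognize an object column like the one in the question.

```python
import io

import pandas as pd
from pandas.api import types as ptypes


def placeholder_for(series):
    """Pick a missing-value marker from the column dtype (example choices)."""
    if ptypes.is_datetime64_any_dtype(series):
        return "NaT"
    if ptypes.is_float_dtype(series):
        return "NaN"
    if ptypes.is_integer_dtype(series):
        return "<NA>"
    return ""  # any other column keeps an empty field


def with_placeholders(df):
    """Return a copy whose missing cells hold per-column marker strings."""
    out = {}
    for name in df.columns:
        col = df[name]
        # Cast to object first so a string marker can sit next to real values.
        out[name] = col.astype(object).where(col.notna(), placeholder_for(col))
    return pd.DataFrame(out, index=df.index)


# Illustrative frame; the int column uses the nullable Int64 dtype (assumption).
df = pd.DataFrame({
    "int": pd.array([1, None], dtype="Int64"),
    "float": [2.0, None],
    "datetime": pd.to_datetime(["2020-01-01", None]),
    "str": ["A String", None],
})

buffer = io.StringIO()
with_placeholders(df).to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the datetime column is converted to object before writing, its non-missing values may be written in full timestamp form rather than as a bare date. If that matters, format the column explicitly before exporting.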

1

Try this:

df.to_csv('test.csv', index=False, na_rep='NA')

1 Comment

Thanks! The problem with this is that it fills the missing values for each column with the same value.
1

You can use fillna() on specific columns to set the value you want. Note that fillna() returns a new Series, so assign the result back. For example:

df['int'] = df['int'].fillna(0)
df['str'] = df['str'].fillna("NA")
