I have a pandas dataframe with boolean values, i.e.

    col1   col2
1   True   False
2   False  True
3   True   True

When I use pandas' DataFrame.to_csv method, the resulting CSV file looks like

,col1,col2
1,True,False
2,False,True
3,True,True

Is there a way to write the boolean values as 1s and 0s (more space-efficient), i.e.

,col1,col2
1,1,0
2,0,1
3,1,1

without having to cast the entire dataframe first?

2 Answers

It's quite simple actually: just multiply the df by 1. In arithmetic, True and False behave as 1 and 0.

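import io

import pandas as pd

# Sample data: whitespace-separated, first column becomes the index
data = """
    col1   col2
1   True   False
2   False  True
3   True   True
    """

# Raw string, so that \s is not read as an invalid escape sequence
df = pd.read_csv(io.StringIO(data), delimiter=r'\s+')

# Multiplying by 1 turns True/False into 1/0
print(df*1)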

This will change it to:

   col1  col2
1     1     0
2     0     1
3     1     1

From there you can either reassign the result to the same name with df = df*1, or keep it under a new name with df2 = df*1. The first avoids holding on to two copies of the data.
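If you only need the 1s and 0s in the file, you probably don't have to reassign at all. Here is a minimal sketch that multiplies on the fly and writes the result to an in-memory buffer; the DataFrame is rebuilt by hand from the question's data, and the `buf` and `csv_text` names are only for illustration.

```python
import io

import pandas as pd

# Same data as in the question, built directly instead of parsed from text
df = pd.DataFrame(
    {"col1": [True, False, True], "col2": [False, True, True]},
    index=[1, 2, 3],
)

# df * 1 creates a temporary integer frame; df itself keeps its bool dtype
buf = io.StringIO()
(df * 1).to_csv(buf)
csv_text = buf.getvalue()

print(csv_text)
```

Passing a file path to to_csv instead of the buffer works the same way.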

1 Comment

This is nice because it also leaves strings as they were.

You can just convert the dtype of the df to int. This will convert True to 1 and False to 0:

In [16]:
df.astype(int)

Out[16]:
   col1  col2
1     1     0
2     0     1
3     1     1
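If the frame also has non-boolean columns, casting everything to int may fail or change data you didn't mean to touch. A possible sketch, assuming a mixed frame with a made-up `name` column, is to cast only the boolean columns and write the result straight to CSV:

```python
import pandas as pd

# Hypothetical mixed frame: one bool column and one string column
df = pd.DataFrame(
    {"col1": [True, False, True], "name": ["a", "b", "c"]},
    index=[1, 2, 3],
)

# Pick out the bool columns and cast only those to int
bool_cols = df.select_dtypes(include="bool").columns
out = df.astype({col: int for col in bool_cols})

# to_csv() with no path returns the CSV text as a string
csv_text = out.to_csv()
print(csv_text)
```

The cast can be chained directly into to_csv, so the original df is left unchanged.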

1 Comment

I prefer this solution over @Leb's solution; it is much more explicit.
