1

I'm new to Python. In my project I need to concatenate multiple columns of a pandas DataFrame to create a derived column. My DataFrame contains a few columns that hold only True and False values. I'm using the following code to do the concatenation:

df_input["combined"] = [' '.join(row) for row in df_input[df_input.columns[0:]].values]

I get the following error when running the code:

TypeError: sequence item 3: expected str instance, bool found

Could an expert please help me solve the problem?

Thanks in advance.

  • Do you have a sample input and expected output you could add to this question? You will receive better help if you follow the MCVE guidelines. Commented Jul 26, 2017 at 17:13

2 Answers

2

Let's try astype:

df_input["combined"] = [' '.join(row.astype(str)) for row in df_input[df_input.columns[0:]].values]
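The original error comes from `str.join`, which accepts only strings, while the rows returned by `.values` still carry ints and bools. Here is a minimal sketch of the fix on a made-up three-column frame; the column names and data are assumptions for illustration, not the asker's actual `df_input`.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's df_input.
df_input = pd.DataFrame({
    "col1": [1, 4, 7],
    "col2": ["hello", "world", "!"],
    "col3": [True, False, True],
})

# With mixed column types, .values yields one object-dtype NumPy array per row.
# astype(str) converts every element to a string, so ' '.join stops raising
# "TypeError: sequence item ...: expected str instance, bool found".
combined = [" ".join(row.astype(str)) for row in df_input.values]
df_input["combined"] = combined

print(df_input["combined"].tolist())
```

Building the list before assigning the new column keeps the derived column out of its own input.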

1

You can cast the bool columns with astype(str) and use a vectorized expression to concatenate the columns, as follows:

from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"
import pandas as pd

st = """
col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")

print("my sample dataframe")
print(df.head())

print("current columns data types")
print(df.dtypes)

print("combining all columns with mixed datatypes") 
df["combined"] = df["col1"].astype(str)+" "+df["col2"]+ " " +df["col3"].astype(str)

print("here's how the data looks now")
print(df.head())

print("here are the new columns datatypes")
print(df.dtypes)

The output of the script:

my sample dataframe
   col1   col2   col3
0     1  hello   True
1     4  world  False
2     7      !   True
current columns data types
col1     int64
col2    object
col3      bool
dtype: object
combining all columns with mixed datatypes
here's how the data looks now
   col1   col2   col3       combined
0     1  hello   True   1 hello True
1     4  world  False  4 world False
2     7      !   True       7 ! True
here are the new columns datatypes
col1         int64
col2        object
col3          bool
combined    object
dtype: object

As you can see, the new combined column contains the concatenated data.
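The same vectorized concatenation can also be written with `Series.str.cat`, which takes the separator once instead of repeating `" "` between every pair of columns. This is an alternative sketch and not part of the original answer; the frame below rebuilds the sample data directly so the snippet runs on its own.

```python
import pandas as pd

# Same sample data as above, built directly instead of parsed from text.
df = pd.DataFrame({
    "col1": [1, 4, 7],
    "col2": ["hello", "world", "!"],
    "col3": [True, False, True],
})

# Non-string columns still need astype(str) first; str.cat then joins the
# first Series with the others, element by element, using sep.
df["combined"] = df["col1"].astype(str).str.cat(
    [df["col2"], df["col3"].astype(str)], sep=" "
)

print(df["combined"].tolist())
```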

Dynamic concatenation

To perform the concatenation dynamically, here's how you should edit my previous example:

from io import StringIO  # Python 3; on Python 2 this was "from StringIO import StringIO"
import pandas as pd

st = """
col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")

print("my sample dataframe")
print(df.head())

print("current columns data types")
print(df.dtypes)

print("combining all columns with mixed datatypes") 
#df["combined"] = df["col1"].astype(str)+" "+df["col2"]+ " " +df["col3"].astype(str)

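# Capture the column names before "combined" exists, so it is not joined into itself.
all_columns = list(df.columns)
df["combined"] = ""

for column_name in all_columns:
    print("current column {column_name}".format(column_name=column_name))
    # Cast each column to str so ints and bools can be concatenated.
    df["combined"] = df["combined"] + " " + df[column_name].astype(str)

# Each value starts with the separator added before the first column; drop it.
df["combined"] = df["combined"].str[1:]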

print("here's how the data looks now")
print(df.head())

print("here are the new columns datatypes")
print(df.dtypes)
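If the per-column loop feels heavy, a loop-free variant is to cast the whole frame to strings and join each row. This is a sketch of an alternative, not the method used above; `source_columns` is a name introduced here for illustration.

```python
from io import StringIO
import pandas as pd

st = """
col1|col2|col3
1|hello|True
4|world|False
7|!|True
"""
df = pd.read_csv(StringIO(st), sep="|")

# Record the input columns first so the derived column is never part of its own input.
source_columns = list(df.columns)

# astype(str) converts every cell; apply(..., axis=1) passes each row to " ".join.
df["combined"] = df[source_columns].astype(str).apply(" ".join, axis=1)

print(df["combined"].tolist())
```

Because `source_columns` is fixed before the assignment, re-running the last statement gives the same result instead of growing the string.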

2 Comments

Thanks @Scott Boston and @MedAli. But I need to concatenate the columns dynamically, so I'm using the code Scott suggested: df["combined"] = [' '.join(row.astype(str)) for row in df[df.columns[0:]].values]. The problem is that I'm getting repeated output in the combined column. For example, the first row of my output looks like 1 hello True 1 hello True when I use MedAli's data.
@PythonLearner check my updated answer to handle dynamic concatenation.
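
One plausible cause of the repeated output (an assumption, since the comment does not show the full session) is that the list comprehension ran on a frame that already had a combined column, so `df.columns[0:]` included it. The sketch below reproduces that situation and shows one way around it; `source_columns` is a hypothetical helper name.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 4, 7],
    "col2": ["hello", "world", "!"],
    "col3": [True, False, True],
})

# Simulate an earlier run that already created the derived column.
df["combined"] = ["1 hello True", "4 world False", "7 ! True"]

# Joining every column now pulls in "combined" as well, so the text repeats.
repeated = [" ".join(row.astype(str)) for row in df[df.columns[0:]].values]

# Excluding the derived column from the inputs avoids the repetition.
source_columns = [c for c in df.columns if c != "combined"]
fixed = [" ".join(row.astype(str)) for row in df[source_columns].values]

print(repeated[0])
print(fixed[0])
```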
