I'm trying to concatenate two columns, and the second column has a few None (NoneType) values. When I concatenate the two columns, the rows where the second column is None come out as "NaN" in the resulting column.

I tried to look around to see if I could find questions on this behavior, but I wasn't able to.

Here's what the table looked like before concatenation:

enter image description here

Here's my code to join the two columns after my modifications:

new_table["name"] = new_table[0] + new_table[1]

Which results in this:

enter image description here

Why does concatenation result in "NaN", and how can I fix it?

  • What's the expected output? Commented Jan 6, 2020 at 0:45
  • Please share code and data as text in the post itself, not as images. See: meta.stackoverflow.com/q/303812/11301900. Have you read the Pandas docs? As for "and how can I fix it?": there's nothing to fix, NaN has a purpose. Speaking of which, why were you using None in the DataFrame? Commented Jan 6, 2020 at 1:37
  • Noted for the images. I have read the pandas docs here: pandas.pydata.org/pandas-docs/stable/getting_started/…, but I couldn't find an explanation for this behavior. NaN has a function, but it is unclear why a string '+' a NoneType results in NaN. Since the result is NaN, rather than just the data from column one as a summing operation would suggest, I wanted to know how to "fix" it. Commented Jan 6, 2020 at 3:45

2 Answers

The simplest fix would be to replace None with an empty string:

new_table["name"] = new_table[0] + new_table[1].fillna('')
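As a minimal, self-contained sketch of that fix: the two-row table below is made-up stand-in data (the original table was only posted as an image), with None in the second column of one row.

```python
import pandas as pd

# Hypothetical stand-in for the table in the question.
new_table = pd.DataFrame([["K.", "Mbappe"], ["N.", None]])

# Replace missing values in the second column with an empty string
# before concatenating, so every row adds str + str.
new_table["name"] = new_table[0] + new_table[1].fillna("")

print(new_table["name"].tolist())  # ['K.Mbappe', 'N.']
```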

2 Comments

Thank you, this solves the problem. But is there a reason why concatenation between NoneType values and strings works this way?
Python does not allow concatenation between str and any other types (including NoneType). Pandas catches the TypeError and falls back to NaN. The burden is on you to tell Pandas how to deal with missing values, to avoid silent failure.
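The difference described in that comment can be seen in a small sketch. The Series names and values here are illustrative, not taken from the question's data.

```python
import pandas as pd

# Plain Python refuses to add a str and None.
try:
    "N." + None
except TypeError as exc:
    print("TypeError:", exc)

# pandas treats None as a missing value, so the element-wise
# result for that row is missing as well instead of raising.
first = pd.Series(["K.", "N."])
last = pd.Series(["Mbappe", None])
combined = first + last

print(combined.isna().tolist())  # [False, True]
```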

import numpy as np
import pandas as pd

df = pd.DataFrame([["K.", "Mbappe"], ["N.", np.nan]])
print(df)

Output:

    0       1  
0  K.  Mbappe  
1  N.     NaN  


df['Name'] = df[0].str.cat(df[1], na_rep='')
print(df)

Output:

    0       1      Name
0  K.  Mbappe  K.Mbappe
1  N.     NaN        N.

This is the same approach that ypnos proposed, but using the Series.str.cat method instead.

2 Comments

A helpful addition to my answer, as str.cat is more powerful than plain str concatenation.
Thank you @Ponx. Is there a reason str.cat is more powerful?
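One possible reading of "more powerful", offered as a sketch and not as the commenter's own explanation: str.cat accepts a sep argument for a separator and an na_rep argument for missing values in a single call, which plain + does not.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([["K.", "Mbappe"], ["N.", np.nan]])

# sep inserts a separator between the pieces; na_rep stands in
# for missing values, so the separator is still added for row 1.
with_sep = df[0].str.cat(df[1], sep=" ", na_rep="")
print(with_sep.tolist())  # ['K. Mbappe', 'N. ']

# Without na_rep, a missing value on either side makes the result missing.
default = df[0].str.cat(df[1], sep=" ")
print(default.isna().tolist())  # [False, True]
```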
