I have two datasets: one with cancer-positive patients (df_pos) and the other with cancer-negative patients (df_neg).

df_pos

    id
0   123
1   124
2   125

df_neg

    id
0   234
1   235
2   236

I want to combine these datasets into one, with an extra column indicating whether the patient has cancer (yes or no).

Here is my desired outcome:

    id outcome
0  123     yes
1  124     yes
2  125     yes
3  234      no
4  235      no
5  236      no

What would be a smart way to combine them?

Any suggestions would be appreciated. Thanks!

2 Answers

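Use pandas.DataFrame.assign with pandas.concat (DataFrame.append also worked on older versions, but it was deprecated in pandas 1.4 and removed in 2.0):

>>> pd.concat([df_pos.assign(outcome='Yes'), df_neg.assign(outcome='No')], ignore_index=True)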
    id outcome
0  123     Yes
1  124     Yes
2  125     Yes
3  234      No
4  235      No
5  236      No
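
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frames from the question and uses lowercase labels to match the requested output. It assumes pandas is installed; the name `combined` is only illustrative.

```python
import pandas as pd

# Sample frames from the question
df_pos = pd.DataFrame({"id": [123, 124, 125]})
df_neg = pd.DataFrame({"id": [234, 235, 236]})

# assign() returns labelled copies; concat stacks them with a fresh 0..n-1 index
combined = pd.concat(
    [df_pos.assign(outcome="yes"), df_neg.assign(outcome="no")],
    ignore_index=True,
)
print(combined)
```

Because assign() works on copies, df_pos and df_neg keep their original single column afterwards.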
1 Comment

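That's perfect, @Sayandip. Thanks!

Add the outcome column to each frame, then concatenate them with pandas.concat:

import pandas as pd

# Flag each frame in place (booleans rather than yes/no strings)
df_pos['outcome'] = True
df_neg['outcome'] = False

# Stack the frames and renumber the index from 0
df = pd.concat([df_pos, df_neg]).reset_index(drop=True)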
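
This answer stores booleans, while the question asks for yes/no. One possible way to bridge that, sketched here with the sample data from the question, is to map the flag to labels after concatenating. The mapping dictionary is an assumption about the desired labels.

```python
import pandas as pd

# Sample frames from the question
df_pos = pd.DataFrame({"id": [123, 124, 125]})
df_neg = pd.DataFrame({"id": [234, 235, 236]})

# Same steps as the answer above: flag each frame, then concatenate
df_pos["outcome"] = True
df_neg["outcome"] = False
df = pd.concat([df_pos, df_neg]).reset_index(drop=True)

# Convert the boolean flag to the yes/no labels from the question
df["outcome"] = df["outcome"].map({True: "yes", False: "no"})
print(df)
```

Keeping the column boolean may still be preferable for filtering (for example `df[df["outcome"]]`), so the mapping step is optional.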
