0

I have a dataframe called df that looks like this:

Provider fid pid datetime score system
CHE-229 2bfc9a62 2f43d557 2021-09-26 -3.0 SOFA
CHE-229 78d5d845 88c59d92 2021-09-12 -4.0 SAPS

I would like to create a new column specific to the values from system. e.g. I want to create a new column called SOFA and another column called SAPS with their respective scores in their table.

The output I want is:

Provider fid pid datetime SOFA SAPS
CHE-229 2bfc9a62 2f43d557 2021-09-26 -3.0
CHE-229 78d5d845 88c59d92 2021-09-12 -4.0

2 Answers 2

1

You can get this with an iterative procedure using numpy.where() to choose the value depending on a condition, and then dropping the original columns:

for sys in df["system"].unique():
    df[sys] = np.where(df["system"] == sys, df["score"], None)
df = df.drop(columns=["system", "score"])
Sign up to request clarification or add additional context in comments.

Comments

0
df.pivot(df.columns[:-2], 'system', 'score').fillna('').reset_index()

system Provider       fid       pid    datetime SAPS SOFA
0       CHE-229  2bfc9a62  2f43d557  2021-09-26      -3.0
1       CHE-229  78d5d845  88c59d92  2021-09-12 -4.0     

If you want them as numeric, then you can remove the fillna() part or even fill the nan with 0. ie .fillna(0)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.