I have the following dataframe

import pandas as pd

data = {
    'a_index': [55, 72, 112, 55, 53, 100, 89],
    'make': ['TY', 'FD', 'TA', 'HA', 'MA', 'BW', 'VN'],
    'p_index': [120, 70, 120, 128, 180, 172, 150],
    'score': ['2.3,1.3,3.2,3.4', '2.7,4.3, 4.2,3.4', '2.3,4.3, 4.2,3.4', '2.3,4.3, 4.2,3.4', '1.3,5.3, 7.2,3.4', '2.3,4.3, 4.2,3.4', '2.3,4.3,4.2,3.4'],
}
df = pd.DataFrame(data,
                  index=['NK', 'JN', 'NA', 'PP', 'DK', 'HA', 'CK'])
df

which gives me

    a_index make  p_index  score
NK  55      TY    120      2.3,1.3,3.2,3.4
JN  72      FD    70       2.7,4.3, 4.2,3.4
NA  112     TA    120      2.3,4.3, 4.2,3.4
PP  55      HA    128      2.3,4.3, 4.2,3.4
DK  53      MA    180      1.3,5.3, 7.2,3.4
HA  100     BW    172      2.3,4.3, 4.2,3.4
CK  89      VN    150      2.3,4.3,4.2,3.4

What is the easiest way to get the following dataframe from this one?

    a_index make  p_index  score              score_1  score_2  score_3  score_4
NK  55      TY    120      2.3,1.3,3.2,3.4    2.3      1.3      3.2      3.4
JN  72      FD    70       2.7,4.3, 4.2,3.4   2.7      4.3      4.2      3.4
NA  112     TA    120      2.3,4.3, 4.2,3.4   2.3      4.3      4.2      3.4
PP  55      HA    128      2.3,4.3, 4.2,3.4   2.3      4.3      4.2      3.4
DK  53      MA    180      1.3,5.3, 7.2,3.4   1.3      5.3      7.2      3.4
HA  100     BW    172      2.3,4.3, 4.2,3.4   2.3      4.3      4.2      3.4
CK  89      VN    150      2.3,4.3,4.2,3.4    2.3      4.3      4.2      3.4
  • Instead of 2.3,4.3, 4,2,,3.4, shouldn't it be 2.3,4.3, 4,2, 3.4? Commented Mar 22, 2020 at 18:11
  • Yes, that is right. Corrected. Commented Mar 22, 2020 at 18:20
  • @kederrac, Ukrainian-serge and Samira Kumar, thank you for your time and for answering my question. I appreciate it. All three work with the latest pandas version. I have upvoted all three answers but can accept only one. Commented Mar 23, 2020 at 18:18

3 Answers

4

You can use:

pd.concat(
    [
        df, 
        df['score'].str.split(',', expand=True).rename(
            lambda x: f'score_{x}',axis='columns')
    ], axis=1)
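
If the new columns should be numbered from 1 and hold numbers rather than strings, as in the desired output in the question, a variant along these lines may help. This is only a sketch and not part of the original answer: it assumes every row has the same number of comma-separated values, uses a cut-down two-row frame for brevity, and the whitespace stripping and numeric conversion are additions for illustration.

```python
import pandas as pd

# Cut-down example frame (assumption: same layout as the question's df).
df = pd.DataFrame(
    {
        'make': ['TY', 'FD'],
        'score': ['2.3,1.3,3.2,3.4', '2.7,4.3, 4.2,3.4'],
    },
    index=['NK', 'JN'],
)

# Split on commas into one column per value (columns are labelled 0, 1, 2, ...).
parts = df['score'].str.split(',', expand=True)

# Strip stray spaces such as ' 4.2' and convert each piece to a number.
parts = parts.apply(lambda col: pd.to_numeric(col.str.strip()))

# Number the new columns from 1: score_1, score_2, ...
parts.columns = ['score_{}'.format(i + 1) for i in parts.columns]

result = pd.concat([df, parts], axis=1)
print(result)
```

Without the conversion step the new columns stay as strings, which is easy to miss because they print the same way.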

4 Comments

You could replace the rename section with .add_prefix('score_').
But it is more readable using rename with axis='columns'. There are many ways to do it, and I guess it is subjective.
Great. This worked perfectly with the latest version of pandas. I liked your solution. But how can we write the same functionality in pandas 0.17? This is the version of pandas I use in prod.
@Ayalew Thanks, but as I was saying, this is a task for another question.
3

You can try using this.

df['score'].str.split(',').apply(pd.Series).rename(columns = {0:'score_1',1:'score_2',2:'score_3',3:'score_4'})

    score_1 score_2 score_3 score_4
NK  2.3 1.3 3.2 3.4
JN  2.7 4.3 4.2 3.4
NA  2.3 4.3 4.2 3.4
PP  2.3 4.3 4.2 3.4
DK  1.3 5.3 7.2 3.4
HA  2.3 4.3 4.2 3.4
CK  2.3 4.3 4.2 3.4

Then merge it back into the original dataframe.
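
One way to do that last step, as a rough sketch: since the split frame keeps the original index, `DataFrame.join` can line the rows up. The two-row frame and the names `scores` and `merged` are just for illustration, and the new columns still hold strings at this point.

```python
import pandas as pd

# Cut-down example frame (assumption: same layout as the question's df).
df = pd.DataFrame(
    {
        'make': ['TY', 'FD'],
        'score': ['2.3,1.3,3.2,3.4', '2.7,4.3, 4.2,3.4'],
    },
    index=['NK', 'JN'],
)

# Same split-and-rename step as in the answer.
scores = (
    df['score'].str.split(',')
    .apply(pd.Series)
    .rename(columns={0: 'score_1', 1: 'score_2', 2: 'score_3', 3: 'score_4'})
)

# Both frames share the same index, so join aligns them row by row.
merged = df.join(scores)
print(merged)
```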

3

Try:

exploded = df.score.apply(lambda x: pd.Series(x.split(',')))       # split each string into columns

exploded.columns = ['score_'+str(col) for col in exploded.columns] # rename columns

df = pd.concat([df, exploded], axis=1)                             # concat to original df
print(df)

    a_index make  p_index             score score_0 score_1 score_2 score_3
NK       55   TY      120   2.3,1.3,3.2,3.4     2.3     1.3     3.2     3.4
JN       72   FD       70  2.7,4.3, 4.2,3.4     2.7     4.3     4.2     3.4
NA      112   TA      120  2.3,4.3, 4.2,3.4     2.3     4.3     4.2     3.4
PP       55   HA      128  2.3,4.3, 4.2,3.4     2.3     4.3     4.2     3.4
DK       53   MA      180  1.3,5.3, 7.2,3.4     1.3     5.3     7.2     3.4
HA      100   BW      172  2.3,4.3, 4.2,3.4     2.3     4.3     4.2     3.4
CK       89   VN      150   2.3,4.3,4.2,3.4     2.3     4.3     4.2     3.4

2 Comments

This is great. Do you know how to do the same with pandas 0.17?
Sorry I didn't see your version of Python. When I have time I'll create a Python 2.7 env and post solution.
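
For what it's worth, here is a sketch that leans only on very basic pandas features (no f-strings, no `axis=` argument to `rename`, no `expand=True`), so it might be a starting point for an older install. It has not been tested on pandas 0.17, so treat that compatibility as an assumption. It also assumes every row splits into the same number of pieces; the two-row frame and the names `pieces`, `split_scores` and `combined` are made up for the example.

```python
import pandas as pd

# Cut-down example frame (assumption: same layout as the question's df).
df = pd.DataFrame(
    {
        'make': ['TY', 'FD'],
        'score': ['2.3,1.3,3.2,3.4', '2.7,4.3, 4.2,3.4'],
    },
    index=['NK', 'JN'],
)

# Split each string in plain Python and strip stray spaces.
pieces = [[p.strip() for p in s.split(',')] for s in df['score']]

# Build a frame from the lists, reusing the original index so rows line up.
split_scores = pd.DataFrame(pieces, index=df.index)
split_scores.columns = ['score_' + str(i + 1) for i in split_scores.columns]

combined = pd.concat([df, split_scores], axis=1)
print(combined)
```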
