0

I have a dataframe with repetitive names of people

Input:

{'name': {0: 'John Smith', 1: 'John Smith', 2: 'John Smith', 3: 'John Doo', 4: 'John Doo', 5: 'John Doo'}, 'journal': {0: 'Journal1', 1: 'Journal2', 2: 'Journal2', 3: 'Journal1', 4: 'Journal2', 5: 'Journal2'}, 'is_editor_in_chief_2019': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2020': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2021': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editor_in_chief_2022': {0: 1.0, 1: nan, 2: nan, 3: 1.0, 4: nan, 5: nan}, 'is_editorial_board_member_2019': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editorial_board_member_2020': {0: nan, 1: nan, 2: nan, 3: nan, 4: nan, 5: nan}, 'is_editorial_board_member_2021': {0: nan, 1: nan, 2: 1.0, 3: nan, 4: nan, 5: 1.0}, 'is_editorial_board_member_2022': {0: nan, 1: 1.0, 2: nan, 3: nan, 4: 1.0, 5: nan}}

I want to collapse these rows and convert data from long to wide.

Expected output:

{'name': {0: 'John Smith', 1: 'John Doo'}, 'Journal1_is_editor_in_chief_2019': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2020': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2021': {0: nan, 1: nan}, 'Journal1_is_editor_in_chief_2022': {0: 1, 1: 1}, 'Journal1_is_editorial_board_member_2019': {0: nan, 1: nan}, 'Journal1_is_editorial_board_member_2020': {0: nan, 1: nan}, 'Journal1_is_editorial_board_member_2021': {0: 1, 1: 1}, 'Journal1_is_editorial_board_member_2022': {0: 1, 1: 1}, 'Journal2_is_editor_in_chief_2019': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2020': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2021': {0: nan, 1: nan}, 'Journal2_is_editor_in_chief_2022': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2019': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2020': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2021': {0: nan, 1: nan}, 'Journal2_is_editorial_board_member_2022': {0: nan, 1: nan}}

The closest solution I found is here:

df_out = df.set_index(['name', df.groupby(['name']).cumcount()+1]).unstack().sort_index(level=1, axis=1)
df_out.columns = df_out.columns.map('{0[0]}_{0[1]}'.format)
df_out.reset_index()

But it does not give me the ideal output.

0

1 Answer 1

1

You can use pivot_table:

df_out = df.pivot_table(index='name', columns='journal', dropna=False)
df_out.columns = df_out.columns.map('{0[1]}_{0[0]}'.format)
df_out.reset_index()

output:

         name  Journal1_is_editor_in_chief_2019  Journal2_is_editor_in_chief_2019  Journal1_is_editor_in_chief_2020  Journal2_is_editor_in_chief_2020  Journal1_is_editor_in_chief_2021  Journal2_is_editor_in_chief_2021  Journal1_is_editor_in_chief_2022  Journal2_is_editor_in_chief_2022  Journal1_is_editorial_board_member_2019  Journal2_is_editorial_board_member_2019  Journal1_is_editorial_board_member_2020  Journal2_is_editorial_board_member_2020  Journal1_is_editorial_board_member_2021  Journal2_is_editorial_board_member_2021  Journal1_is_editorial_board_member_2022  Journal2_is_editorial_board_member_2022
0    John Doo                               NaN                               NaN                               NaN                               NaN                               NaN                               NaN                               1.0                               NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      1.0                                      NaN                                      1.0
1  John Smith                               NaN                               NaN                               NaN                               NaN                               NaN                               NaN                               1.0                               NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      NaN                                      1.0                                      NaN                                      1.0
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.