Create multiple columns from a data frame using python

Question

I have a CSV file, shown below:

I am trying to create a column for the titles and a separate column for each genre in genre_and_votes, so that the output looks something like this:

My code is given below:

import pandas as pd
df = pd.read_csv("C:\\Users\\mysite\\Desktop\\practice\\book1.csv")
#print(df)
print(df['Title'].values, df['genre_and_votes'].values)

The code above creates a DataFrame, but I have not been able to create columns for each genre and its votes. I am not sure how to do this and need help.

@Corralien thanks! I provided another solution, I hope you'll like it ;)
– mozway, Commented Sep 14, 2021 at 14:28
@mozway thanks for your code. I am new to coding, so regex feels somewhat complicated for a beginner like me. Is there an easier way to solve this?
– Robo Bot, Commented Sep 14, 2021 at 14:38
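
For anyone else who finds regex heavy, here is a minimal regex-free sketch using a plain loop. It assumes every item looks like "genre name, space, vote count" and that items are separated by commas; the sample rows are borrowed from the answer's example, and the names rows and wide are just illustrative.

```python
import pandas as pd

# Sample data in the same shape as the question's CSV (assumed layout).
df = pd.DataFrame({
    'title': ['Inner Circle', 'A Time to Embrace'],
    'genre_and_votes': ['Young adult 161, Mystery 45, Romance 32',
                        'Christian Fiction 114, Romance 16'],
})

rows = []
for title, cell in zip(df['title'], df['genre_and_votes']):
    row = {'title': title}
    for item in cell.split(','):
        # 'Young adult 161' -> genre 'Young adult', votes '161'
        genre, votes = item.strip().rsplit(' ', 1)
        row[genre] = int(votes)
    rows.append(row)

# Genres missing for a title become NaN when the dicts are combined.
wide = pd.DataFrame(rows)
print(wide)
```

Each dict becomes one row, so no pivot is needed; the cost is that the loop runs in Python rather than inside pandas, which should only matter for large files.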

Corralien · Accepted Answer · 2021-09-14 14:12:44Z

2

Use str.split and str.rsplit before pivoting your dataframe, then concatenate the new columns with your original dataframe:

Set up a minimal reproducible example (MRE):

import pandas as pd

df = pd.DataFrame({'title': ['Inner Circle', 'A Time to Embrace'],
                   'genre_and_votes': ['Young adult 161, Mystery 45, Romance 32',
                                       'Christian Fiction 114, Romance 16']})
print(df)


# Output
               title                          genre_and_votes
0       Inner Circle  Young adult 161, Mystery 45, Romance 32
1  A Time to Embrace        Christian Fiction 114, Romance 16

Code:

# Split on ', ' so genre names carry no leading space, put one item per row,
# then split off the trailing vote count (n must be a keyword in pandas 2.0+).
out = df['genre_and_votes'].str.split(', ').explode() \
                           .str.rsplit(' ', n=1, expand=True) \
                           .pivot(columns=0, values=1)

df = pd.concat([df.drop(columns='genre_and_votes'), out], axis=1)

Final output

>>> df
               title Christian Fiction Mystery Romance Young adult
0       Inner Circle               NaN      45      32         161
1  A Time to Embrace               114     NaN      16         NaN
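
Note that the vote counts come out of the string split as text, not numbers. If you want to do arithmetic on them, a conversion step is needed; the following is a small sketch of one way to do that with pd.to_numeric, using the same example data (the name totals is only for illustration).

```python
import pandas as pd

df = pd.DataFrame({'title': ['Inner Circle', 'A Time to Embrace'],
                   'genre_and_votes': ['Young adult 161, Mystery 45, Romance 32',
                                       'Christian Fiction 114, Romance 16']})

out = (df['genre_and_votes'].str.split(', ').explode()
         .str.rsplit(' ', n=1, expand=True)
         .pivot(columns=0, values=1))

# The pivoted cells are strings such as '45'; convert each column to numbers.
out = out.apply(pd.to_numeric)

# Numeric columns can now be summed; missing genres are skipped.
totals = out.sum(axis=1)
print(totals)
```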


4 Comments

Robo Bot Over a year ago

@Corralien, I am getting the error below:

  File "G:\conda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 101, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/python/Desktop/book1.py", line 13, in <module>
    out = df['genre_and_votes'].str.split(',').explode() \
  File "G:\conda3\lib\site-packages\pandas\core\generic.py", line 3614, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'explode'

Corralien Over a year ago

Are you using a version of pandas older than 0.25.0? Series.explode was added in 0.25.0, so you have to update, for example with conda update pandas.

Robo Bot Over a year ago

@Corralien: It's 0.22.0.

Corralien Over a year ago

It can be hard to get help with an older version, and you will be limited without the newer features of pandas. Versions before 1.0.0 were subject to many changes that did not preserve compatibility with older versions.
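
If upgrading is not possible right away, the explode step can be imitated with basic Python. The sketch below is only an illustration: explode_compat is a hypothetical helper name, and it has been written to avoid newer pandas APIs but has not been checked against 0.22.0 itself.

```python
import pandas as pd

df = pd.DataFrame({'title': ['Inner Circle', 'A Time to Embrace'],
                   'genre_and_votes': ['Young adult 161, Mystery 45, Romance 32',
                                       'Christian Fiction 114, Romance 16']})


def explode_compat(lists):
    """Hypothetical helper: one row per list element, repeating the index label."""
    values = [item for lst in lists for item in lst]
    labels = [label for label, lst in zip(lists.index, lists) for _ in lst]
    return pd.Series(values, index=labels, name=lists.name)


lists = df['genre_and_votes'].str.split(', ')

# Use the built-in method when this pandas has it, else fall back to the helper.
if hasattr(pd.Series, 'explode'):
    items = lists.explode()
else:
    items = explode_compat(lists)

print(pd.__version__)
print(items)
```

Checking with hasattr rather than parsing the version string keeps the fallback working even if the version format is unusual.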

mozway · Accepted Answer · 2021-09-14 14:28:30Z

1

Here is a solution using extractall, a regex with named capturing groups, and pivot:

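# Raw string so that \d is not read as a string escape. The genre must start
# with a character that is neither a space nor a comma, so names have no
# leading space.
(df.join(df['genre_and_votes']
           .str.extractall(r'(?P<genre>[^\s,][^,]*) (?P<value>\d+)')
           .droplevel('match'))
   .pivot(index='title', columns='genre', values='value')
)

output:

genre             Christian Fiction Mystery Romance Young adult
title
A Time to Embrace               114     NaN      16         NaN
Inner Circle                    NaN      45      32         161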
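
To see what the named groups capture, it can help to try a pattern along these lines on a single string with Python's built-in re module first. This is just an illustrative sketch; the variable names are made up, and the strip() call guards against a stray leading space in the genre.

```python
import re

# genre: starts with a non-space, non-comma character, then anything but commas
# value: the run of digits after the last space of the item
pattern = re.compile(r'(?P<genre>[^\s,][^,]*) (?P<value>\d+)')

text = 'Young adult 161, Mystery 45, Romance 32'

pairs = [(m.group('genre').strip(), int(m.group('value')))
         for m in pattern.finditer(text)]
print(pairs)
```

str.extractall does the same thing for every row of the column at once, returning one row per match with the group names as column names.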


Rafał Przetakowski · Accepted Answer · 2021-09-14 13:55:01Z

0

pandas has a "pivot" function for reshaping a long table into a wide one: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pivot.html
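
As a rough illustration of what pivot does, assuming the data has already been split into one row per title and genre (the long_df frame and its votes column are made up for this sketch):

```python
import pandas as pd

# Long format: one row per (title, genre) pair.
long_df = pd.DataFrame({
    'title': ['Inner Circle', 'Inner Circle', 'A Time to Embrace'],
    'genre': ['Mystery', 'Romance', 'Romance'],
    'votes': [45, 32, 16],
})

# Wide format: one row per title, one column per genre.
wide = long_df.pivot(index='title', columns='genre', values='votes')
print(wide)
```

pivot on its own does not split the genre_and_votes strings, so that step still has to happen first.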
