Create new columns based on unique values of values in pandas [closed]

Question

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 7 years ago.

I have rows that look like this:

zipcode   room_type
2011      bed
2012      sofa

Every row represents one Airbnb listing. I want to aggregate the data by counting how often each unique room_type value occurs. Every unique value gets its own column, and the data is grouped by zipcode. So the result would look something like this:

zipcode   bed   sofa    ground
1011      200   36      20
1012      720   45      89

How can I get this result with pandas?

Possible duplicate of pandas transform dataframe pivot table. (Comment by rpanai, Nov 21, 2018 at 12:03)

bdiamante · Accepted Answer · 2017-06-23 18:10:30Z

I've accomplished this using indexes and reshaping:

from pandas import DataFrame, Series

df = DataFrame({'zipcode': [20110, 20110, 20111, 20111, 20111], 'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa']})
df.set_index(['zipcode', 'room_type'], inplace=True)
df

zipcode room_type
  20110       bed
             sofa
  20111       bed
              bed
             sofa

# count the values and generate a new dataframe
df2 = DataFrame(df.index.value_counts(), columns=['count'])
df2.reset_index(inplace=True)
df2

            index   count
0    (20111, bed)       2
1    (20110, bed)       1
2   (20111, sofa)       1
3   (20110, sofa)       1

# split the tuple into new columns
df2[['zipcode', 'room_type']] = df2['index'].apply(Series)
df2.drop('index', axis=1, inplace=True)

# reshape 
df2.pivot(index='zipcode', columns='room_type', values='count') 

room_type   bed sofa
zipcode     
  20110       1    1
  20111       2    1
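
For comparison, the same count-then-reshape idea can be sketched more compactly with groupby, size, and unstack, which avoids the tuple-splitting step. This is a minimal sketch rather than part of the answer above; it reuses the same sample data and assumes that a zipcode/room_type pair with no rows should show a count of 0 (the `fill_value=0` choice is an assumption).

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'zipcode': [20110, 20110, 20111, 20111, 20111],
    'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa'],
})

# Count rows per (zipcode, room_type) pair, then move room_type into columns.
# fill_value=0 (an assumption) puts 0 where a pair never occurs.
counts = df.groupby(['zipcode', 'room_type']).size().unstack(fill_value=0)
print(counts)
```

The result is indexed by zipcode with one column per room type; calling `counts.reset_index()` would turn zipcode back into a regular column if needed.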

skt7 · Accepted Answer · 2018-06-05 23:22:10Z

First, group by the columns 'zipcode' and 'room_type' to get the count for each pair:

In [4]: df = df.groupby(['zipcode','room_type'])['room_type'].agg(['count']).reset_index()

In [5]: df
Out[5]: 
   zipcode room_type  count
0    20110       bed      1
1    20110      sofa      1
2    20111       bed      2
3    20111      sofa      1

Now use 'pivot_table' to turn each room type into its own column:

In [6]: df = df.pivot_table(values='count', columns='room_type', index='zipcode')

In [7]: df
Out[7]: 
room_type  bed  sofa
zipcode             
20110        1     1
20111        2     1

Remove the name of the columns axis:

In [8]: df.columns.name = None

In [9]: df
Out[9]: 
         bed  sofa
zipcode           
20110      1     1
20111      2     1

Finally, reset the index:

In [10]: df = df.reset_index()

In [11]: df
Out[11]: 
   zipcode  bed  sofa
0    20110    1     1
1    20111    2     1
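
The steps above can also be chained into a single expression. The sketch below is one possible way to do that, not the answer's own code. It uses a hypothetical extra zipcode (20112) that has no 'sofa' rows to show what happens with a missing combination, and it passes `fill_value=0` to `pivot_table` on the assumption that such gaps should read as 0 rather than NaN.

```python
import pandas as pd

# Hypothetical sample: zipcode 20112 has no 'sofa' listings.
df = pd.DataFrame({
    'zipcode': [20110, 20110, 20111, 20111, 20111, 20112],
    'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa', 'bed'],
})

result = (
    df.groupby(['zipcode', 'room_type'])['room_type']
      .agg(['count'])               # one 'count' value per (zipcode, room_type)
      .reset_index()
      .pivot_table(values='count', index='zipcode',
                   columns='room_type', fill_value=0)
      .rename_axis(None, axis=1)    # drop the 'room_type' label on the columns
      .reset_index()                # make zipcode a regular column again
)
print(result)
```

Because each zipcode/room_type pair appears only once after the groupby step, the default aggregation used by `pivot_table` leaves the counts unchanged.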

Arul Dhina · Accepted Answer · 2018-11-21 11:59:15Z

The crosstab approach, which I find easy to implement:

pd.crosstab(df.zipcode, df.room_type).reset_index()

This does the job.
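
As a self-contained illustration of the crosstab one-liner, here is a minimal sketch using the sample data from the earlier answers. The final line that clears the column-axis name is an optional extra, added here on the assumption that a flat header like the one in the question is wanted.

```python
import pandas as pd

df = pd.DataFrame({
    'zipcode': [20110, 20110, 20111, 20111, 20111],
    'room_type': ['bed', 'sofa', 'bed', 'bed', 'sofa'],
})

# crosstab counts how often each (zipcode, room_type) pair occurs;
# pairs that never occur are reported as 0.
result = pd.crosstab(df['zipcode'], df['room_type']).reset_index()

# Optional: drop the leftover 'room_type' label on the columns axis.
result.columns.name = None
print(result)
```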
