Count of different objects in a column and plotting the data

Question

Total newbie to Python.

I have an EV car dataset that I have read from a CSV file. I want to count the number of sales for each brand (the column name is "Make") and then plot the counts against each other. The dataset contains approximately 105,000 rows with 10 different brand names. For example, Tesla could occur in 40,000 rows, VW in 30,000 rows, and Kia in 35,000 rows.

According to .dtypes, the data in every column is of type object; see the screenshot.

I think I have gotten the dataset to the required state. I have done some of the basics: created a DataFrame, sorted the data, checked for duplicates, and removed blank data.

My first few lines of code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('seaborn')

This is as far as I have gotten from the analysis point of view:

brand_sales = EVs_by_Population.groupby('Make')
brand_sales.plot(kind='bar')
plt.show()

Nothing happens. Thank you.

[Screenshots: a sample of the dataset (not sorted, but you get the idea) and the column data types]

I think you are looking for either a seaborn countplot, if you don't mind using seaborn, or, if it needs to be a matplotlib plot, value_counts as explained here.
– Redox, Commented May 12, 2023 at 16:04
I tried this but it still does not work:

plt.figure(figsize = (15, 5))
ax = sns.countplot(EVs_by_Population = EVs_by_Population, x='Make', order = EVs_by_Population.groupby('Make')['Model'].count().sort_values(ascending = False).index, orient = 'v')
plt.xticks(rotation = 45)
labels = EVs_by_Population.groupby('Make')['Model'].count().sort_values(ascending = False).values
# plt.bar_label(ax.containers[0], labels = labels)
plt.tight_layout()
plt.show()
– dcgray, Commented May 12, 2023 at 20:45
The first part of the error:

ValueError Traceback (most recent call last)
<ipython-input-44-12ddfb2b700f> in <module>
      2 plt.figure(figsize = (15, 5))
      3
----> 4 ax = sns.countplot(EVs_by_Population = EVs_by_Population, x='Make', order = EVs_by_Population.groupby('Make')['Model'].count().sort_values(ascending = False).index, orient = 'v')
      5
      6 # Rotating the x data labels (Vehicle Manufacturer) by 45 degrees and
– dcgray, Commented May 12, 2023 at 20:47

Redox · Accepted Answer · 2023-05-13 13:25:22Z


I think you are making this more complicated than it needs to be by grouping the data. Here is an example; please see if it is what you need.
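As a rough illustration of why grouping alone does not produce counts: groupby on its own only defines the groups, and nothing is counted until an aggregation such as size() is applied. The small `evs` frame below is a made-up stand-in for the question's data (its rows are assumptions, not the real dataset); `value_counts()` on the column gives the same numbers more directly.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame; the rows are invented.
evs = pd.DataFrame({
    "Make": ["Tesla", "Tesla", "Kia", "VW", "Tesla", "Kia"],
    "Model": ["Model 3", "Model Y", "Niro", "ID.4", "Model S", "EV6"],
})

# groupby() returns a lazy groupby object, not a table of counts.
grouped = evs.groupby("Make")
print(type(grouped).__name__)

# Counting needs an aggregation on top of the grouping...
counts_from_groupby = grouped.size().sort_values(ascending=False)

# ...or, more simply, value_counts() on the column itself.
counts_direct = evs["Make"].value_counts()
print(counts_direct)
```

Either Series can then be passed to a bar plot; the grouping step is optional.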

DATA

This creates a simple one-column DataFrame with a few car brands.

import pandas as pd
data=['Volvo']*10 + ['Tesla']*5 + ['BMW']*3 + ['Jaguar']*8
df=pd.DataFrame({'Make':data})

Plotting - Seaborn

import seaborn as sns
sns.countplot(data=df, x='Make', order = df['Make'].value_counts().index)

Output plot

Plotting - Matplotlib

import matplotlib.pyplot as plt
counts = df["Make"].value_counts().sort_values(ascending=False)
plt.bar(counts.index, counts.values)
plt.show()

Output plot
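If you also want the count printed on top of each bar (as the commented-out bar_label line in the question's comments suggests), a minimal sketch along these lines might help. It reuses the toy data from above; the figure size and axis label text are arbitrary choices, and `Axes.bar_label` needs matplotlib 3.4 or newer.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same toy data as in the example above.
data = ['Volvo']*10 + ['Tesla']*5 + ['BMW']*3 + ['Jaguar']*8
df = pd.DataFrame({'Make': data})

# value_counts() already sorts from most to least frequent.
counts = df['Make'].value_counts()

fig, ax = plt.subplots(figsize=(8, 4))
bars = ax.bar(counts.index.tolist(), counts.values)

# Write each bar's height above it (requires matplotlib >= 3.4).
ax.bar_label(bars)

ax.set_xlabel('Make')
ax.set_ylabel('Number of rows')
ax.tick_params(axis='x', labelrotation=45)
fig.tight_layout()
```

In a script, call plt.show() afterwards to display the figure; in a notebook it is usually rendered automatically.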

Another attempt with your data

Here is the 5-row sample of your data that you provided...

Running this code...

import seaborn as sns
sns.countplot(data=EVs_by_Population, x='Make', order = EVs_by_Population['Make'].value_counts().index)

...will give you this plot. Is this not what you are seeing?
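For completeness, here is a sketch of the same counting applied to a frame that was read from CSV, without building a second DataFrame. The real file was not shared, so the CSV text, its rows, and the clean-up steps (dropping rows with a missing Make, stripping stray spaces) are assumptions made for illustration only.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for the real file (invented rows).
csv_text = """Make,Model
Tesla,Model 3
Kia,Niro
Tesla,Model Y
,Leaf
VW,ID.4
Kia ,EV6
Tesla,Model S
"""

# With a real file this would be pd.read_csv("<path to your csv>").
evs = pd.read_csv(io.StringIO(csv_text))

# Drop rows with no brand and remove stray whitespace around brand names,
# so that "Kia " and "Kia" are counted as the same brand.
evs = (
    evs.dropna(subset=["Make"])
       .assign(Make=lambda d: d["Make"].str.strip())
)

# Count rows per brand directly on the original frame.
counts = evs["Make"].value_counts()
print(counts)
```

The index of `counts` can be passed as the `order` argument of sns.countplot, exactly as in the call above.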

edited May 13, 2023 at 13:25

answered May 13, 2023 at 3:17

Redox


13 Comments

dcgray Over a year ago

Thanks a million for your help, but the first block of code did not work. I assumed that I should replace data with the name of my DataFrame (EVs_by_Population) and replace "df" with a new name in my code. This is what I did: EVs_by_Brand=pd.DataFrame({'Make':EVs_by_Population})

dcgray Over a year ago

ValueError Traceback (most recent call last)
<ipython-input-57-30b1cd6abb8d> in <module>
----> 1 EVs_by_Brand=pd.DataFrame({'Make':EVs_by_Population})

dcgray Over a year ago

~\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    527
    528 elif isinstance(data, dict):
--> 529 mgr = init_dict(data, index, columns, dtype=dtype)
    530 elif isinstance(data, ma.MaskedArray):
    531 import numpy.ma.mrecords as mrecords

Redox Over a year ago

Try sns.countplot(data=EVs_by_Brand, x='Make', order = EVs_by_Brand['Make'].value_counts().index), assuming the DataFrame is the initial DataFrame without any grouping of the data, i.e. just your basic DataFrame. The data in my example was only there because you had not shared your data.

dcgray Over a year ago

How do I share my data?
