2

Hello, I have a DataFrame like the one below:

df= pd.DataFrame({"target1" : [10, 0, 15, 10], "target2" : [50, 0, 20, 0], "ID" : ["1", "2", "3", "4"]})

I need to create a plot (bar or pie) that shows what percentage of IDs have target1, what percentage have target2, and what percentage have neither target1 nor target2 ("other").

I need results like the ones below:
T1 75%, because 3 of 4 IDs have target1
T2 50%, because 2 of 4 IDs have target2
other 25%, because 1 of 4 IDs has neither target1 nor target2

I would also like percentage labels on the columns and, if possible, a legend.

1
  • i like your drawing! Commented Jan 7, 2021 at 8:10

2 Answers

1
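  • Use set_index to make 'ID' the index, since we do not want to include that column in the calculation.
  • Check which values are not equal to 0 with df.ne(0).
  • Take the mean and multiply by 100 to convert it to a percentage.
  • Create the bar plot with rot=1 so the x tick labels stay nearly horizontal; pandas bar plots draw them vertically by default (rot=0 would make them exactly horizontal).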

To annotate the bars:

  • Save the plot object and loop through its patches.
  • Call get_height (the percentage value) and format it as a label by appending a '%' sign.
  • Use get_x and get_height for the label position, each scaled by a factor marginally greater than 1 so that the labels do not overlap the bars.
>>> ax = df.set_index('ID').ne(0).mean().mul(100).plot(kind='bar', rot=1)
>>> for p in ax.patches:
...     ax.annotate(str(p.get_height()) + ' %', (p.get_x() * 1.005, p.get_height() * 1.005))
...
>>> ax.figure
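
The one-liner above covers target1 and target2 but not the "other" category from the question. A minimal sketch of one way to add it, assuming "other" means every target column is 0 for that ID; the helper name category_percentages and its targets parameter are illustrative, not part of the original answer:

```python
import pandas as pd


def category_percentages(df, targets=("target1", "target2")):
    """Return the percentage of rows with each non-zero target, plus 'other'."""
    # True where a target value is non-zero (hypothetical helper, not a pandas API)
    has_target = df[list(targets)].ne(0)
    # The mean of a boolean column is the share of True values
    percentages = has_target.mean().mul(100)
    # 'other' = rows where every target column is zero
    percentages["other"] = (~has_target.any(axis=1)).mean() * 100
    return percentages


df = pd.DataFrame({"target1": [10, 0, 15, 10],
                   "target2": [50, 0, 20, 0],
                   "ID": ["1", "2", "3", "4"]})
percentages = category_percentages(df)
print(percentages)
```

The resulting Series can be plotted the same way as above, for example with percentages.plot(kind='bar', rot=0). Because an ID can have both targets, the three values need not sum to 100, which is why a bar plot fits this data better than a pie chart.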


0

My solution

import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"target1": [10, 0, 15, 10], "target2": [50, 0, 20, 0], "ID": ["1", "2", "3", "4"]})
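total = len(df["ID"])  # renamed from "all" to avoid shadowing the built-in
# Counts of IDs with a non-zero target1, a non-zero target2, and neither
value = [0, 0, 0]
for i in range(total):
    has_target = False
    if df["target1"][i] != 0:
        value[0] += 1
        has_target = True
    if df["target2"][i] != 0:
        value[1] += 1
        has_target = True
    if not has_target:
        value[2] += 1

# Or compute the two target percentages directly (this skips the "default"
# category, and the conversion below must then be dropped):
# value = [float(len([col for col in df["target1"] if col > 0])) / total * 100,
#          float(len([col for col in df["target2"] if col > 0])) / total * 100]
# Convert the counts to percentages
value = [(float(count) / total * 100) for count in value]
plt.bar(range(len(value)), value,
        tick_label=["target1 " + str(value[0]) + "%",
                    "target2 " + str(value[1]) + "%",
                    "default " + str(value[2]) + "%"])
plt.show()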
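
The question also asks for percentage labels on the bars and a legend, which the code above puts only in the tick labels. A possible sketch, assuming the percentages are already computed (75, 50 and 25 are the values expected in the question); the function name plot_percentages, the colors and the axis label are illustrative choices:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def plot_percentages(labels, values):
    """Draw one bar per category with a '%' label above it and a legend entry."""
    fig, ax = plt.subplots()
    colors = ["tab:blue", "tab:orange", "tab:gray"]  # illustrative colors
    for position, (label, value) in enumerate(zip(labels, values)):
        # Giving each bar its own label makes it appear in the legend
        ax.bar(position, value, color=colors[position % len(colors)], label=label)
        # Center the percentage text just above the top of the bar
        ax.annotate(f"{value:.0f}%", (position, value), ha="center", va="bottom")
    ax.set_xticks(range(len(labels)))
    ax.set_xticklabels(labels)
    ax.set_ylabel("% of IDs")
    ax.legend()
    return ax


ax = plot_percentages(["target1", "target2", "other"], [75.0, 50.0, 25.0])
```

For interactive use, remove the matplotlib.use("Agg") line and call plt.show() afterwards, or save the figure with ax.figure.savefig.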
