
I have a DataFrame with date and calltime columns.

I was trying to build a histogram based on the second column, e.g. df.groupby('calltime').head(10).plot(kind='hist', y='calltime'). In the resulting plot I want more detail for the first bar: its range, 0-2500, is huge, and all the data is hidden there. Is it possible to group by a smaller range, e.g. by 50 or something like that?

UPD

                  date  calltime
0     1491928756414930      4643
1     1491928756419607       166
2     1491928756419790       120
3     1491928756419927       142
4     1491928756420083       121
5     1491928756420217       109
6     1491928756420409        52
7     1491928756420476       105
8     1491928756420605        35
9     1491928756420654       120
10    1491928756420787       105
11    1491928756420907        93
12    1491928756421013        37
13    1491928756421062       112
14    1491928756421187        41
15    1491928756421240       122
16    1491928756421375        28
17    1491928756421416       158
18    1491928756421587        65
19    1491928756421667       108
20    1491928756421790        55
21    1491928756421858       145
22    1491928756422018        37
23    1491928756422068        63
24    1491928756422145        57
25    1491928756422214        43
26    1491928756422270        73
27    1491928756422357        90
28    1491928756422460        72
29    1491928756422546        77
...                ...       ...
9845  1491928759997328       670
9846  1491928759998255       372
9848  1491928759999116       659
9849  1491928759999897       369
9850  1491928760000380       746
9851  1491928760001245       823
9852  1491928760002189       634
9853  1491928760002869       335
9856  1491928760003929      4162
9865  1491928760009368       531

  • You can use df.hist() with the bins parameter. Commented Apr 11, 2017 at 20:00
  • Aha, it's already better. But can I also somehow add values to the X axis, so it's visible which range each bar covers? Commented Apr 11, 2017 at 20:06
  • It's difficult to visualize without the data. Can you post the output of df.groupby('calltime').head(10)? Commented Apr 11, 2017 at 20:08
  • Added more info about the range. Commented Apr 11, 2017 at 20:10
  • OK, so you can sort the data with ascending=False and take the top rows to reduce the range. Commented Apr 11, 2017 at 20:17

1 Answer

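Use the bins parameter:

import numpy as np
import pandas as pd

# Skewed sample data: mostly small values with a long tail
s = pd.Series(np.abs(np.random.randn(100)) ** 3 * 2000)
s.hist(bins=20)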
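
To get bars of a fixed width such as 50, as the question asks, one option is to pass explicit bin edges instead of a bin count. The following is only a sketch under assumptions: the sample values are the first ten calltime rows from the question, the upper cap of 500 is an arbitrary choice, and the helper names fixed_width_edges and plot_hist are hypothetical.

```python
import numpy as np
import pandas as pd

# First ten calltime values from the question, used as sample data.
calltime = pd.Series(
    [4643, 166, 120, 142, 121, 109, 52, 105, 35, 120], name="calltime"
)


def fixed_width_edges(width, upper):
    """Return bin edges 0, width, 2*width, ..., upper (hypothetical helper)."""
    return np.arange(0, upper + width, width)


def plot_hist(series, edges):
    """Draw the histogram with one x tick per bin edge (needs matplotlib)."""
    ax = series.hist(bins=edges)
    ax.set_xticks(edges)  # makes the range of each bar readable
    ax.set_xlabel(series.name)
    return ax


# Assumed cap of 500: values above it (here 4643) fall outside every bin.
edges = fixed_width_edges(width=50, upper=500)

# The same counts the plot would show, computed without plotting.
counts, _ = np.histogram(calltime, bins=edges)
```

Calling plot_hist(calltime, edges) draws the chart. Putting the ticks on the bin edges is one way to address the follow-up comment about showing which range each bar covers.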


Or you can use pd.cut to produce your own custom bins.

pd.cut(
    s, [-np.inf] + [100 * i for i in range(10)] + [np.inf]
).value_counts(sort=False).plot.bar()
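
The same idea can be adapted to bins of width 50 with readable labels on the X axis. This is a sketch, not a definitive recipe: the sample values are the first ten calltime rows from the question, and the edges and label strings are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# First ten calltime values from the question, used as sample data.
calltime = pd.Series(
    [4643, 166, 120, 142, 121, 109, 52, 105, 35, 120], name="calltime"
)

# Assumed edges: four bins of width 50 plus one open-ended overflow bin.
edges = [0, 50, 100, 150, 200, np.inf]
labels = ["0-49", "50-99", "100-149", "150-199", "200+"]

# right=False makes each bin closed on the left: [0, 50), [50, 100), ...
binned = pd.cut(calltime, bins=edges, labels=labels, right=False)

# sort=False keeps the bins in their natural order instead of by frequency.
counts = binned.value_counts(sort=False)
```

counts.plot.bar() then draws one bar per label, so the range behind each bar is visible directly on the axis, and the long tail collapses into the single "200+" bar instead of stretching the scale.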

