9

I am trying to create a histogram data using following query:

SELECT FLOOR(Max_Irrad/10) AS bucket, COUNT(*) AS COUNT
FROM marctest.test_summarynimish
where Lcu_name='Allegro'
  and Lcu_Mode='Standard'
GROUP  BY bucket;

following is the result that i am getting:

bucket count
 0        3
 4        3
 5       12
 7        6
 8        3
10        3

now the bucket field is the range or bin used in the histogram. I want to create a bucket values with consistent range, for eg starting from 0,4,8,12.... and so on.. Is there any way to achieve this in mysql? This is how I am expecting to have as result:

 bucket count
 0        3
 4       21
 8        6
6
  • I have provided the result of the query above.. do you need anything extra apart from that? Commented Feb 11, 2012 at 8:17
  • It would help to know what the expected result is (the same way as you displayed the result of your query, AND using that very same sample data, in order to understand the logic behind) Commented Feb 11, 2012 at 8:24
  • i m trying to post the expected result in tabulated form but I am unable to do it. I apologize as I am new to the forum Commented Feb 11, 2012 at 8:29
  • Shouldn't it be (0,3);(4,21);(8,6)? From 0,1,2,3 - 4,5,6,7 - 8,9,10,11? Commented Feb 11, 2012 at 8:39
  • yes that is how I want. Systematic distribution of the buckets or bin Commented Feb 11, 2012 at 9:27

2 Answers 2

8

I think we can use the following general form to create a general histogram:

select (x div 4) * 4 as NewX, count(*) as NewY from histogram
group by NewX

Where x is the real x value of the x axis and count(*) is the real y value. The number 4 is the size amount of the x values we want to group. This means we will group all x values in groups of 4 (e.g.: group 1 is 0, 1, 2, 3; group 2 is 4, 5, 6, 7, and so on). The count of each item in the group will become the NewY value

You can play with this here

Applying this logic to your query this would be:

select (floor(Max_Irrad/10) div 4) * 4 as NewX, count(*) as NewY
from marctest.test_summarynimish
where Lcu_name='Allegro' and Lcu_Mode='Standard'
group by NewX

Let me know if you have any trouble or doubt about this.

Sign up to request clarification or add additional context in comments.

4 Comments

The solution you provide is working thanks a lot.... :) that would be really very helpful
Above method is correct except it will not report buckets having zero samples.
Also, if you're bin size is a non-integer it won't work. You'll need instead floor(value/binSize)*binSize for non-integer bin sizes.
@LionHeart That makes sense. If there are no rows that contain data then it is not possible to display those missing rows, right? :) If you want to know how to generate data in MySQL I've written a long explanation with a couple of alternatives in this other question. Anyway, the missing data can be easily and more efficiently zero-filled when formatting the results in the histogram.
0

Just make your buckets bigger by dividing Max_Irrad by 40 instead of 10.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.