I am using SQL Server 2012 and need to generate a histogram, conceptually similar to the one in Google's screener.

The idea is to split the price range into 100 buckets of equal width (by price), so that each bucket holds the number of items priced between that bucket's min and max. NTILE didn't work: it splits the items equally by count among the buckets.

So, this is what I have so far:

select bucket, count(*)
from (
  select cast((PERCENT_RANK() OVER(ORDER BY Price DESC)) * 100 as int) as bucket
  from MyTable
  where DataDate = '4/26/2012'
) t
group by bucket

Is this a good way to produce a histogram in SQL Server 2012? Is there anything built into SQL Server 2012 for this task, or a better way to do it?

Thank you

  • "..split all the prices into 100 equally sized (based on price) buckets.." To do this, you need to determine (or have some rule for determining) the range of prices that these 100 buckets will cover. Commented Apr 28, 2013 at 22:29
  • It's my understanding that PERCENT_RANK gives us a percentage placement of a value in a set. In this case the value is the price and the set is all records on 4/26/2012. Multiplying by 100 and casting to int effectively places each value in one of 100 buckets. Commented Apr 29, 2013 at 3:31
  • That just does the same thing as NTILE, only in reverse. If you want the 100 price sub-ranges to be of equal width, then you must start with a range (or a rule) that determines their total width and divide that by 100. That's simple math. We need you to decide what that range or rule is. Commented Apr 29, 2013 at 9:58
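
The "simple math" described in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: the range is taken to be the minimum and maximum price on the chosen date (the question never fixes a rule), the CTE names filtered, bounds and bucketed are made up for the example, and MyTable, Price and DataDate are the names from the question.

```sql
-- Sketch only, not a definitive implementation.
-- Assumption: the histogram covers [MIN(Price), MAX(Price)] for one date.
WITH filtered AS (
    SELECT Price
    FROM MyTable
    WHERE DataDate = '20120426'   -- unambiguous yyyymmdd form of 4/26/2012
), bounds AS (
    SELECT MIN(Price) AS lo, MAX(Price) AS hi
    FROM filtered
), bucketed AS (
    SELECT CASE
               -- the maximum price would otherwise land in a 101st bucket
               WHEN f.Price >= b.hi THEN 99
               -- equal-width bucket index 0..99; NULLIF guards against hi = lo
               ELSE CAST(FLOOR((f.Price - b.lo) * 100.0
                               / NULLIF(b.hi - b.lo, 0)) AS int)
           END AS bucket
    FROM filtered AS f
    CROSS JOIN bounds AS b
)
SELECT bucket, COUNT(*) AS items
FROM bucketed
GROUP BY bucket
ORDER BY bucket;
```

Unlike the PERCENT_RANK version, the bucket here depends on the price itself rather than on its rank. Buckets that contain no items simply do not appear in the result, so they would have to be filled in with a zero count separately if the chart needs them.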

1 Answer

Like this perhaps:

with cte as (
  -- bar heights 1..9: base = 1 + u + 3*t for u, t in {0, 1, 2}
  select base = 1 + u + t*3 from (
    select 0 as u union all select 1 union all select 2
  ) T1
  cross join (
    select 0 as t union all select 1 union all select 2
  ) T2
), data as (
  -- sample data: a single row holding one count per bucket (x0..x9)
  select *
  from (
   values (1,1,2,3,3,5,7,4,2,1)
  ) data(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
)
select cte.base
  ,case when x0>=base then 'X' else  ' ' end as x0
  ,case when x1>=base then 'X' else  ' ' end as x1
  ,case when x2>=base then 'X' else  ' ' end as x2
  ,case when x3>=base then 'X' else  ' ' end as x3
  ,case when x4>=base then 'X' else  ' ' end as x4
  ,case when x5>=base then 'X' else  ' ' end as x5
  ,case when x6>=base then 'X' else  ' ' end as x6
  ,case when x7>=base then 'X' else  ' ' end as x7
  ,case when x8>=base then 'X' else  ' ' end as x8
  ,case when x9>=base then 'X' else  ' ' end as x9
from cte
cross join data
order by base desc
;

which yields this histogram nicely:

base        x0   x1   x2   x3   x4   x5   x6   x7   x8   x9
----------- ---- ---- ---- ---- ---- ---- ---- ---- ---- ----
9                                                         
8                                                         
7                                         X               
6                                         X               
5                                    X    X               
4                                    X    X    X          
3                          X    X    X    X    X          
2                     X    X    X    X    X    X    X     
1           X    X    X    X    X    X    X    X    X    X

Remember to pivot your bucket counts into a single row first, with one column per bucket, as in the data CTE above.
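
One way to do that pivot is conditional aggregation, sketched below. The counts CTE is a hypothetical stand-in for whatever query produces one (bucket, items) row per bucket; it is filled here with the same sample values as the query above, so the names and the ten-bucket layout are assumptions made for illustration only.

```sql
-- Sketch only: "counts" is a hypothetical per-bucket result set.
WITH counts AS (
    SELECT bucket, items
    FROM (VALUES (0,1),(1,1),(2,2),(3,3),(4,3),
                 (5,5),(6,7),(7,4),(8,2),(9,1)) AS v(bucket, items)
)
-- One output column per bucket; a missing bucket contributes 0.
SELECT
    SUM(CASE WHEN bucket = 0 THEN items ELSE 0 END) AS x0,
    SUM(CASE WHEN bucket = 1 THEN items ELSE 0 END) AS x1,
    SUM(CASE WHEN bucket = 2 THEN items ELSE 0 END) AS x2,
    SUM(CASE WHEN bucket = 3 THEN items ELSE 0 END) AS x3,
    SUM(CASE WHEN bucket = 4 THEN items ELSE 0 END) AS x4,
    SUM(CASE WHEN bucket = 5 THEN items ELSE 0 END) AS x5,
    SUM(CASE WHEN bucket = 6 THEN items ELSE 0 END) AS x6,
    SUM(CASE WHEN bucket = 7 THEN items ELSE 0 END) AS x7,
    SUM(CASE WHEN bucket = 8 THEN items ELSE 0 END) AS x8,
    SUM(CASE WHEN bucket = 9 THEN items ELSE 0 END) AS x9
FROM counts;
```

The result is a single row with columns x0 through x9, which is the shape the histogram query expects. The PIVOT operator in T-SQL is another option for the same step.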

For a more compact presentation, concatenate the various data columns into a single long string.
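
As a rough sketch of that idea, the per-column CASE expressions can be joined with + so that each row becomes one string. The sample values are the ones from the query above; the bars column name is made up for the example, and the heights 1..9 are listed directly with a VALUES constructor here rather than built with cross joins.

```sql
-- Sketch only: same sample data as above, rendered as one string per row.
WITH cte AS (
    SELECT base
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(base)
), data AS (
    SELECT *
    FROM (VALUES (1,1,2,3,3,5,7,4,2,1))
         AS d(x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
)
SELECT cte.base,
       CASE WHEN x0 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x1 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x2 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x3 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x4 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x5 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x6 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x7 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x8 >= base THEN 'X' ELSE ' ' END
     + CASE WHEN x9 >= base THEN 'X' ELSE ' ' END AS bars
FROM cte
CROSS JOIN data
ORDER BY base DESC;
```

Each row then carries a ten-character string, one character per bucket, which keeps the output narrow when there are many buckets.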
