0

I have a table like this:

col1   col2

2001   3001
2002   3001
2003   3001
2004   3002
2002   3003
2001   3003

i want to create it a query that finds out all the combination of items in col1 with a certain item in col1 if they occur to the same item in col2 and the frequency of occurrence. For example, if we want to explore item number "2001" in col1 the query should return:

col3   col4   col5
2001   2002   0.667
2001   2003   0.333

Can someone give me a hint? Thanks ahead.

3
  • 1
    This makes no sense to me, can you rephrase the question? Commented Jun 18, 2009 at 4:47
  • Is your col2 data in the first block correct? Commented Jun 18, 2009 at 4:49
  • the example data and desired results make no sense -- 2001 occurs twice in col1 in the example, the corresponding col2 values are 3001 and 3003, once each -- no relationship at all with the desired output. Commented Jun 18, 2009 at 5:15

1 Answer 1

1

Thinking of it this way, you are trying to relate the table to itself through col2, so we'll start off with a join onto itself:

select left.col1, right.col1/*, ...*/ from table left inner join table right on left.col2 = right.col2 where left.col1 = '2001' /* ... */

Now for your aggregate. You want to aggregate all right.col1 to get the count for that column:

select left.col1, right.col1, COUNT(*) from table left inner join table right on left.col2 = right.col2 where left.col1 = '2001' group by right.col2

Which I believe should come out with the raw counts. I believe you'll have to spin over the query and get a total to get the frequency.

As a note, this would be a lot more complicated if col1 weren't constant.

Edited to add: If you are looking for one query, instead of iterating over it in whatever language you are querying from, you would need to do this grouping twice:

SELECT abs_freq.col1, abs_freq.col2, abs_freq.freq / totals.total
FROM (SELECT
       left.col1 AS col1,
       COUNT(*)  AS total
      FROM TABLE LEFT
      INNER JOIN TABLE RIGHT
        ON left.col2 = right.col2
      WHERE left.col1 = '2001'
      GROUP BY left.col1
      ) totals
INNER JOIN (SELECT
              left.col1  AS col1,
              right.col1 AS col2,
              COUNT(*)   AS freq
            FROM TABLE LEFT
            INNER JOIN TABLE RIGHT
              ON left.col2 = right.col2
            WHERE left.col1 = '2001'
            GROUP BY right.col2
            ) abs_freq
   ON abs_freq.col1 = totals.col1
Sign up to request clarification or add additional context in comments.

3 Comments

Do you mind explain to me in more detail about how to get the frequency?
The result returns three columns. The third column in the query above is the number of occurrences in absolute terms, so you would have 2002 -> 2, and 2003 -> 1. To get the frequency, in whatever language you are using, get the total (2+1), then divide the column by the total (2002 -> 2/(2+1) = 0.667, 2003 -> 1/(2+1) = 0.333)
I use the following query: select C.tagid1, C.tagid2, C.partial 1.0/sum(C.partial) from (select A.tagid tagid1, B.tagid tagid2, count() partial from main A inner join main B on A.itemid = B.itemid where A.tagid != B.tagid AND A.tagid=2001 group by B.tagid) as C; however, it only return me one row due to the sum function. Do you have any idea how to fix this?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.