I have the following SQL code, which calculates the Pearson correlation between two users' ratings:

select @u1avg := avg(user1_rating),
       @u2avg := avg(user2_rating),
       @u1sd  := stddev(user1_rating),
       @u2sd  := stddev(user2_rating)
from (
    select r1.userId as user1_id, r1.rating as user1_rating,
           r2.userId as user2_id, r2.rating as user2_rating
    from mydb.ratings r1
    join mydb.ratings r2 on r1.itemId = r2.itemId
    where r1.userId = 1 and r2.userId = 2
) sample;


select (1/(count(r1.rating-1)))*sum(((r1.rating-@u1avg)/@u1sd)*((r2.rating-@u2avg)/@u2sd))*(count(r1.rating)/(1+count(r1.rating)))

from mydb.ratings r1 join mydb.ratings r2 on r1.itemId = r2.itemid 
where r1.userId=1 and r2.userId=2;

I would like to turn this into a function, for example corr(A, B). Any help would be appreciated.

The problem is that I get an error at the alias sample saying it is not allowed (or something similar). However, if I remove the alias, I get an error saying that every derived table must have its own alias.

1 Answer

I think you can do away with the derived table in the first query, which gets rid of that particular error:

SELECT
    @u1avg:=avg(r1.rating),
    @u2avg:=avg(r2.rating),
    @u1sd:=stddev(r1.rating),
    @u2sd:=stddev(r2.rating)
FROM mydb.ratings r1
INNER JOIN mydb.ratings r2
    ON r1.itemId = r2.itemId
WHERE r1.userId=1
AND r2.userId=2;

SELECT (1/(COUNT(r1.rating-1)))*SUM(((r1.rating-@u1avg)/@u1sd)*((r2.rating-@u2avg)/@u2sd))*(COUNT(r1.rating)/(1+COUNT(r1.rating)))
FROM mydb.ratings r1
INNER JOIN mydb.ratings r2
    ON r1.itemId = r2.itemId
WHERE r1.userId=1
AND r2.userId=2;
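
Since the question asks for a function, here is a minimal sketch of how the two statements above might be wrapped in a MySQL stored function. The name pearson_corr, the parameter names, the INT parameter type, and the local variable names are all assumptions for illustration; only mydb.ratings and its userId, itemId, and rating columns come from the question. It uses local variables with SELECT ... INTO instead of user variables, because a stored function cannot return a result set. It also computes the plain population form of the coefficient (sum of z-score products divided by the number of shared items) and leaves out the extra COUNT/(1+COUNT) factor from the query above, so multiply that back in if the weighting is intended.

```sql
DELIMITER //

-- Hypothetical function name and parameters; adjust types to match your schema.
CREATE FUNCTION pearson_corr(p_user1 INT, p_user2 INT)
RETURNS DOUBLE
READS SQL DATA
BEGIN
    DECLARE v_avg1, v_avg2, v_sd1, v_sd2, v_result DOUBLE;

    -- Mean and population standard deviation over the items both users rated.
    SELECT AVG(r1.rating), AVG(r2.rating),
           STDDEV_POP(r1.rating), STDDEV_POP(r2.rating)
    INTO v_avg1, v_avg2, v_sd1, v_sd2
    FROM mydb.ratings r1
    INNER JOIN mydb.ratings r2
        ON r1.itemId = r2.itemId
    WHERE r1.userId = p_user1
    AND r2.userId = p_user2;

    -- Average product of z-scores; NULLIF guards against division by zero,
    -- so the function returns NULL when there is no overlap or no variance.
    SELECT SUM(((r1.rating - v_avg1) / NULLIF(v_sd1, 0))
             * ((r2.rating - v_avg2) / NULLIF(v_sd2, 0)))
           / NULLIF(COUNT(*), 0)
    INTO v_result
    FROM mydb.ratings r1
    INNER JOIN mydb.ratings r2
        ON r1.itemId = r2.itemId
    WHERE r1.userId = p_user1
    AND r2.userId = p_user2;

    RETURN v_result;
END //

DELIMITER ;
```

With that in place, the correlation for the two users in the question could be fetched with SELECT pearson_corr(1, 2);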