0

I'm trying to find the number of employees who are paid less than average wage.

I'm pretty new to Hive and struggling a bit. Could someone explain what's wrong with my statement and help me out, please?

My statement -

SELECT COUNT(*) FROM(SELECT wage, AVG(wage) AS avgWage FROM emp_wages) WHERE wage < avgWage;

The error -

ParseException line 1:82 cannot recognize input near 'where' 'wage' '<' in subquery source

Any help appreciated!

4 Answers

2

It's a syntax error: in Hive, a derived table (a subquery in the FROM clause) must be given an alias.

SELECT COUNT(*) 
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage)  t --alias needed here
WHERE wage < avgWage;

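That version parses, but it does not answer the question: grouping by wage makes avgWage the average within each wage group, which is just that wage, rather than the overall average. To compare each row against the overall average, compute it with a window function instead: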

select count(*)
from (SELECT wage, AVG(wage) over() AS avgWage 
      FROM emp_wages
     ) t
where wage < avgWage
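
To sanity-check the window-function version, here is a minimal sketch that runs it against a few made-up rows. The CTE and its values are assumptions for illustration only (they stand in for the real emp_wages table), and it assumes a Hive version that supports CTEs and SELECT without FROM.

```sql
-- Hypothetical sample data standing in for emp_wages:
-- wages 10, 20, 30 and 100, so the overall average is 40.
WITH emp_wages AS (
  SELECT 10 AS wage UNION ALL
  SELECT 20 UNION ALL
  SELECT 30 UNION ALL
  SELECT 100
)
SELECT COUNT(*) AS below_average
FROM (
  -- AVG(...) OVER () attaches the overall average to every row
  -- without collapsing the rows the way a plain aggregate would.
  SELECT wage, AVG(wage) OVER () AS avgWage
  FROM emp_wages
) t
WHERE wage < avgWage;
-- 10, 20 and 30 are below 40, so this should return 3.
```

With the GROUP BY wage variant, the same data would compare each wage against its own group average, so the count would not reflect the overall average at all.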

2 Comments

Your 2nd HQL is right. But doesn't your 1st HQL need a GROUP BY for the aggregation?
You're right. Edited it to be syntactically correct.
0
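SELECT COUNT(*) 
FROM (SELECT wage, AVG(wage) AS avgWage FROM emp_wages group by wage) avg --group by needed
WHERE wage < avgWage;
-- Note: with GROUP BY wage, avgWage is the average within each wage group,
-- not the overall average, so this parses but does not count below-average earners.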

0

The problem is that AVG is an aggregate function, so it collapses the rows it is applied to. To compare every row against the single overall average, cross join the table with a one-row subquery that computes that average:

select 
  count(*), avg(v.wage),
  sum(case when v.wage < v2.avgwage then 1 else 0 end) below_average
from 
emp_wages v cross join (select avg(wage) as avgwage from emp_wages) as v2

0

The correct query would be:

select count(*) from emp_wages where wage < (select avg(wage) from emp_wages);

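You are getting the parse error because wage and avgWage come from a subquery in the FROM clause, and Hive requires such a subquery to have an alias. Depending on your Hive version, a scalar subquery in the WHERE clause may not be supported, in which case use a join or window function instead.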
