8

I am new to Hadoop and am trying to run some join queries on Hive. I created two tables (table1 and table2). When I execute a join query, I get the following error message:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

However, when I run this query in the Hive UI, it executes and I get correct results. Can someone help explain what may be wrong?

5
  • 1
    Hive doesn't have a definitive "UI". Where are you querying from? Commented Sep 27, 2017 at 4:30
  • I am running it through Hive Editor at quickstart.cloudera:8888 Commented Sep 27, 2017 at 5:20
  • 1
    That is called Hue... So, where are you running the query when you get the error? hive command is deprecated Commented Sep 27, 2017 at 6:09
  • Yes, it's Hue. I am running the query in the Terminal. The normal SQL commands are running fine, except the Join query, after which I get this error: 'hive> select t1.Id,t1.Name,t2.Id,t2.Name from table1 t1 join table2 t2 on t1.id=t2.id; Query ID = root_20170926212222_d79b2469-efc1-49db-a2d5-e68a5e1dca87 Total jobs = 1 FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask' However, in the Hue editor, the query runs just fine. Commented Sep 27, 2017 at 6:26
  • Hue runs queries through HiveServer2. You are bypassing that by using the Hive CLI. blog.cloudera.com/blog/2014/02/… Commented Sep 27, 2017 at 6:46
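(To go through HiveServer2 from the terminal as well, instead of the deprecated hive CLI, a minimal sketch with Beeline; the port 10000 and the cloudera user are assumptions based on QuickStart VM defaults:)

beeline -u jdbc:hive2://quickstart.cloudera:10000 -n cloudera

and then, inside the Beeline session, the same join:

select t1.Id, t1.Name, t2.Id, t2.Name from table1 t1 join table2 t2 on t1.id = t2.id;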

7 Answers

13

I just added the following before running my query and it worked.

SET hive.auto.convert.join=false;
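With hive.auto.convert.join enabled, Hive tries to turn the join into a map-side join: a local task (the MapredLocalTask in the error) builds an in-memory hash table from the smaller table, and on memory-constrained setups such as the QuickStart VM that local task can fail with return code 1. Setting the property to false forces an ordinary reduce-side join instead. If you would rather keep map joins enabled, a possible alternative is to tune the related thresholds so the local task stays small; the values below are just the usual defaults, shown as a sketch to adjust for your cluster:

SET hive.auto.convert.join=true;
-- only tables below this size (bytes) are considered "small" enough for a map join
SET hive.mapjoin.smalltable.filesize=25000000;
SET hive.auto.convert.join.noconditionaltask.size=10000000;
-- fraction of the local task's heap that the in-memory hash table may use
SET hive.mapjoin.localtask.max.memory.usage=0.90;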

1 Comment

This was brilliant. It worked fine for me :) Thanks.
9

Just put this command before your query:

SET hive.auto.convert.join=false;

It definitely works!

2 Comments

Explanation please... Not going to tweak the config based on only three lines of "explanation".
Some background on the hive.auto.convert.join setting itself at docs.qubole.com/en/latest/user-guide/engines/hive/…, cwiki.apache.org/confluence/display/Hive/…, etc.
2

I was also facing this issue on the Cloudera QuickStart VM 5.12; it was resolved by executing the statement below at the Hive prompt:

SET hive.auto.convert.join=false;

I hope the information below will be useful:

Step-1: Import all tables from the retail_db MySQL database

sqoop import-all-tables \
--connect jdbc:mysql://quickstart.cloudera:3306/retail_db \
--username retail_dba \
--password cloudera \
--num-mappers 1 \
--warehouse-dir /user/cloudera/sqoop/import-all-tables-text \
--as-textfile
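Before creating the Hive tables, it can help to confirm the files actually landed under the warehouse directory used above; a quick check from the shell (paths as in the import command):

hadoop fs -ls /user/cloudera/sqoop/import-all-tables-text
hadoop fs -ls /user/cloudera/sqoop/import-all-tables-text/orders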

Step-2: Create a database called retail_db and the required tables in Hive

create database retail_db;
use retail_db;

create external table categories(
  category_id int,
  category_department_id int,
  category_name string)
row format delimited 
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/categories';

create external table customers(
  customer_id int,
  customer_fname string,
  customer_lname string,
  customer_email string,
  customer_password string,
  customer_street string,
  customer_city string,
  customer_state string,
  customer_zipcode string)
row format delimited 
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/customers';

create external table departments(
  department_id int,
  department_name string)
row format delimited
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/departments';

create external table order_items(
  order_item_id int,
  order_item_order_id int,
  order_item_product_id int,
  order_item_quantity int,
  order_item_subtotal float,
  order_item_product_price float)
row format delimited
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/order_items';

create external table orders(
  order_id int,
  order_date string,
  order_customer_id int,
  order_status string)
row format delimited
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/orders';

create external table products(
  product_id int,
  product_category_id int,
  product_name string,
  product_description string,
  product_price float,
  product_image string)
row format delimited
  fields terminated by ','
stored as textfile
location '/user/cloudera/sqoop/import-all-tables-text/products';
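A quick sanity check that the external tables can actually read the imported files before running the join (row counts will depend on your copy of retail_db):

show tables;
select count(*) from orders;
select count(*) from order_items;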

Step-3: Execute the JOIN query

SET hive.cli.print.current.db=true;

select o.order_date, sum(oi.order_item_subtotal)
from orders o join order_items oi on (o.order_id = oi.order_item_order_id)
group by o.order_date 
limit 10;

The above query was giving the issue below:

Query ID = cloudera_20171029182323_6eedd682-256b-466c-b2e5-58ea100715fb
Total jobs = 1
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask

Step-4: The issue above was resolved by executing the statement below at the Hive prompt:

SET hive.auto.convert.join=false;

Step-5: Query Result

select o.order_date, sum(oi.order_item_subtotal)
from orders o join order_items oi on (o.order_id = oi.order_item_order_id)
group by o.order_date 
limit 10;

Query ID = cloudera_20171029182525_cfc70553-89d2-4c61-8a14-4bbeecadb3cf
Total jobs = 2
Launching Job 1 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1509278183296_0005, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1509278183296_0005/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1509278183296_0005
Hadoop job information for Stage-1: number of mappers: 2; number of reducers: 1
2017-10-29 18:25:19,861 Stage-1 map = 0%,  reduce = 0%
2017-10-29 18:25:26,181 Stage-1 map = 50%,  reduce = 0%, Cumulative CPU 2.72 sec
2017-10-29 18:25:27,240 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.42 sec
2017-10-29 18:25:32,479 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 8.01 sec
MapReduce Total cumulative CPU time: 8 seconds 10 msec
Ended Job = job_1509278183296_0005
Launching Job 2 out of 2
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1509278183296_0006, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1509278183296_0006/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1509278183296_0006
Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
2017-10-29 18:25:38,676 Stage-2 map = 0%,  reduce = 0%
2017-10-29 18:25:43,925 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 0.85 sec
2017-10-29 18:25:49,142 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 2.13 sec
MapReduce Total cumulative CPU time: 2 seconds 130 msec
Ended Job = job_1509278183296_0006
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 2  Reduce: 1   Cumulative CPU: 8.01 sec   HDFS Read: 8422614 HDFS Write: 17364 SUCCESS
Stage-Stage-2: Map: 1  Reduce: 1   Cumulative CPU: 2.13 sec   HDFS Read: 22571 HDFS Write: 407 SUCCESS
Total MapReduce CPU Time Spent: 10 seconds 140 msec
OK
2013-07-25 00:00:00.0   68153.83132743835
2013-07-26 00:00:00.0   136520.17266082764
2013-07-27 00:00:00.0   101074.34193611145
2013-07-28 00:00:00.0   87123.08192253113
2013-07-29 00:00:00.0   137287.09244918823
2013-07-30 00:00:00.0   102745.62186431885
2013-07-31 00:00:00.0   131878.06256484985
2013-08-01 00:00:00.0   129001.62241744995
2013-08-02 00:00:00.0   109347.00200462341
2013-08-03 00:00:00.0   95266.89186286926
Time taken: 35.721 seconds, Fetched: 10 row(s)


1

Try setting the AuthMech parameter on the connection.

I set it to 2 and defined the username; that solved my problem with CTAS.

Regards, Okan
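For reference, assuming the Cloudera/Simba Hive JDBC driver (the answer does not name the driver), AuthMech=2 selects user-name authentication and UID supplies the user name, so a connection URL would look roughly like this, with host and user purely illustrative:

jdbc:hive2://quickstart.cloudera:10000;AuthMech=2;UID=cloudera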


0

In my case, adding the right parameter configuration for execution solved the problem. It was caused by a write-access conflict; use the configuration to make sure you have write access.


0

In my case the issue was that the queue was not set, so I did the following:

set mapred.job.queue.name=Queue-Name

This solved my problem. Hope this helps someone.
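A side note: mapred.job.queue.name is the old MapReduce 1 property name; on YARN (MR2) clusters the equivalent setting is usually mapreduce.job.queuename, e.g.:

set mapreduce.job.queuename=Queue-Name;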


0

While using the Hue interface I faced the same issue; the fix was to create /user/admin in HDFS and change its ownership using the following commands:

[root@ip-10-0-0-163 ~]# su - hdfs

[hdfs@ip-10-0-0-163 ~]$ hadoop fs -mkdir /user/admin

[hdfs@ip-10-0-0-163 ~]$ hadoop fs -chown admin /user/admin

[hdfs@ip-10-0-0-163 ~]$ exit

