    create table MY_DATA0(session_id STRING, userid BIGINT, date_time STRING, ip STRING, URL STRING, country STRING, state STRING, city STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

    LOAD DATA INPATH '/inputhive' OVERWRITE INTO TABLE MY_DATA0;

    create table part0(session_id STRING, userid BIGINT, date_time STRING, ip STRING, URL STRING)
    partitioned by (country STRING, state STRING, city STRING)
    clustered by (userid) into 256 buckets
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    STORED AS TEXTFILE;

    insert overwrite table part0 partition(country, state, city)
    select session_id, userid, date_time, ip, url, country, state, city
    from my_data0;

Overview of my dataset:

{60A191CB-B3CA-496E-B33B-0ACA551DD503},1331582487,2012-03-12 13:01:27,66.91.193.75,http://www.acme.com/SH55126545/VD55179433,United States,Hauula,Hawaii

{365CC356-7822-8A42-51D2-B6396F8FC5BF},1331584835,2012-03-12 13:40:35,173.172.214.24,http://www.acme.com/SH55126545/VD55179433,United States,El Paso,Texas

When I run the last insert statement I get the following error:

java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveFatalException: [Error 20004]: Fatal error occurred when node tried to create too many dynamic partitions. The maximum number of dynamic partitions is controlled by hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. Maximum was set to: 100

PS:

I have set these two properties:

hive.exec.dynamic.partition.mode = nonstrict

hive.enforce.bucketing = true
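
For reference, a minimal sketch of how those settings would typically be applied at the start of the Hive session (the hive.exec.dynamic.partition=true line is an assumed prerequisite for dynamic partitioning, not something stated in the question):

    -- Sketch of the session setup; the first property is an assumed
    -- prerequisite for dynamic partitioning, not mentioned in the question.
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.enforce.bucketing=true;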

  • Count the distinct values of the partition columns, and set hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode higher than that distinct count. In your case city probably has more than 100 distinct values, so both parameters need to be raised above the distinct count of the partition columns (see the sketch after this comment). Commented Apr 19, 2017 at 12:51
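
A quick sketch of that check against the table from the question (my_data0 is the staging table defined above; the query wording is mine, not the commenter's):

    -- Count the distinct (country, state, city) combinations; the dynamic
    -- partition limits must be set above this number.
    SELECT COUNT(*) AS distinct_partitions
    FROM (SELECT DISTINCT country, state, city FROM my_data0) t;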

3 Answers


Try setting those properties to higher values.

    SET hive.exec.max.dynamic.partitions=100000;
    SET hive.exec.max.dynamic.partitions.pernode=100000;
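
Put together with the question's INSERT, the whole session might look roughly like this (a sketch; 100000 is just an arbitrary limit comfortably above the number of country/state/city combinations):

    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.exec.max.dynamic.partitions=100000;
    SET hive.exec.max.dynamic.partitions.pernode=100000;

    -- Re-run the failing statement once the limits are raised.
    insert overwrite table part0 partition(country, state, city)
    select session_id, userid, date_time, ip, url, country, state, city
    from my_data0;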

1 Comment

Hi Sir, Please check the post and reply

Partition columns should be mentioned last in the select statement. For example, if state is the partition column, then "insert into table t1 partition(state) select Id, name, dept, sal, state from t2;" will work. But if the query is written as "insert into table t1 partition(state) select Id, name, dept, state, sal from t2;" then the partitions will be created from the salary (sal) column instead.
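
Applied to the tables from the question, the contrast would look roughly like this (a sketch reusing the question's column names):

    -- Correct: partition columns last, in the same order as the PARTITION clause.
    insert overwrite table part0 partition(country, state, city)
    select session_id, userid, date_time, ip, url, country, state, city
    from my_data0;

    -- Wrong (do not run): Hive assigns dynamic partition values by position,
    -- so here the last three SELECT columns (date_time, ip, url) would become
    -- the country/state/city values and explode the partition count.
    -- insert overwrite table part0 partition(country, state, city)
    -- select session_id, userid, country, state, city, date_time, ip, url
    -- from my_data0;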

1 Comment

Thank you. I faced the same problem. Your suggestion worked for me.

It may be because your query is picking the wrong (or a high-cardinality) column for the partition, since the partition values are taken from whatever columns come last when you do select * from table2. To be explicit, use insert into table table1 partition(partition_column) select column_name1, column_name2, partition_column from table2; (keep the partitioned column last). In my case I was using select * from the table and getting 1200 partitions, but after manually placing the state column last I got only 38 partitions.
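
A simple way to verify how many partitions actually got created (a sketch using the part0 table from the question):

    -- Lists every partition created by the dynamic-partition insert; the count
    -- should match the number of distinct country/state/city combinations.
    SHOW PARTITIONS part0;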
