I have a table in Hive in which one of the columns is a string. The values in that column look like "x=1,y=2,z=3". I need to write a query that sums the value of x in this column across all rows. How do I extract the value of x and add the values up?

1 Answer

One way to do this is to write a UDF for the transformation:

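package com.example;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

// Extracts the integer value of the key "x" from a string such as "x=1,y=2,z=3".
public class SplitColumn extends UDF {

  public Integer evaluate(Text input) {
    if (input == null) return null;
    // Split into key=value pairs, then find the pair whose key is "x".
    for (String pair : input.toString().split(",")) {
      String[] kv = pair.split("=", 2);
      if (kv.length == 2 && kv[0].trim().equals("x")) {
        return Integer.parseInt(kv[1].trim());
      }
    }
    return null; // no "x" key in this row
  }
}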

After building the JAR, register the function and call it in your query:

hive> ADD JAR target/hive-extensions-1.0-SNAPSHOT-jar-with-dependencies.jar;
hive> CREATE TEMPORARY FUNCTION SplitColumn as 'com.example.SplitColumn';
hive> select sum(SplitColumn(mycolumnName)) from mytable;

P.S. I have not tested this, but it should point you in the right direction.
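
Since the UDF above is untested, it may help to check the extraction the question asks for locally, without Hive on the classpath. The following is a minimal sketch in plain Java. The class name `KeyValueSum` and its methods are hypothetical helpers for illustration only, and returning null for a missing key or a non-numeric value is an assumption about the desired behavior.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hive: parses "key=value" lists in plain Java.
class KeyValueSum {

    // Returns the integer stored under `key` in a string like "x=1,y=2,z=3",
    // or null if the input is null, the key is absent, or the value is not an integer.
    static Integer extractInt(String input, String key) {
        if (input == null) return null;
        for (String pair : input.split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2 && kv[0].trim().equals(key)) {
                try {
                    return Integer.parseInt(kv[1].trim());
                } catch (NumberFormatException e) {
                    return null; // assumption: treat malformed values as missing
                }
            }
        }
        return null;
    }

    // Sums the extracted values, skipping nulls the way Hive's sum() ignores NULL.
    static long sumKey(List<String> rows, String key) {
        long total = 0;
        for (String row : rows) {
            Integer v = extractInt(row, key);
            if (v != null) total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("x=1,y=2,z=3", "x=10,y=0,z=5", "y=7");
        System.out.println(sumKey(rows, "x")); // prints 11
    }
}
```

The third sample row has no x, so it contributes nothing to the total, which matches how the Hive query would behave if the UDF returned NULL for that row.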
