Amazon Redshift: Insert data into table from S3 using Java API

Question

I currently have a file in S3. I would like to issue commands using the Java AWS SDK to take this data and load it into a Redshift table. If the table does not exist, I would also like to create it. I have been unable to find any clear examples of how to do this, so I am wondering whether I am going about it the wrong way. Should I be using standard PostgreSQL Java connectors instead of the AWS SDK?

Connect (docs.aws.amazon.com/redshift/latest/mgmt/…) and submit your CREATE TABLE and COPY commands.
– Guy, Commented Jul 17, 2013 at 21:03
Did you manage to get this working? Do you have a blog post or anything else describing how this is done? Thanks.
– dinesh707, Commented Aug 29, 2014 at 12:48
The correct way is to use a JDBC driver and treat Redshift as a PostgreSQL database. Here is an example I posted for a Ruby programmer: stackoverflow.com/questions/24438238/…
– Dan Ciborowski - MSFT, Commented Sep 1, 2014 at 5:08
For my own learning, why did you decide to go S3 -> Redshift instead of S3 -> Kinesis -> Redshift?
– Kevin Meredith, Commented Jan 25, 2015 at 15:27
Kinesis is not a bridge between S3 and Redshift. Kinesis is an endpoint you stream data to; you process the data there and then place the processed data into S3 and/or Redshift.
– Dan Ciborowski - MSFT, Commented Jan 25, 2015 at 19:17

Guy · Accepted Answer · 2013-07-19 10:55:28Z

10

Connect over JDBC (http://docs.aws.amazon.com/redshift/latest/mgmt/connecting-in-code.html#connecting-in-code-java) and submit your CREATE TABLE and COPY commands as ordinary SQL statements.
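
To illustrate the "CREATE TABLE and COPY" part, here is a minimal sketch, not taken from the linked docs. It assumes you already have an open JDBC Connection to the cluster and a COPY statement string. The class name CreateThenCopy and its two methods are hypothetical helpers made up for this example. It relies on CREATE TABLE IF NOT EXISTS, so the table is only created when it is missing, and runs both statements in one transaction.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class for illustration only.
final class CreateThenCopy {

    // Builds the DDL. The table name and column definitions are inserted verbatim,
    // so only pass trusted values (this sketch does no escaping).
    static String createTableSql(String table, String columnDefs) {
        return "CREATE TABLE IF NOT EXISTS " + table + " (" + columnDefs + ")";
    }

    // Runs the DDL and the COPY in a single transaction and rolls back if either fails.
    static void createAndLoad(Connection conn, String createSql, String copySql)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(createSql);
            st.executeUpdate(copySql);
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            // Restore whatever mode the caller had before.
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

You would call it roughly as CreateThenCopy.createAndLoad(conn, CreateThenCopy.createTableSql("my_table", "id INTEGER, name VARCHAR(64)"), copySql), where the column list is a placeholder that has to match your file.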

answered Jul 19, 2013 at 10:55

Guy

1 Comment

Brian Risk Over a year ago

It seems like one of the nice uses of Stack Overflow is to rephrase the true utility/application of a documented feature and then provide a link to the documentation. It would have taken me a while to find these docs on the AWS site alone.

piet.t · Accepted Answer · 2018-08-22 06:42:42Z

2

Guy's answer covers most of what is needed.

I would like to post working Java JDBC code that does exactly this: a COPY from S3 into a Redshift table. I hope it helps others.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Properties;

public class RedShiftJDBC {
    public static void main(String[] args) {

        Properties props = new Properties();
        props.setProperty("user", "username***");
        props.setProperty("password", "password****");

        // If you use the PostgreSQL JDBC driver instead, the URL starts with jdbc:postgresql:
        // "jdbc:postgresql://********8-your-to-redshift.redshift.amazonaws.com:5439/example-database"
        String url = "jdbc:redshift://********url-to-redshift.redshift.amazonaws.com:5439/example-database";

        String command = "COPY my_table FROM 's3://path/to/csv/example.csv' CREDENTIALS 'aws_access_key_id=******;aws_secret_access_key=********' CSV DELIMITER ',' IGNOREHEADER 1";

        try {
            // The PostgreSQL driver works too: Class.forName("org.postgresql.Driver");
            // Make sure the matching Redshift JDBC driver jar is on the classpath.
            Class.forName("com.amazon.redshift.jdbc42.Driver");

            System.out.println("Connecting to database...");
            // try-with-resources closes the statement and connection even if the COPY fails.
            try (Connection conn = DriverManager.getConnection(url, props);
                 Statement statement = conn.createStatement()) {
                System.out.println("Connection made!");

                // JDBC connections start in auto-commit mode, and calling commit() in that
                // mode can throw an SQLException, so switch it off before committing explicitly.
                conn.setAutoCommit(false);

                System.out.println("Executing...");
                statement.executeUpdate(command);
                // Commit so the loaded rows become visible to other sessions.
                conn.commit();
                System.out.println("That's all: copy using simple JDBC.");
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
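
As a variation on the COPY string above, here is a small sketch that builds the statement with an IAM_ROLE clause instead of embedding access keys in the SQL. It assumes an IAM role with read access to the bucket is already attached to the cluster. The class name CopyCommand, its methods, and the role ARN in the usage note are hypothetical placeholders. The resulting string can be passed to statement.executeUpdate(...) just like the command in the answer.

```java
// Hypothetical helper class for illustration only.
final class CopyCommand {

    // Wraps a value in single quotes. Rather than trying to escape, this sketch
    // simply rejects values containing a quote or backslash.
    static String quoted(String value) {
        if (value.contains("'") || value.contains("\\")) {
            throw new IllegalArgumentException("unsupported character in: " + value);
        }
        return "'" + value + "'";
    }

    // Builds a COPY for a comma-separated file with one header row.
    // The table name is inserted verbatim, so only pass a trusted identifier.
    static String fromS3WithRole(String table, String s3Uri, String iamRoleArn) {
        return "COPY " + table
                + " FROM " + quoted(s3Uri)
                + " IAM_ROLE " + quoted(iamRoleArn)
                + " CSV IGNOREHEADER 1";
    }
}
```

For example, CopyCommand.fromS3WithRole("my_table", "s3://path/to/csv/example.csv", "arn:aws:iam::123456789012:role/ExampleRole") produces a COPY statement for the same file as above; the account ID and role name are made up.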

edited Aug 22, 2018 at 6:42

piet.t

answered Aug 18, 2018 at 11:19

Red Boy

3 Comments

RoyalTiger Over a year ago

How did you provision the example-database db in Redshift? And how did you create the user/password?

Red Boy Over a year ago

@RoyalTiger We had connections allowed from one EC2 instance to Redshift. From that instance we could connect to Redshift from a psql terminal using the root account. Once that is done, everything else is just a matter of finding the right SQL and executing it.

RoyalTiger Over a year ago

Sure, and I believe we need to create a db using some script before we try to connect to Redshift, don't we?
