  • I have a .csv file (t1) with columns c1, c2, c3 in Amazon S3 storage.
  • I want to copy it into Amazon Redshift.
  • I create the target table (t1a) with columns c1, c2, c3, where all columns are nullable.
  • I copy with the command:

    copy t1a (c1,c3) from t1

  • I expected it to copy c1 and c3 over from t1 and place the default NULL value in c2, so a row in t1a might look like (c1_rowX, null, c3_rowX).

  • Instead I get a type error, because it's copying c2 (string type) data from t1 into c3 (int type) of t1a.

  • The COPY command works fine when I don't specify the columns:

    copy t1a from t1

  • I've included a link to the Redshift COPY command documentation:

http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

  • Main question: is there an issue with how I'm specifying the columns? Thanks.
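  • For reference, a fully spelled-out form of the command I'm attempting would look roughly like this (the S3 path and IAM role are placeholders, not my real values):

    -- Placeholder bucket path and IAM role; CSV format assumed.
    COPY t1a (c1, c3)
    FROM 's3://my-bucket/t1.csv'
    IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
    CSV;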

2 Answers


If you want to skip the preprocessing step, you can define the column to be skipped as CHAR(1) and then use the TRUNCATECOLUMNS parameter with the COPY command:

CREATE TABLE t1a (
  c1 VARCHAR(256),  -- illustrative type; use whatever the data in t1 needs
  c2 CHAR(1),       -- the column to skip: anything longer gets truncated
  c3 INT            -- illustrative type, per the question's int column
);
COPY t1a FROM 's3://my-bucket/t1.csv'  -- placeholder path and role
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
CSV TRUNCATECOLUMNS;

TRUNCATECOLUMNS truncates any value longer than the column's defined width during the import, so every value loaded into that column is cut down to one character.

That's just a hack; preprocessing the input file is recommended, but sometimes a hack is all that's needed.
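If, as in the question, the goal is to end up with NULL in c2, one follow-up step (assuming the table definition above) is to blank out the truncated junk after the load:

-- Restores the NULLs the question expected in the placeholder column.
UPDATE t1a SET c2 = NULL;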




Detailed column mappings are not supported directly in COPY, unless Amazon/ParAccel have added something extra to their PostgreSQL fork that isn't in mainline.

What you typically do in this case is have a script massage/pad the data: read the CSV, transform it as desired, then send the lines through the PostgreSQL connection to a COPY ... FROM STDIN command, as sketched below.
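In mainline PostgreSQL, that target command would look like the following (note that Redshift itself only loads from S3 and similar sources, so there this step would instead mean writing the transformed file back to S3):

-- Mainline PostgreSQL form: the client script streams the already-trimmed
-- two-field rows (c1, c3) into this; c2 is left to its NULL default.
COPY t1a (c1, c3) FROM STDIN WITH (FORMAT csv);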

Alternatively, you might COPY the whole original CSV into a staging table, then transform that data with an INSERT INTO ... SELECT into the real target table.
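A sketch of that staging approach (the table name, column types, S3 path, and IAM role are all illustrative assumptions):

-- Hypothetical staging table that mirrors the CSV as plain text.
CREATE TABLE t1_staging (
  c1 VARCHAR(256),
  c2 VARCHAR(256),
  c3 VARCHAR(256)
);

COPY t1_staging FROM 's3://my-bucket/t1.csv'  -- placeholder path and role
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
CSV;

-- Project only the wanted columns; c2 in t1a defaults to NULL.
INSERT INTO t1a (c1, c3)
SELECT c1, c3::INT  -- cast assumes t1a.c3 is INT, per the question
FROM t1_staging;

DROP TABLE t1_staging;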

3 Comments

Thanks, Craig. Hoping to avoid copying the whole file due to size and time issues. Might transform before copying.
Amazon provides a COPY syntax template: COPY table_name [ (column1 [, column2, ...]) ], where (column1 [, column2, ...]) "specifies an optional column list to load data fields into specific columns".
@Elm Sure, but that doesn't let you say COPY tablename(col1, SKIP, col3) does it ... and that's what it seems you want? Even if Pg supported that it'd still have to transfer all the data to discard the parts you didn't want, so it makes a lot more sense to just do this client-side with a pre-processing script.
