
I am using Postgres 10 and have a table like this:

             Table "musicbrainz.acoustid_meta"

    Column    |       Type        | Collation | Nullable | Default
--------------+-------------------+-----------+----------+---------
 id           | integer           |           | not null |
 track        | character varying |           |          |
 artist       | character varying |           |          |
 album        | character varying |           |          |
 album_artist | character varying |           |          |
 track_no     | character varying |           |          |
 disc_no      | character varying |           |          |
 year         | character varying |           |          |
Indexes:
    "acoustid_meta_index" btree (id)

and I used to have CSV files such as

id,track,artist,album,album_artist,track_no,disc_no,year
23033007,Satellite,Dave Matthews Band,Under the Table & Dreaming,Dave Matthews Band,3,\N,1994

that I imported with

psql jthinksearch -c "copy musicbrainz.acoustid_meta from '/home/ubuntu/code/acoustid-server/meta.full.$LATEST.csv' DELIMITER ',' CSV HEADER";

But now the files are JSONL files, with each line like this:

{"id":339058430,"track":"Track14","artist":"Unknown Artist","album":"Unknown Title","album_artist":"Unknown Artist","track_no":14,"disc_no":null,"year":null}
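(Each line is a complete JSON object on its own, which is the JSON Lines format, so every line can be parsed independently with a standard JSON parser. A quick Python check of the sample line above:)

```python
import json

# The sample line from above (JSON Lines: one JSON object per line).
line = ('{"id":339058430,"track":"Track14","artist":"Unknown Artist",'
        '"album":"Unknown Title","album_artist":"Unknown Artist",'
        '"track_no":14,"disc_no":null,"year":null}')

record = json.loads(line)
print(record["id"])        # 339058430
print(record["track_no"])  # 14 -- numeric in the JSON, though the column is varchar
print(record["disc_no"])   # None -- JSON null
```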

How do I import these files safely? I have tried using sed as a workaround to convert the file to CSV, but it is not quite right:

cat $LATEST-meta-update.jsonl | sed \
    -e 's/{"id"://' \
    -e 's/"track"://' \
    -e 's/"artist"://' \
    -e 's/"album"://' \
    -e 's/"album_artist"://' \
    -e 's/"track_no"://' \
    -e 's/"disc_no"://' \
    -e 's/"year"://' \
    -e 's/\\\\"//g' > meta.csv

Also, I have 5 different tables to import, so I would have to construct a sed script for each.

Update: I just realized the purpose of the last column, which I had ignored for simplicity.

If it is a new record to be added to the table, it will have

"created":"2020-02-01T00:00:13.225963+00:00"

but if the record needs to replace an existing record, it will have

"updated":"2020-02-03T13:20:12.988533+00:00"

When I do the insert using the cross join with jsonb_populate_record, how do I use a where clause to restrict it to only the rows with the created field?

  • What is "jsonl"? One separate json object per line or something? I've never come across that before (not as a formal format) so I don't think you'll find a standard tool to handle it. Obviously you can throw together a simple python script to either parse each line or just split the lines out and expand them in PostgreSQL itself. Commented Sep 20, 2021 at 13:10
  • I think so; each one represents a row in a table. I don't know Python, and I was hoping Postgres could handle simple JSON itself. No idea why it was changed from CSV to JSONL. Commented Sep 20, 2021 at 13:29

1 Answer


If you have to do this only in Postgres, create an auxiliary schema with go-between tables like this:

create schema jsons;
create table jsons.acoustid_meta(data jsonb);

Copy the file to the go-between table:

copy jsons.acoustid_meta from ...

And parse the JSON with a Postgres statement:

insert into musicbrainz.acoustid_meta
select id, track, artist, album, album_artist, track_no, disc_no, year
from jsons.acoustid_meta 
cross join jsonb_populate_record(null::musicbrainz.acoustid_meta, data);
truncate jsons.acoustid_meta;

Update. You can examine JSON values by referring to data, for example:

insert into musicbrainz.acoustid_meta
select id, track, artist, album, album_artist, track_no, disc_no, year
from jsons.acoustid_meta 
cross join jsonb_populate_record(null::musicbrainz.acoustid_meta, data)
where data->'created' is not null;
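(If it is more convenient to filter before loading, the same created/updated distinction can be applied while pre-processing the JSONL file. A hypothetical Python sketch that splits new records from replacements, mirroring the where clause on the created key:)

```python
import json

def split_created_updated(lines):
    """Split JSONL lines into new records (those with a "created" key)
    and replacements (those with an "updated" key)."""
    created, updated = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        (created if "created" in record else updated).append(record)
    return created, updated
```

The created list can then go through the plain insert path, while the updated list is handled separately (e.g. with an update or an ON CONFLICT upsert).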

6 Comments

But it doesn't look like the file IS JSON. It's line-separated JSON objects rather than one JSON array of objects. However, you could do it in two steps: read the file in as a single-column text table, then cast each value in that table to jsonb and expand that.
Yes, but this is OK. Each line will be imported as a separate row in the table, as-is. No casts needed.
Thanks, yes. I don't quite understand the syntax, but this does seem to work; I think this is the solution.
@klin I was assuming that it would try to parse the file as a single JSON value. Thinking about it, though, of course it will accept everything up to the newline as JSON, treat that as a row, and then COPY will see the newline and start a new row... etc. Cool.
@klin I have a supplementary question regarding filtering the insert with a where clause; if you could take a look at the updated question I would appreciate it.