PostgreSQL sequence based on another column

Question

Let's say I have a table as such:

Column   |     Type    |                        Notes
---------+-------------+----------------------------------------------------------
 id      | integer     | An ID that's FK to some other table
 seq     | integer     | Each ID gets its own seq number
 data    | text        | Just some text, totally irrelevant.

id + seq is a combined key.

What I'd like to see is:

ID  | SEQ   |                        DATA
----+-------+----------------------------------------------
 1  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 2     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 3     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 4     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 2  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 2     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 3     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 4     | Quick brown fox, lorem ipsum, lazy dog, etc etc.

As you can see, a combination of id and seq is unique.

I'm not sure how to set up my table (or insert statement?) to do this. I'd like to insert id and data, resulting in seq being a sub-sequence dependent on id.

If seq reflects (or should reflect) the order in which the rows are inserted, I'd rather use a timestamp that gets populated automatically and generate a seq number on the fly when selecting the rows.
– user330315, Commented May 12, 2015 at 14:49
I agree with @joop, any deletes could make seq unreliable if it's generated on the fly. What problem do you want to solve with this construct? (E.g. if your only goal is to make id, seq pairs unique, a single sequence will do that: it makes seq itself unique, which implies that id, seq pairs are unique too.)
– pozs, Commented May 15, 2015 at 8:34
@fthiella, just curious, what is the practical use of such a seq column? Depending on its intended use there can be different approaches. One important question here is: is it OK to have gaps in the sequence (due to deleted rows or rolled-back transactions)? If gaps are not OK, then it would be expensive to recalculate the sequence if it is persisted, which means that it may be better to generate it on the fly when needed. If gaps are OK, then a single global sequence (a standard auto-increment column) is enough.
– Vladimir Baranov, Commented May 15, 2015 at 10:34

Jay Kominek · Accepted Answer · 2015-05-13 02:55:44Z

50

+100

No problem! We're going to make two tables, things and stuff. stuff will be the table you describe in your question, and things is the one it refers to:

CREATE TABLE things (
    id serial primary key,
    name text
);

CREATE TABLE stuff (
    id integer references things,
    seq integer NOT NULL,
    notes text,
    primary key (id, seq)
);

Then we'll add a trigger on things that creates a new sequence every time a row is inserted:

CREATE FUNCTION make_thing_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  execute format('create sequence thing_seq_%s', NEW.id);
  return NEW;
end
$$;

CREATE TRIGGER make_thing_seq AFTER INSERT ON things FOR EACH ROW EXECUTE PROCEDURE make_thing_seq();

Now we'll end up with thing_seq_1, thing_seq_2, etc, etc...

Now another trigger on stuff so that it uses the right sequence each time:

CREATE FUNCTION fill_in_stuff_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  NEW.seq := nextval('thing_seq_' || NEW.id);
  RETURN NEW;
end
$$;

CREATE TRIGGER fill_in_stuff_seq BEFORE INSERT ON stuff FOR EACH ROW EXECUTE PROCEDURE fill_in_stuff_seq();

That'll ensure that when rows go into stuff, the id column is used to find the right sequence to call nextval on.

Here's a demonstration:

test=# insert into things (name) values ('Joe');
INSERT 0 1
test=# insert into things (name) values ('Bob');
INSERT 0 1
test=# select * from things;
 id | name
----+------
  1 | Joe
  2 | Bob
(2 rows)

test=# \d
              List of relations
 Schema |     Name      |   Type   |  Owner
--------+---------------+----------+----------
 public | stuff         | table    | jkominek
 public | thing_seq_1   | sequence | jkominek
 public | thing_seq_2   | sequence | jkominek
 public | things        | table    | jkominek
 public | things_id_seq | sequence | jkominek
(5 rows)

test=# insert into stuff (id, notes) values (1, 'Keychain');
INSERT 0 1
test=# insert into stuff (id, notes) values (1, 'Pet goat');
INSERT 0 1
test=# insert into stuff (id, notes) values (2, 'Family photo');
INSERT 0 1
test=# insert into stuff (id, notes) values (1, 'Redundant lawnmower');
INSERT 0 1
test=# select * from stuff;
 id | seq |        notes
----+-----+---------------------
  1 |   1 | Keychain
  1 |   2 | Pet goat
  2 |   1 | Family photo
  1 |   3 | Redundant lawnmower
(4 rows)

test=#
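The comments below note that make_thing_seq() raises an error when a things row is deleted and the same id is inserted again, because the old sequence still exists. A minimal sketch of one way to tolerate that, assuming PostgreSQL 9.5 or later (for CREATE SEQUENCE IF NOT EXISTS); the drop_thing_seq function and trigger are hypothetical additions, not part of the original answer:

```sql
-- Replacement for make_thing_seq(): no error if the sequence already exists.
CREATE OR REPLACE FUNCTION make_thing_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  execute format('create sequence if not exists thing_seq_%s', NEW.id);
  return NEW;
end
$$;

-- Hypothetical companion: drop the per-id sequence when the parent row
-- is deleted, so a re-inserted id starts numbering at 1 again.
CREATE FUNCTION drop_thing_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  execute format('drop sequence if exists thing_seq_%s', OLD.id);
  return OLD;
end
$$;

CREATE TRIGGER drop_thing_seq AFTER DELETE ON things
    FOR EACH ROW EXECUTE PROCEDURE drop_thing_seq();
```

This does not address the other concerns raised in the comments: updates to things.id are still not covered, and sequences can still leave gaps after rolled-back inserts.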

12 Comments

user330315 Over a year ago

The function make_thing_seq() will fail for the second insert of the same id value because you are not checking if such a sequence already exists.

Jay Kominek Over a year ago

Uhh it is using the id column which is a primary key, and thus unique. Trying to insert the same value of id will fail well before you get to the trigger function.

Erwin Brandstetter Over a year ago

things.id is the PK, but nothing keeps me from deleting and re-inserting the same id. UPDATE isn't covered, either. An AFTER trigger is too late in cases where you insert parent and child rows in the same statement. (Trigger on the child table runs BEFORE.) Even if it didn't, a data-modifying CTE manipulates both tables virtually at the same time. There are multiple ways how this can fail. Even while it works, sequences don't guarantee sequential numbers to begin with. Gaps in the numbering are to be expected.

ratijas Over a year ago

create sequence IF NOT EXISTS would definitely fix the possible problem of deleting and re-inserting the master ID.

Martijn Pieters Over a year ago

I much prefer Erwin's approach using ROW_NUMBER() and a view, where ROW_NUMBER() produces a virtual sequence (with no gaps). Generating a sequence for every unique group is .. a lot of extra database objects.

Joe Stefanelli · Accepted Answer · 2011-07-25 20:27:33Z

28

You could use a window function to assign your SEQ values, something like:

INSERT INTO YourTable
    (ID, SEQ, DATA)
    SELECT ID, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY DATA), DATA
        FROM YourSource
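The statement above numbers each batch from 1, so it collides with existing rows when the target table already holds rows for the same ID. A rough sketch of one way to continue from the current maximum instead; the table names your_source and your_table and the sample rows are hypothetical stand-ins for the placeholders in the answer, and the statement is not safe against concurrent writers on its own:

```sql
-- Hypothetical tables mirroring YourSource / YourTable from the answer.
CREATE TABLE your_source (id integer, data text);
CREATE TABLE your_table (
    id   integer,
    seq  integer,
    data text,
    PRIMARY KEY (id, seq)
);

INSERT INTO your_source VALUES (1, 'a'), (1, 'b'), (2, 'c');

-- Offset ROW_NUMBER() by the highest seq already stored for each id
-- (0 when the id has no rows yet).
INSERT INTO your_table (id, seq, data)
SELECT s.id,
       COALESCE(m.max_seq, 0)
         + ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY s.data),
       s.data
FROM your_source s
LEFT JOIN (SELECT id, max(seq) AS max_seq
           FROM your_table
           GROUP BY id) m ON m.id = s.id;
```

Running the last statement a second time should append seq 3 and 4 for id 1 rather than failing on the primary key.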

3 Comments

HLL Over a year ago

Interesting method... (I've been looking for something like this for ages!) What are the consequences of using partitioning/row number? Is it safe? When might it not work?

JNevill Over a year ago

Window functions are very common and you probably aren't going to run into cases "when it might not work". Check out the link in Joe's answer. Once you start using them it will open up a new world of possibilities in your SQL statements.

Andreas Baumgart Over a year ago

This should fail if you are ever going to delete any but the last record. In this case the row_number() will collide with the last record. In other words this is going to fail in a lot of very common scenarios at some point.

user330315 · Accepted Answer · 2015-05-13 05:29:23Z

7

If seq reflects (or should reflect) the order in which the rows are inserted, I'd rather use a timestamp that gets populated automatically and generate the sequence number on the fly when selecting the rows using row_number():

create table some_table
( 
  id          integer   not null,
  inserted_at timestamp not null default current_timestamp,
  data text
);

Then, to get the seq column, you can do:

select id,  
       row_number() over (partition by id order by inserted_at) as seq,
       data
from some_table
order by id, seq;

The select is however going to be a bit slower compared to using a persisted seq column (especially with an index on id, seq).

If that becomes a problem you can either look into using a materialized view, or adding the seq column and then updating it on a regular basis (I would not do this in a trigger for performance reasons).
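The materialized-view option could look roughly like this. It is only a sketch: the view name some_table_with_seq is hypothetical, and the view shows stale numbers until it is refreshed.

```sql
-- Persist the computed numbering so reads do not recompute row_number().
create materialized view some_table_with_seq as
select id,
       row_number() over (partition by id order by inserted_at) as seq,
       data
from some_table;

-- Index for lookups by (id, seq); a unique index is also what
-- REFRESH ... CONCURRENTLY requires.
create unique index on some_table_with_seq (id, seq);

-- Re-run after inserts or deletes to bring the numbering up to date.
refresh materialized view some_table_with_seq;
```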

SQLFiddle example: http://sqlfiddle.com/#!15/db69b/1
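One caveat with ordering by inserted_at alone: current_timestamp is fixed at the start of the transaction, so rows inserted in the same transaction tie and their relative order is not guaranteed. A sketch of an alternative table definition with a tie-breaker, assuming a hypothetical extra column row_id:

```sql
-- Alternative definition of some_table with a tie-breaker column.
create table some_table
(
  id          integer   not null,
  -- clock_timestamp() advances within a transaction, unlike current_timestamp
  inserted_at timestamp not null default clock_timestamp(),
  -- hypothetical column: a global sequence that breaks remaining ties
  row_id      bigserial,
  data        text
);

select id,
       row_number() over (partition by id order by inserted_at, row_id) as seq,
       data
from some_table
order by id, seq;
```

With row_id in place, ordering by row_id alone would also give a stable result; the timestamp is kept here only to stay close to the answer.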

4 Comments

fthiella Over a year ago

Maybe it's better to use a sequence instead of a timestamp? Can it happen that two rows share the same timestamp? The idea is simple but good. However, if you delete a record the sequence will be recalculated; I don't know about the OP, but I prefer to have gaps.

user330315 Over a year ago

@fthiella: you can't make a sequence dependent on the id column, you would need one sequence for each possible value of id to achieve this (essentially what Jay is suggesting in his answer)

Jay Kominek Over a year ago

Since current_timestamp is the start of the transaction, it's identical for all the rows you insert in your sqlfiddle example (and any other single transaction). Is row_number() then just a function of the order in which the rows are read off the disk? Is that guaranteed to remain stable in operation / across backups and restores?

elsadek Over a year ago

You could write a conditional 'nextval' in the default, which requires reading the value of the company name in the script.

Bronumski · Accepted Answer · 2015-05-12 15:56:34Z

1

Just a guess.

INSERT INTO TABLE (ID, SEQ, DATA)
VALUES
(
 IDVALUE,
 -- COALESCE covers the first row for an ID, where max(SEQ) is NULL
 (SELECT COALESCE(max(SEQ), 0) + 1 FROM TABLE WHERE ID = IDVALUE),
 DATAVALUE
);
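Two concurrent sessions running a max + 1 insert can compute the same value, as the comments below describe. One possible alternative is a per-ID counter row that is bumped atomically. This is a sketch assuming PostgreSQL 9.5 or later (for ON CONFLICT); the tables seq_counters and your_table and the literal values are hypothetical:

```sql
-- Hypothetical target table and counter table (one counter row per id).
CREATE TABLE your_table (
    id   integer,
    seq  integer,
    data text,
    PRIMARY KEY (id, seq)
);

CREATE TABLE seq_counters (
    id       integer PRIMARY KEY,
    last_seq integer NOT NULL
);

-- Create the counter at 1 or increment it, then use the returned value.
-- The row lock on the counter makes concurrent inserts for the same id
-- wait for each other instead of computing the same seq.
WITH next AS (
    INSERT INTO seq_counters AS c (id, last_seq)
    VALUES (1, 1)
    ON CONFLICT (id) DO UPDATE SET last_seq = c.last_seq + 1
    RETURNING last_seq
)
INSERT INTO your_table (id, seq, data)
SELECT 1, last_seq, 'Quick brown fox' FROM next;
```

Because the counter update is part of the same transaction, a rollback also undoes the increment; deleted rows still leave gaps.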

edited May 12, 2015 at 15:56

Bronumski

answered May 12, 2015 at 15:01

Abercrombieande

3 Comments

user330315 Over a year ago

That's essentially Joe's answer just not as efficient

Radek Postołowicz Over a year ago

Isn't an exclusive table lock needed for this to work correctly? What if two inserts like this run concurrently?

Clint Pachl Over a year ago

I think these max+1 solutions may prove to be unreliable. To test on Postgresql, I created tbl and inserted one row: id=1. I then opened two connections and started a transaction on each. I executed INSERT INTO tbl SELECT MAX(id)+1 FROM tbl. The first insert completed and the second waited for the first, as expected. I committed the first transaction and the second one immediately output: ERROR: duplicate key value violates unique constraint "tbl_pkey" DETAIL: Key (id)=(2) already exists. I committed the second transaction and it automatically rolled back.

Steve Chambers · Accepted Answer · 2015-05-14 15:14:31Z

0

Here's a simple way using standard SQL:

INSERT INTO mytable (id, seq, data)
SELECT << your desired ID >>,
       COUNT(*) + 1,
       'Quick brown fox, lorem ipsum, lazy dog, etc etc.'
FROM mytable
WHERE id = << your desired ID (same as above) >>;

See SQL Fiddle Demo.

(If you wanted to be a bit cleverer you could consider creating a trigger to update the row using the same method immediately after an insert.)
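A sketch of what such a trigger might look like, with two deliberate changes from the description above: it fills seq in a BEFORE INSERT trigger rather than updating the row afterwards, and it uses max(seq) + 1 instead of COUNT(*) + 1 so that a deleted row does not cause a collision. The function and trigger names are hypothetical, the table definition is a guess at the answer's mytable, and the default READ COMMITTED isolation level is assumed:

```sql
-- Assumed shape of the answer's table.
CREATE TABLE IF NOT EXISTS mytable (
    id   integer NOT NULL,
    seq  integer NOT NULL,
    data text,
    PRIMARY KEY (id, seq)
);

CREATE FUNCTION mytable_fill_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  -- Serialize concurrent inserts for the same id until the transaction ends.
  perform pg_advisory_xact_lock(NEW.id);
  -- Next number for this id; 1 when the id has no rows yet.
  select coalesce(max(seq), 0) + 1 into NEW.seq
  from mytable
  where id = NEW.id;
  return NEW;
end
$$;

CREATE TRIGGER mytable_fill_seq BEFORE INSERT ON mytable
    FOR EACH ROW EXECUTE PROCEDURE mytable_fill_seq();
```

With the trigger in place, an insert only needs to supply id and data. Note that the advisory lock key is just the id value, so it would clash with any other code that takes advisory locks on the same numbers.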

Le Droid · Accepted Answer · 2019-04-10 19:59:18Z

I had the same need: to dynamically store a tree-like structure without adding all IDs at once.
I prefer not to use a separate sequence for each group, as there could be thousands of them.
It runs in an intensive multi-processing environment, so it has to be race-condition-proof.
Here is the insert function for the 1st level. Other levels follow the same principle.

Each group has independent, non-reusable sequential IDs. The function receives a group name and a sub-group name, returns the existing ID if there is one, or otherwise creates it and returns the new ID.
I tried a loop to have a single select, but the code is just as long and harder to read.

CREATE OR REPLACE FUNCTION getOrInsert(myGroupName TEXT, mySubGroupName TEXT)
  RETURNS INT AS
$BODY$
DECLARE
   myId INT;
BEGIN -- 1st try to get it if it already exists
   SELECT id INTO myId FROM myTable
      WHERE groupName=myGroupName AND subGroupName=mySubGroupName;
   IF NOT FOUND THEN
      -- Only 1 session can get it but others can read
      LOCK TABLE myTable IN SHARE ROW EXCLUSIVE MODE; 
      -- 2nd try in case of race condition
      SELECT id INTO myId FROM myTable
         WHERE groupName=myGroupName AND subGroupName=mySubGroupName;
      IF NOT FOUND THEN -- Doesn't exist. Get next ID for this group.
         SELECT COALESCE(MAX(id), 0)+1 INTO myId FROM myTable
            WHERE groupName=myGroupName;
         INSERT INTO myTable (groupName, id, subGroupName)
            VALUES (myGroupName, myId, mySubGroupName);
      END IF;
   END IF;
   RETURN myId;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE COST 100;

To try it:

CREATE TABLE myTable (GroupName TEXT, SubGroupName TEXT, id INT);
SELECT getOrInsert('groupA', 'subgroupX'); -- Returns 1
...
SELECT * FROM myTable;
 groupname | subgroupname | id 
-----------+--------------+----
 groupA    | subgroupX    |  1
 groupA    | subgroupY    |  2
 groupA    | subgroupZ    |  3
 groupB    | subgroupY    |  1
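The function relies on the table lock for correctness. As a safety net against rows written by other code paths, unique constraints could be added as well; this is a sketch, and the constraint names are hypothetical:

```sql
-- Reject duplicate IDs within a group and duplicate sub-groups within a group,
-- even if some writer bypasses getOrInsert().
ALTER TABLE myTable
    ADD CONSTRAINT mytable_group_id_key UNIQUE (groupName, id),
    ADD CONSTRAINT mytable_group_subgroup_key UNIQUE (groupName, subGroupName);
```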

Ryan Kinal · Accepted Answer · 2011-07-25 20:47:56Z

-3

PostgreSQL supports grouped unique columns, as such:

CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    UNIQUE (a, c)
);

See PostgreSQL Documentation - Section 5.3.3

Easy :-)

1 Comment

Incognito Over a year ago

The unique part isn't my main concern, it's getting the data to be input in that way as if it were a sub-sequence.

Nathan · Accepted Answer · 2011-07-25 20:49:43Z

-4

I don't have any postgresql-specific experience, but can you use a subquery in your insert statement? Something like, in Mysqlish,

INSERT INTO MYTABLE SET 
   ID=4, 
   SEQ=(  SELECT MAX(SEQ)+1 FROM MYTABLE WHERE ID=4  ),
   DATA="Quick brown fox, lorem ipsum, lazy dog, etc etc."

2 Comments

user330315 Over a year ago

That syntax is invalid SQL. There is no SET for insert

Yannoff Over a year ago

Actually this syntax is MySQL specific, but the main idea is here
