71

Lets say I have a table as such:

Column   |     Type    |                        Notes
---------+------------ +----------------------------------------------------------
 id      | integer     | An ID that's FK to some other table
 seq     | integer     | Each ID gets its own seq number
 data    | text        | Just some text, totally irrelevant.

id + seq is a combined key.

What I'd like to see is:

ID  | SEQ   |                        DATA
----+------ +----------------------------------------------
 1  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 2     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 3     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 1  | 4     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 2  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 1     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 2     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 3     | Quick brown fox, lorem ipsum, lazy dog, etc etc.
 3  | 4     | Quick brown fox, lorem ipsum, lazy dog, etc etc.

As you can see, a combination of id and seq is unique.

I'm not sure how to set up my table (or insert statement?) to do this. I'd like to insert id and data, resulting in seq being a sub-sequence dependent on id.

11
  • 4
    If seq reflects (or should reflect) the order in which the rows are inserted, I'd rather use a timestamp that gets populated automatically and generate a seq number on the fly when selecting the rows. Commented May 12, 2015 at 14:49
  • 2
    What should happen when a record is deleted? (or updated) Commented May 12, 2015 at 16:08
  • 2
    I agree with @joop, any deletes could make seq unreliable if it's generated on-the-fly. What problem do you want to solve with this construct? (f.ex. if your only goal is to make id, seq pairs unique, a single sequence will do that -- in fact it'll make seq unique, but that implies id, seq pairs uniqueness) Commented May 15, 2015 at 8:34
  • 2
    @fthiella, just curious, what is the practical use of such seq column? Depending on its intended use there can be different approaches. One important question here is: is it OK to have gaps in the sequence (due to deleted rows or incomplete rolled back transactions)? If gaps are not OK, then it would be expensive to recalculate the sequence if it is persisted, which means that it may be better to generate it on the fly when needed. If gaps are OK, then single global sequence (standard auto-increment column) is enough. Commented May 15, 2015 at 10:34
  • 5
    @fthiella: A plain serial is the way to go. Commented May 17, 2015 at 20:52

8 Answers 8

50
+100

No problem! We're going to make two tables, things and stuff. stuff will be the table you describe in your question, and things is the one it refers to:

CREATE TABLE things (
    id serial primary key,
    name text
);

CREATE TABLE stuff (
    id integer references things,
    seq integer NOT NULL,
    notes text,
    primary key (id, seq)
);

Then we'll set things up with a trigger that will create a new sequence every time a row is created:

CREATE FUNCTION make_thing_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  execute format('create sequence thing_seq_%s', NEW.id);
  return NEW;
end
$$;

CREATE TRIGGER make_thing_seq AFTER INSERT ON things FOR EACH ROW EXECUTE PROCEDURE make_thing_seq();

Now we'll end up with thing_seq_1, thing_seq_2, etc, etc...

Now another trigger on stuff so that it uses the right sequence each time:

CREATE FUNCTION fill_in_stuff_seq() RETURNS trigger
    LANGUAGE plpgsql
    AS $$
begin
  NEW.seq := nextval('thing_seq_' || NEW.id);
  RETURN NEW;
end
$$;

CREATE TRIGGER fill_in_stuff_seq BEFORE INSERT ON stuff FOR EACH ROW EXECUTE PROCEDURE fill_in_stuff_seq();

That'll ensure that when rows go into stuff, the id column is used to find the right sequence to call nextval on.

Here's a demonstration:

test=# insert into things (name) values ('Joe');
INSERT 0 1
test=# insert into things (name) values ('Bob');
INSERT 0 1
test=# select * from things;
 id | name
----+------
  1 | Joe
  2 | Bob
(2 rows)

test=# \d
              List of relations
 Schema |     Name      |   Type   |  Owner
--------+---------------+----------+----------
 public | stuff         | table    | jkominek
 public | thing_seq_1   | sequence | jkominek
 public | thing_seq_2   | sequence | jkominek
 public | things        | table    | jkominek
 public | things_id_seq | sequence | jkominek
(5 rows)

test=# insert into stuff (id, notes) values (1, 'Keychain');
INSERT 0 1
test=# insert into stuff (id, notes) values (1, 'Pet goat');
INSERT 0 1
test=# insert into stuff (id, notes) values (2, 'Family photo');
INSERT 0 1
test=# insert into stuff (id, notes) values (1, 'Redundant lawnmower');
INSERT 0 1
test=# select * from stuff;
 id | seq |        notes
----+-----+---------------------
  1 |   1 | Keychain
  1 |   2 | Pet goat
  2 |   1 | Family photo
  1 |   3 | Redundant lawnmower
(4 rows)

test=#
Sign up to request clarification or add additional context in comments.

12 Comments

The function make_thing_seq() will fail for the second insert of the same id value because you are not checking if such a sequence already exists.
Uhh it is using the id column which is a primary key, and thus unique. Trying to insert the same value of id will fail well before you get to the trigger function.
things.id is the PK, but nothing keeps me from deleting and re-inserting the same id. UPDATE isn't covered, either. An AFTER trigger is too late in cases where you insert parent and child rows in the same statement. (Trigger on the child table runs BEFORE.) Even if it didn't, a data-modifying CTE manipulates both tables virtually at the same time. There are multiple ways how this can fail. Even while it works, sequences don't guarantee sequential numbers to begin with. Gaps in the numbering are to be expected.
create sequence IF NOT EXISTS would definitely fix the possible problem of deleting and re-inserting the master ID.
I much prefer Erwin's approach using ROW_NUMBER() and a view, where ROW_NUMBER() produces a virtual sequence (with no gaps). Generating a sequence for every unique group is .. a lot of extra database objects.
|
28

You could use a window function to assign your SEQ values, something like:

INSERT INTO YourTable
    (ID, SEQ, DATA)
    SELECT ID, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY DATA), DATA
        FROM YourSource

3 Comments

Interesting method... (Looking for something like this for ages!) what are the consequences of using partitioning/row number, Is it safe? when might it not work?
Window functions are very common and you probably aren't going to run into cases "when it might not work". Check out the link in Joe's answer. Once you start using them it will open up a new world of possibilities in your SQL statements.
This should fail if you are ever going to delete any but the last record. In this case the row_number() will collide with the last record. In other words this is going to fail in a lot of very common scenarios at some point.
7

If seq reflects (or should reflect) the order in which the rows are inserted, I'd rather use a timestamp that gets populated automatically and generate the sequence number on the fly when selecting the rows using row_number():

create table some_table
( 
  id          integer   not null,
  inserted_at timestamp not null default current_timestamp,
  data text
);

The to get the seq column, you can do:

select id,  
       row_number() over (partition by id order by inserted_at) as seq,
       data
from some_table
order by id, seq;

The select is however going to be a bit slower compared to using a persisted seq column (especially with an index on id, seq).

If that becomes a problem you can either look into using a materialized view, or adding the seq column and then updating it on a regular basis (I would not do this in a trigger for performance reasons).

SQLFiddle example: http://sqlfiddle.com/#!15/db69b/1

4 Comments

maybe it's better to use a sequence instead of a timestamp? can it happen that two rows share the same timestamp? the idea is very simple but good, but if you delete a record the sequence will be calculated again, I don't know about the OP but I prefer to have gaps
@fthiella: you can't make a sequence dependent on the id column, you would need one sequence for each possible value of id to achieve this (essentially what Jay is suggesting in his answer)
Since current_timestamp is the start of the transaction, it's identical for all the rows you insert in your sqlfiddle example (and any other single transaction). Is row_number() then just a function the order the rows are read off the disk in? Is that guaranteed to remain stable in operation / across backups and restores?
You could write conditional 'nextval' in the default, which requires to read the value of the company name in the script.
1

Just a guess.

INSERT INTO TABLE (ID, SEQ, DATA)
VALUES
(
 IDVALUE,
 (SELECT max(SEQ) +1 FROM TABLE WHERE ID = IDVALUU),
 DATAVALUE
);

3 Comments

That's essentially Joe's answer just not as efficient
Isn't table exclusive lock needed for correct work? What if two inserts like this run concurently?
I think these max+1 solutions may prove to be unreliable. To test on Postgresql, I created tbl and inserted one row: id=1. I then opened two connections and started a transaction on each. I executed INSERT INTO tbl SELECT MAX(id)+1 FROM tbl. The first insert completed and the second waited for the first, as expected. I committed the first transaction and the second one immediately output: ERROR: duplicate key value violates unique constraint "tbl_pkey" DETAIL: Key (id)=(2) already exists. I committed the second transaction and it automatically rolled back.
0

Here's a simple way using standard SQL:

INSERT INTO mytable (id, seq, data)
SELECT << your desired ID >>,
       COUNT(*) + 1,
       'Quick brown fox, lorem ipsum, lazy dog, etc etc.'
FROM mytable
WHERE id = << your desired ID (same as above) >>;

See SQL Fiddle Demo.

(If you wanted to be a bit cleverer you could consider creating a trigger to update the row using the same method immediately after an insert.)

Comments

0

I had the same need to dynamicaly store a tree-like structure, not to add all IDs at once.
I prefer not use sequence table for each group as there could be thousands of them.
It run in an intensive multi-processing environment, so it has to be race-condition-proof.
Here the insert fonction for the 1st level. Other levels follow the same principle.

Each group as independent non-reusable sequencial IDs, the function receives a group name & sub-group name, gives you the existing ID or creates it & returns the new ID.
I tried a loop to have a single select, but the code is as long & harder to read.

CREATE OR REPLACE FUNCTION getOrInsert(myGroupName TEXT, mySubGroupName TEXT)
  RETURNS INT AS
$BODY$
DECLARE
   myId INT;
BEGIN -- 1st try to get it if it already exists
   SELECT id INTO myId FROM myTable
      WHERE groupName=myGroupName AND subGroupName=mySubGroupName;
   IF NOT FOUND THEN
      -- Only 1 session can get it but others can read
      LOCK TABLE myTable IN SHARE ROW EXCLUSIVE MODE; 
      -- 2nd try in case of race condition
      SELECT id INTO myId FROM myTable
         WHERE groupName=myGroupName AND subGroupName=mySubGroupName;
      IF NOT FOUND THEN -- Doesn't exist. Get next ID for this group.
         SELECT COALESCE(MAX(id), 0)+1 INTO myId FROM myTable
            WHERE groupName=myGroupName;
         INSERT INTO myTable (groupName, id, subGroupName)
            VALUES (myGroupName, myId, mySubGroupName);
      END IF;
   END IF;
   RETURN myId;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE COST 100;

To try it:

CREATE TABLE myTable (GroupName TEXT, SubGroupName TEXT, id INT);
SELECT getOrInsert('groupA', 'subgroupX'); -- Returns 1
...
SELECT * FROM myTable;
 groupname | subgroupname | id 
-----------+--------------+----
 groupA    | subgroupX    |  1
 groupA    | subgroupY    |  2
 groupA    | subgroupZ    |  3
 groupB    | subgroupY    |  1

Comments

-3

PostgreSQL supports grouped unique columns, as such:

CREATE TABLE example (
    a integer,
    b integer,
    c integer,
    UNIQUE (a, c)
);

See PostgreSQL Documentation - Section 5.3.3

Easy :-)

1 Comment

The unique part isn't my main concern, it's getting the data to be input in that way as if it were a sub-sequence.
-4

I don't have any postgresql-specific experience, but can you use a subquery in your insert statement? Something like, in Mysqlish,

INSERT INTO MYTABLE SET 
   ID=4, 
   SEQ=(  SELECT MAX(SEQ)+1 FROM MYTABLE WHERE ID=4  ),
   DATA="Quick brown fox, lorem ipsum, lazy dog, etc etc."

2 Comments

That syntax is invalid SQL. There is no SET for insert
Actually this syntax is MySQL specific, but the main idea is here

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.