PostgreSQL table layout and indexing performance

Question

I built a table (4 million rows) using:

INSERT INTO
  cords ("X", "Y")
SELECT
  x, y
FROM
  generate_series(-1000,1000) x,
  generate_series(-1000,1000) y;

Assuming no more rows will be inserted, if I wanted to add many columns of data related to each row, would it be better to add the columns to the same table, or to map X and Y to an int that references a separate table? Also, should I use a compound index for X (Primary Key) and Y (Secondary Key)?

I plan on querying and updating hundreds to thousands of rows at once, very frequently. I am trying to find information on this scenario to determine the pros and cons of different setups.

Can someone direct me towards a relevant information source or provide insight that might point me in the right direction? Thank you!

How many values (rows) do you expect the other table to have? The only reason I could see for keeping it in the same table (memory-wise) would be if you got close to all 4 million, and only a few rows would have nulls in them.
– Bergi, Commented Nov 21, 2019 at 0:21
There would be one row in the second table for each row in the first. I see what you are implying, and many X,Y cords will never be updated. Maybe I should split my columns, keeping the hard constants in the cords table and moving any column that will be updated into a separate table, reducing the overall size of the database. This still leaves me wondering what type/setup of index I should look at.
– Cody DeGhetto, Commented Nov 21, 2019 at 0:31
Yes, put the updateable things in a separate table. And then don't store the hard constants in your database at all.
– Bergi, Commented Nov 21, 2019 at 0:33
The constants would all be bools stating whether a property is supported for each point. I say constant because the values will never change, but they are still important to each point.
– Cody DeGhetto, Commented Nov 21, 2019 at 0:47
If the constants can be derived with a simple formula (such as "property is supported when x < 500"), don't store them but compute them dynamically. If your constants are randomly initialised but stay unchanged afterwards, I wouldn't call that a "hard constant"; sure, store it in your table.
– Bergi, Commented Nov 21, 2019 at 0:54
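
As a rough illustration of "compute them dynamically", a derived flag can be produced inside the query instead of being stored. This is only a sketch: the rule x < 500 is the example formula from the comment above, and the column name is_supported and the small coordinate ranges are made up for the demonstration.

```sql
-- Derive the flag on the fly for a small patch of the grid.
-- "is_supported" is a hypothetical name; x < 500 is the example rule
-- from the comment above, not a real requirement.
SELECT
  x,
  y,
  (x < 500) AS is_supported
FROM
  generate_series(498, 501) AS x,
  generate_series(0, 1) AS y;
```

Nothing is written to disk here; the same expression could equally be placed in a view over whatever table holds the updateable data.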

Bergi · Accepted Answer · 2019-11-21 00:56:26Z


You shouldn't create or keep this table at all. There's no reason for 4 million rows with a totally predictable structure to sit on your disk. If you want something like a foreign key guaranteeing that "the grid position must exist in the cords table", just drop that idea and use a CHECK constraint that requires the coordinate columns to fall in that range. Or even use a custom DOMAIN type.
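
A minimal sketch of both options, assuming the grid bounds of -1000 to 1000 from the question. The names grid_coord, point_props and is_walkable are hypothetical and only serve the illustration; lowercase x and y stand in for the question's "X" and "Y".

```sql
-- Option 1: a custom DOMAIN that carries the range rule with the type.
CREATE DOMAIN grid_coord AS integer
  CHECK (VALUE BETWEEN -1000 AND 1000);

-- Hypothetical table using the domain for both coordinates.
CREATE TABLE point_props (
  x grid_coord NOT NULL,
  y grid_coord NOT NULL,
  is_walkable boolean NOT NULL DEFAULT false
);

-- Option 2: the same rule written as plain CHECK constraints.
-- Shown as a comment so the script does not create a second table:
--   x integer NOT NULL CHECK (x BETWEEN -1000 AND 1000),
--   y integer NOT NULL CHECK (y BETWEEN -1000 AND 1000)

-- This insert should be rejected, because 1001 is outside the domain:
-- INSERT INTO point_props (x, y) VALUES (1001, 0);
```

Either way, the database rejects out-of-range coordinates without needing a 4-million-row lookup table to reference.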

Should I use a compound index for X (Primary Key) and Y (Secondary Key)?

Neither is a key on its own, as every value of X and of Y appears many times (2001 times each in this grid). Only the combination of them is unique, so you'd use a compound primary key consisting of X and Y, and the index that backs it will cover both columns as well.
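
A compound primary key could be declared roughly as follows. The table point_state and the column reading are hypothetical names, and the 3x3 neighbourhood update is just one plausible shape for the "hundreds to thousands of rows at once" workload described in the question.

```sql
-- Hypothetical table for the frequently updated data.
CREATE TABLE point_state (
  x integer NOT NULL,
  y integer NOT NULL,
  reading double precision NOT NULL DEFAULT 0,
  PRIMARY KEY (x, y)  -- PostgreSQL creates a unique index on (x, y) for this
);

-- Seed a small patch so the update below has rows to touch.
INSERT INTO point_state (x, y)
SELECT x, y
FROM generate_series(0, 20) AS x,
     generate_series(10, 30) AS y;

-- Update the 3x3 neighbourhood around the point (10, 20).
UPDATE point_state
SET reading = reading + 1
WHERE x BETWEEN 9 AND 11
  AND y BETWEEN 19 AND 21;
```

Whether the planner actually uses the primary key index for such range predicates depends on table size and statistics, so it is worth checking with EXPLAIN on realistic data.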

edited Nov 21, 2019 at 0:56

answered Nov 21, 2019 at 0:25


2 Comments

Cody DeGhetto Over a year ago

So my goal is two-fold: map data to a grid of points, and allow information on any one point to affect the points that neighbor it. This information will be tracked over a long period of time, with individual points being updated constantly. I am not sure I understand your reference to a CHECK constraint or a custom DOMAIN type. Can you point me towards something that explains these? Thank you.

Bergi Over a year ago

@CodyDeGhetto Added some links
