
The data I want to store has these characteristics:

  • There are a finite number of fields (I don't expect to add new fields);
  • There are some columns that are common to all sets of data (a category field, for instance);
  • There are some columns that are specific to individual sets of data (each category needs its own fields).

Here's how it would look in a regular table:

Table

I'm having trouble figuring out the best way to store this data in a database.

Below are the ideas I have had so far:

  • Store it exactly as in the table above, as one wide table (I would have many NULL values);
  • Divide the categories into tables (I would use joins when needed);
  • Use JSON type for storing the values (no NULL values and having it all in same table).
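To make the first and third ideas concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a real database server. The table names (`item_wide`, `item_json`) and the category-specific columns (`pages`, `wattage`) are hypothetical and only serve as illustration. The split-table idea is left out to keep the sketch short.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Idea 1: one wide table; columns that do not apply to a row stay NULL.
conn.execute("""
    CREATE TABLE item_wide (
        id INTEGER PRIMARY KEY,
        category TEXT NOT NULL,   -- common to every row
        pages INTEGER,            -- hypothetical: only used by 'book'
        wattage REAL              -- hypothetical: only used by 'lamp'
    )
""")
conn.executemany(
    "INSERT INTO item_wide (id, category, pages, wattage) VALUES (?, ?, ?, ?)",
    [(1, "book", 320, None), (2, "lamp", None, 40.0)],
)

# Idea 3: common columns plus one JSON document per row (stored as TEXT here).
conn.execute("""
    CREATE TABLE item_json (
        id INTEGER PRIMARY KEY,
        category TEXT NOT NULL,
        attrs TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO item_json (id, category, attrs) VALUES (?, ?, ?)",
    [
        (1, "book", json.dumps({"pages": 320})),
        (2, "lamp", json.dumps({"wattage": 40.0})),
    ],
)

wide_row = conn.execute(
    "SELECT pages, wattage FROM item_wide WHERE id = 1"
).fetchone()
json_attrs = json.loads(
    conn.execute("SELECT attrs FROM item_json WHERE id = 1").fetchone()[0]
)
print(wide_row)    # the lamp-only column comes back as None (NULL)
print(json_attrs)  # only the keys that apply to this row are present
```

In the wide table the database enforces the column types; in the JSON variant that checking moves into the application.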

So my questions are:

  1. Is one of these solutions (or one I have not thought of) better for this case?
  2. Are there factors other than the ones presented here that I should consider when making this decision?

3 Answers

2

Unless you have very many columns (roughly 100 or more), it is usually better to use normal columns. NULL values take practically no storage space in PostgreSQL: each one costs only a bit in the row's null bitmap.

On the other hand, if you have queries that can use any of these columns in the WHERE condition, and you compare with =, a single GIN index on a jsonb column might be better than many B-tree indexes, because maintaining all those B-tree indexes on every data modification would cost more.
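The idea behind that trade-off can be illustrated with a toy model in Python. This is not how PostgreSQL implements GIN; it is only a sketch of an inverted index, with hypothetical rows and a hypothetical helper `find_equal`, showing how one index over key/value pairs can answer equality conditions on any attribute, where the column layout would need one index per searchable column.

```python
from collections import defaultdict

# Hypothetical rows: each has a different set of category-specific attributes.
rows = {
    1: {"category": "book", "pages": 320},
    2: {"category": "lamp", "wattage": 40},
    3: {"category": "book", "pages": 120},
}

# One inverted index for all attributes: (key, value) -> ids of matching rows.
inverted = defaultdict(set)
for row_id, attrs in rows.items():
    for key, value in attrs.items():
        inverted[(key, value)].add(row_id)


def find_equal(**conditions):
    """Return the ids of rows that satisfy every key = value condition."""
    matches = [inverted.get(pair, set()) for pair in conditions.items()]
    # No conditions means every row matches.
    return set.intersection(*matches) if matches else set(rows)


books = find_equal(category="book")
short_books = find_equal(category="book", pages=120)
print(books, short_books)
```

Every insert or update touches this one structure instead of many separate ones. In PostgreSQL itself, the corresponding lookups would typically be written with the jsonb containment operator `@>`.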

The definitive answer depends on the SQL statements that you plan to run on that table.


3 Comments

I did not understand why json would be better if I had to use WHERE conditions. Intuitively, I would think that these conditions work better with regular columns. Could you explain it?
And what's the standard for "many columns"? 10, 100, 1000? I believe part of it depends on the application, but do you think there's a limit (some rule of thumb) we should respect?
I have added an explanation and an estimate of what would be "many" - but that is just my personal gut feeling.
1

You have laid out the three options pretty well. Things to consider are:

  • Performance
  • Data size
  • Ease of maintenance
  • Flexibility
  • Security

Note that you don't even allude to security considerations. But security at the table level is usually a tad simpler than at the column level and might be important for regulated data such as PII (personally identifiable information).

The primary strength of the JSON solution is flexibility. It is easy to add new columns. But you don't need that. JSON has a cost in data size and data type flexibility (notably JSON doesn't support date/times explicitly).

A multiple table solution requires duplicating the primary key but may result in much less storage overall if the columns really are sparse. The "may" may also depend on the data type. A NULL string for instance occupies less space than a NULL float in a table record.

The joins on multiple tables will be 1-1 on primary keys. These should be pretty fast.

What would I do? Unless the answer is obvious, I would dump the data into a single table with a bunch of columns. If that table starts to get unwieldy, then I would think about splitting it into separate tables -- but still have one table for the common columns. The details of one or multiple tables can be hidden behind a view.
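As a rough sketch of that last point, the following uses Python's built-in `sqlite3` module as a stand-in for a real database server. All table, view, and column names (`item`, `book_details`, `lamp_details`, `item_all`, `pages`, `wattage`) are hypothetical. One table holds the common columns, each category gets its own table keyed by the same primary key, and a view joins them back into the wide shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Columns shared by every category.
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        category TEXT NOT NULL
    );
    -- One table per category, sharing the primary key (1-1 join).
    CREATE TABLE book_details (
        item_id INTEGER PRIMARY KEY REFERENCES item(id),
        pages INTEGER NOT NULL
    );
    CREATE TABLE lamp_details (
        item_id INTEGER PRIMARY KEY REFERENCES item(id),
        wattage REAL NOT NULL
    );
    -- The view hides the split and presents one wide row per item.
    CREATE VIEW item_all AS
    SELECT i.id, i.category, b.pages, l.wattage
    FROM item AS i
    LEFT JOIN book_details AS b ON b.item_id = i.id
    LEFT JOIN lamp_details AS l ON l.item_id = i.id;

    INSERT INTO item VALUES (1, 'book'), (2, 'lamp');
    INSERT INTO book_details VALUES (1, 320);
    INSERT INTO lamp_details VALUES (2, 40.0);
""")

view_rows = conn.execute(
    "SELECT id, category, pages, wattage FROM item_all ORDER BY id"
).fetchall()
print(view_rows)
```

Queries written against `item_all` do not need to change if the underlying storage later moves between one table and several.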


0

It depends on how much data you want to store, but as long as it is finite, it shouldn't make a big difference whether it contains a lot of NULLs or not.

2 Comments

The number of columns is finite, but the number of rows is not. It can grow quite large.
Oh sorry, that's my fault, you are right. But even knowing that, I don't think it changes much: in a lot of databases a NULL doesn't take up any storage, so I guess it's OK.
