2

I have a model that has a column that looks like this (example):

column = [{:A => "1", :B => "2"}, [a, b, c, ..., n]]

Where a, b, c and n are also hashes, like this (example):

a = {:X => "x", :Y => "y"}

Currently, if I call a record m of the Model, for example, m.column[0][:A] it returns => "1"
And if I call m.column[1][0] I get a = {:X => "x", :Y => "y"}

So far, so good.

Now the question is, how can I get an Array of all the records of the Model that has, for example, column[0][:A] = "1" ???

I'm trying something like this (but it isn't working):

Model.where("column[0][:A] = ?", "1")

The error says:

PG::SyntaxError: ERROR:  syntax error at or near ":"

EDIT:

column is a text datatype
and in its model it has serialize :column, Array

7
  • What does the table really look like, what is its schema? Is column an array column? JSONB? Something else? Commented Nov 9, 2016 at 18:50
  • I'll edit with that information Commented Nov 9, 2016 at 18:55
  • @muistooshort you can see now Commented Nov 9, 2016 at 18:58
  • Can you change the schema so that you don't use serialize? serialize is an ugly kludge that just slops a big pile of YAML into the database so you're pretty much out of luck as parsing YAML in SQL will be highly unpleasant. Commented Nov 9, 2016 at 19:02
  • I'd love to do that, but with array datatype I couldn't find a way to build the column in a way so I can then do m.column[0][:A]. It showed an error. I need to be able to do that. @muistooshort Commented Nov 9, 2016 at 19:12

1 Answer 1

6

Your problem is that you're using serialize to store your data and serialize just slops a big pile of unqueryable YAML into the database. Parsing YAML inside PostgreSQL is certainly possible but it would be slow and a little pointless.

If you really need to store an array of hashes then PostgreSQL's jsonb type would be your best bet. You won't get symbol keys in the database but you will get string keys so this Ruby data:

[{:A => "1", :B => "2"}, {:C => 6, :D => 11}]

would look like this JSON:

[ { "A": "1", "B": "2" }, { "C": 6, "D": 11 } ]

inside the database.

Once you have jsonb in the database, you can query it using all the JSON functions and operators that PostgreSQL providers and you could even index the JSON to support faster querying. In your case, your query would look like:

Model.where("column->0->>'A' = ?", '1')

the -> operator with an integer on the RHS acts like Ruby's Array#[] method:

-> int
Get JSON array element (indexed from zero, negative integers count from the end)

With a string on the RHS it acts like Ruby's Hash#[]. Note that -> returns JSON, not text or integer. The ->> operator is the same as -> but it returns text so you'd use that at the end to make the comparison cleaner.

Alternatively you could say:

Model.where("column->0->'A' = ?", '1'.to_json)

to push the string-vs-JSON logic into Ruby.


Everything is pretty easy once your database schema makes sense (i.e. it is using PostgreSQL's jsonb instead of Rails's serialize) but how do you get there from here? First remove the serialize :column, Array from your model, then you'd need a three step migration something like this:

  1. Add a jsonb column:

    def change
      add_column :models, :column_j, :jsonb
    end
    
  2. Read each column value from the database, unpack the YAML by hand, and write the JSON by hand:

    def up
      connection.execute('select id, column from models').each do |row|
        a = YAML.load(row['column'])
        connection.raw_connection.exec(
          'update models set column_j = $1 where id = $2',
          [ a.to_json, row['id'].to_i ]
        )
      end
    end
    

    Note that you can't use the model for this as the model class and the database no longer agree about the structure and format of the models table.

  3. Replace the old column with the new column_j:

    def change
      remove_column :models, :column, :text
      rename_column :models, :column_j, :column
    end
    

You'll want to backup your database before migrating of course. And hopefully you'll never even think about using serialize again, serialize is a horrible kludge that seems like an easy solution but quickly turns into an RPG aimed right at your foot.

Sign up to request clarification or add additional context in comments.

1 Comment

Hey, thanks for your answer. I actually changed the datatype to JSON yesterday, and now I'm changing the way I collect the input data from users to store it accordingly. I'm changing nested models to that one JSON column, seems a lot cleaner.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.