I am wondering whether there is any way to create indexes based on a type conversion of xpath results. I specifically need an integer index, but I can also imagine wanting date, floating-point, and other types. I would be interested even if it is experimental, coming in a future version of PostgreSQL, or only available in another database. I already looked at eXist-db, which looked far from production ready.

Some test XML:

<?xml version="1.0"?>
<book>
    <isbn>1</isbn>
    <title>Goldfinger</title>
    <author>Ian Fleming</author>
</book>

I was able to obtain very satisfactory results with a table and index like this.

                                             Table "public.test"
 Column |  Type   |                     Modifiers                     | Storage  | Stats target | Description
--------+---------+---------------------------------------------------+----------+--------------+-------------
 id     | integer | not null default nextval('test_id_seq'::regclass) | plain    |              |
 num    | integer |                                                   | plain    |              |
 data   | xml     |                                                   | extended |              |
Indexes:
    "test_pkey" PRIMARY KEY, btree (id)
    "test_title_index" btree (((xpath('/book/title/text()'::text, data))[1]::text))

for a query such as:

SELECT *
FROM test
WHERE (xpath('/book/title/text()', data))[1]::text = 'Goldfinger';

But there is data in the schemas where a non-text index would make a great deal more sense. For example (I know this is not valid, but it illustrates the point):

SELECT *
FROM test
WHERE (xpath('/book/isbn/text()', data))[1]::int BETWEEN 5 AND 10;

A little background:

I am experimenting with storing XML documents in PostgreSQL because I have an application where the primary data types are already XML and often need to be retrieved as such. The schemas can be very complex, so splitting them into database columns is extremely time-consuming, especially as the schemas evolve. I only mention this because I suspect a logical reaction to my question is going to be "break the data out into native columns".

1 Answer

You can create an index on pretty much any expression, as long as the functions it uses are immutable. There's no cast directly from xml to integer, but you can go through text. The database also complained quite a bit about the syntax while I was creating the index, until I went a little nuts with parentheses.

In the end, I got this to work:

CREATE TABLE books (id integer, data xml);
CREATE INDEX books_isbn ON books (((((xpath('/book/isbn/text()', data))[1])::text)::integer));
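
For the planner to consider the index, the WHERE clause has to use the same expression as the index definition. Here is a rough sketch of how the range query from the question might look against the books table above. The sample row is only the question's test document, and EXPLAIN is there just to check which plan gets chosen; on a tiny table the planner may well prefer a sequential scan anyway.

```sql
-- Sample row: the test document from the question.
INSERT INTO books (id, data)
VALUES (1, '<book><isbn>1</isbn><title>Goldfinger</title><author>Ian Fleming</author></book>');

-- Same expression as in books_isbn (xpath -> first element -> text -> integer),
-- so the planner can match the predicate to the index.
EXPLAIN
SELECT *
FROM books
WHERE (xpath('/book/isbn/text()', data))[1]::text::integer BETWEEN 5 AND 10;
```

One thing to keep in mind: the index expression is evaluated on every insert and update, so a document whose isbn is not a valid integer would make that statement fail on the cast. A document with no isbn element at all should just produce NULL.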
1 Comment

That's great. My queries look equally mental, but it does work. Thanks
