0

I'm trying to write a lexical database for storing words comprised of roots and patterns, and I was wondering how I could create a column that will combine the root and pattern for me, while ignoring rows which don't have both columns of the SELECT query populated.

Basically, I have this output from a PostgreSQL DB:

SELECT root, root_i FROM tbl_roots NATURAL JOIN tbl_patterns NATURAL JOIN tbl_patterns_triliteral;

  root   | root_i
---------+--------
 {s,ş,m} | 1u2u3a
 {p,l,t} | 1u2u3a
 {t,m,s} | 1u2u3a
 {n,t,l} | 1u2u3a
 {s,ş,m} | 1a2oi3
 {p,l,t} | 1a2oi3
 {t,m,s} | 1a2oi3
 {n,t,l} | 1a2oi3
 {s,ş,m} | 1o2i3
 {p,l,t} | 1o2i3
 {t,m,s} | 1o2i3
 {n,t,l} | 1o2i3
 {s,ş,m} | a12e3
 {p,l,t} | a12e3
 {t,m,s} | a12e3
 {n,t,l} | a12e3
 {s,ş,m} | 1u2á3
 {p,l,t} | 1u2á3
 {t,m,s} | 1u2á3
 {n,t,l} | 1u2á3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} | 1e2é3
 {p,l,t} | 1e2é3
 {t,m,s} | 1e2é3
 {n,t,l} | 1e2é3
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |
 {s,ş,m} |
 {p,l,t} |
 {t,m,s} |
 {n,t,l} |

And I want to convert it on-the-fly into something resembling this:

  root   | root_i | word_i
---------+--------+--------
 {s,ş,m} | 1u2u3a | suşuma
 {p,l,t} | 1u2u3a | puluta
 {t,m,s} | 1u2u3a | tumusa
 {n,t,l} | 1u2u3a | nutula
 {s,ş,m} | 1a2oi3 | saşoim
 {p,l,t} | 1a2oi3 | paloit
 {t,m,s} | 1a2oi3 | tamois
 {n,t,l} | 1a2oi3 | natoil
 {s,ş,m} | 1o2i3  | soşim
 {p,l,t} | 1o2i3  | polit
 {t,m,s} | 1o2i3  | tomis
 {n,t,l} | 1o2i3  | notil
 {s,ş,m} | a12e3  | asşem
 {p,l,t} | a12e3  | aplet
 {t,m,s} | a12e3  | atmes
 {n,t,l} | a12e3  | antel
 {s,ş,m} | 1u2á3  | suşám
 {p,l,t} | 1u2á3  | pulát
 {t,m,s} | 1u2á3  | tumás
 {n,t,l} | 1u2á3  | nutál
 {s,ş,m} | 1e2é3  | seşém
 {p,l,t} | 1e2é3  | pelét
 {t,m,s} | 1e2é3  | temés
 {n,t,l} | 1e2é3  | neşél

Where the word column is dynamically generated by replacing the digits in the root_i column with the character in that digit's index in the root column. I also need to drop queried rows that don't have an entry in both columns to reduce clutter in my output.

Can anyone help me devise a postgres function that will do the merging of character[] and text strings? The little bit of regex I need shouldn't be complicated, but I have no idea how to mix this with a query, or better yet, turn it into a function.

2 Answers 2

3
select
  root,
  root_i,
  translate(root_i, "123", array_to_string(root,'')) as word_i
NATURAL JOIN tbl_patterns
NATURAL JOIN tbl_patterns_triliteral
where root is not null and root_i is not null;
Sign up to request clarification or add additional context in comments.

1 Comment

Cool. And here I was thinking it'd require regex. A bit of fine-tuning and it works great, without the overhead of a user-defined function call.
1

I must admit to not liking to do much string manipulation in sql/plpgsql functions. Perl has an operator for replacing regexp matches with generated replacements, which works fairly nicely:

create or replace function splice_to_word(root text, root_i text)
  returns text strict immutable language plperl as $$
  my $roots = shift;
  my $template = shift;
  $template =~ s{(\d+)}{substr($roots,$1-1,1)}ge;
  return $template;
$$;

There is some nastiness in that postgresql arrays don't seem to be translated into Perl lists, so I've assumed the roots are passed in as a string, e.g.:

select root, root_i, splice_to_word(array_to_string(root, ''), root_i) from data

5 Comments

I was originally storing the roots in a string of the form 1-2-3 and the array made more sense conceptually; I also thought that it might make selecting individual characters simpler.
I agree that the array is a better storage form than the string. It's a pity that the Perl integration doesn't seem to handle it (it receives the string representation of the array as a parameter).
Hey, it works! I had to change the \d+ to \d because sequences need to be maintained, but it's a start.
ah, good point. I assume your data means you never have more than 10 elements in the root array.
Yeah, the root length is between two to five characters, but I've only written tables for two and three so far. Once I've got this whole thing up and working, it'll be relatively simple to insert an additional table if I need it.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.