I've been playing around in psql, splitting a name into an array. For example:
select string_to_array('joseph jones', ' ');
string_to_array
-----------------
{joseph,jones}
This works exactly as I expected.
However, my dataset contains a lot of surnames that have a preceding 'o'.
select string_to_array('joseph o carroll', ' ');
string_to_array
-----------------
{joseph,o,carroll}
Is there any way I can add some extra logic so that a standalone 'o' gets bundled into the word that follows it?
So joseph o carroll would return {joseph,o carroll}
regexp_split_to_array could do that.
select regexp_split_to_array('joseph o jones','(\s+)');
but still trying to figure out how to exclude the o from the split.

select regexp_split_to_array('joseph o jones','(?<!o)(\s+)');
which nearly solves my problem but for some reason adds quotation marks around o jones.

select unnest(regexp_split_to_array(..))
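Two notes on the attempts above, with a possible alternative. The quotation marks are only how PostgreSQL prints an array element that contains a space; the element itself is o jones, which is why unnest shows it without quotes. Separately, (?<!o) refuses to split after any word that ends in o, so 'mario jones' would stay in one piece. A minimal sketch that sidesteps this by collecting tokens instead of splitting on separators; the pattern and the aliases (t, m, n, parts, part) are my own choices, not something from the question:

```sql
-- Each match is an optional standalone "o" plus whitespace (\m anchors at the
-- start of a word), followed by a run of non-space characters.
-- WITH ORDINALITY numbers the matches so array_agg keeps them in order.
select array_agg(m[1] order by n) as parts
from regexp_matches('joseph o carroll', '(?:\mo\s+)?\S+', 'g')
     with ordinality as t(m, n);
-- should give {joseph,"o carroll"}; the quotes are display only

-- Unnesting an array shows the elements without the quoting.
select unnest(array['joseph', 'o carroll']) as part;
```

With this pattern 'mario jones' should still come back as two elements, because the optional group only applies when o is a word on its own.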