Is there any way to compare two string columns to each other and get the matches?

I have two columns containing names: one with the full name, the other with (mostly) just the surname.

I tried it with soundex, but it only returns rows where the values in both columns are nearly identical.

SELECT * FROM TABLE
WHERE soundex(FullName) = soundex(Surname)
1   John Doe       Doe 
2   Peter Parker   Parker
3   Brian Griffin  Brian Griffin

With soundex, only the 3rd row matches.

  • SOUNDEX() is an outdated function that doesn't really have good use in this century. Commented May 25, 2020 at 14:06

3 Answers

A simple option is to use instr, which shows whether surname exists in fullname:

with test (id, fullname, surname) as
  (select 1, 'John Doe'     , 'Doe'           from dual union all
   select 2, 'Peter Parker' , 'Parker'        from dual union all
   select 3, 'Brian Griffin', 'Brian Griffin' from dual
  )
select *
from test
where instr(fullname, surname) > 0;

        ID FULLNAME      SURNAME
---------- ------------- -------------
         1 John Doe      Doe
         2 Peter Parker  Parker
         3 Brian Griffin Brian Griffin

Another option is to use one of the UTL_MATCH functions, e.g. Jaro-Winkler similarity, which shows how well the two strings match:

with test (id, fullname, surname) as
  (select 1, 'John Doe'     , 'Doe'           from dual union all
   select 2, 'Peter Parker' , 'Parker'        from dual union all
   select 3, 'Brian Griffin', 'Brian Griffin' from dual
  )
select id, fullname, surname,
  utl_match.jaro_winkler_similarity(fullname, surname) jws
from test
order by id;

        ID FULLNAME      SURNAME              JWS
---------- ------------- ------------- ----------
         1 John Doe      Doe                   48
         2 Peter Parker  Parker                62
         3 Brian Griffin Brian Griffin        100

Feel free to explore the other functions that the package offers.


Also, note that I didn't pay attention to possible letter case differences (e.g. "DOE" vs. "Doe"). If you need that as well, compare e.g. upper(surname) to upper(fullname).
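
To make the note about letter case concrete, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, since instr() and upper() behave the same way for these ASCII rows; the table contents are made up for illustration and are not the asker's data.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for Oracle.
# The rows are hypothetical; two surnames deliberately differ in case.
conn = sqlite3.connect(":memory:")
conn.execute("create table test (id integer, fullname text, surname text)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [
        (1, "John Doe", "DOE"),                 # surname stored in upper case
        (2, "Peter Parker", "parker"),          # surname stored in lower case
        (3, "Brian Griffin", "Brian Griffin"),  # identical in both columns
    ],
)

# Case-sensitive search: only row 3 matches, because instr() compares
# characters exactly.
case_sensitive = [row[0] for row in conn.execute(
    "select id from test where instr(fullname, surname) > 0 order by id")]

# Case-insensitive search: upper-case both sides before calling instr().
case_insensitive = [row[0] for row in conn.execute(
    "select id from test "
    "where instr(upper(fullname), upper(surname)) > 0 order by id")]

print(case_sensitive)    # prints [3]
print(case_insensitive)  # prints [1, 2, 3]
```

The same where clause, instr(upper(fullname), upper(surname)) > 0, should carry over to Oracle unchanged. Note that it still matches any substring, not only whole surnames.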

1 Comment

Thanks a lot, this seems to work perfectly for me :) (As you mentioned UPPER: I really did get more results by adding it to my select.)

Please use the instr function:

SELECT * FROM TABLE
WHERE instr(Surname, FullName) > 0;


SELECT * FROM TABLE
WHERE instr(upper(Surname), upper(FullName)) > 0;

SELECT * FROM TABLE
WHERE upper(FullName) > upper(Surname);

5 Comments

Thanks a lot, this seems to work for me too :) (I got more results by adding UPPER to handle case differences.)
If you feel it's correct, please mark it as accepted so that it helps others. I have also provided other ways of achieving it.
The last query doesn't make much sense. 'JOHN SMITH' > 'ABE LINCOLN', because 'J' > 'A', so it's a match??? As to the other queries: INSTR('MARC JOHNSSON', 'JOHN') = 6, so that's a match, too?
It's not a match; the < and > operators serve as a comparison. If you want to check whether a string is present in another column, you can use these operators. Just try it and you will understand.
Okay, I tried it. I suppose that you confused the positions in INSTR, so I also added your expressions with switched parameters. Here is a demo with six instead of your three expressions; none of them works reliably: dbfiddle.uk/…

As far as I know there is nothing out of the box when matching becomes complicated. For the cases shown, however, the following expression would suffice:

where fullname like '%' || surname

Update

The main problem may be false positives:

  • The last name 'Park' appears in 'Peter Parker'. The above query solves this by looking only at the end of the full name.
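
The difference between a substring search and the end-anchored pattern can be sketched as follows. This again assumes SQLite via Python's sqlite3 module as a stand-in for Oracle, with made-up rows. One caveat: SQLite's LIKE ignores ASCII case by default, unlike Oracle's, but that does not affect these particular rows.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table test (id integer, fullname text, surname text)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [
        (1, "Peter Parker", "Park"),    # 'Park' is only a fragment of the surname
        (2, "Peter Parker", "Parker"),
        (3, "John Doe", "Doe"),
    ],
)

# Substring search: matches row 1 as well, which is a false positive.
substring_ids = [row[0] for row in conn.execute(
    "select id from test where instr(fullname, surname) > 0 order by id")]

# End-anchored pattern: the full name has to end with the surname.
suffix_ids = [row[0] for row in conn.execute(
    "select id from test where fullname like '%' || surname order by id")]

print(substring_ids)  # prints [1, 2, 3]
print(suffix_ids)     # prints [2, 3]
```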

Another problem may be upper / lower case as mentioned in the other answers (not shown in your sample data).

  • You want the last name 'PARKER' to match 'Peter Parker'.

But when looking at the strings case insensitively, another problem arises:

  • The last name 'Strong' will suddenly match 'Louis Armstrong'.

A solution for this is to add a blank to make the difference:

where ' ' || upper(fullname) like '% ' || upper(surname)
  • ' LOUIS ARMSTRONG' like '% STRONG' -> false
  • ' LOUIS ARMSTRONG' like '% ARMSTRONG' -> true
  • ' LOUIS ARMSTRONG' like '% LOUIS ARMSTRONG' -> true
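
The three cases listed above can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module as a stand-in for Oracle (the concatenation operator || and the LIKE wildcards work the same way here), and the rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for Oracle; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table test (id integer, fullname text, surname text)")
conn.executemany(
    "insert into test values (?, ?, ?)",
    [
        (1, "Louis Armstrong", "Strong"),           # must not match
        (2, "Louis Armstrong", "ARMSTRONG"),        # matches, ignoring case
        (3, "Louis Armstrong", "Louis Armstrong"),  # full name in both columns
        (4, "Peter Parker", "PARKER"),              # matches, ignoring case
    ],
)

# Case-insensitive suffix match without the blank: 'Strong' slips through.
unpadded_ids = [row[0] for row in conn.execute(
    "select id from test "
    "where upper(fullname) like '%' || upper(surname) order by id")]

# With a leading blank on the full name and a blank after the wildcard,
# the surname has to start at a word boundary.
padded_ids = [row[0] for row in conn.execute(
    "select id from test "
    "where ' ' || upper(fullname) like '% ' || upper(surname) order by id")]

print(unpadded_ids)  # prints [1, 2, 3, 4]
print(padded_ids)    # prints [2, 3, 4]
```

Prepending the blank to the full name is what lets row 3 match: the pattern '% LOUIS ARMSTRONG' needs a blank in front of the first word, too.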

Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=0ac5c80061b4aeac1153a8c5976e6e54

2 Comments

Thanks, this is also a very smart way to get matches.
Thanks :-) As the accepted answer is wrong, but case sensitivity mentioned in the other two answers has its point, I've updated my answer to allow for case mismatches, too.
