
I'm trying to compare two columns from two separate tables (Table A, Table B). A value in B.Col1 starts with the matching value from A.Col1 and may have extra digits after it.

Example:

A.Col1 = 95792313

B.Col1 = 9579231300

Which means that B.Col1 LIKE A.Col1 + '%'

A has around 3 million records and B has 15 million. How can I use a regular expression in SQL to achieve that?

  • do you only need it to run once or multiple times? Commented Sep 8, 2016 at 19:30
  • Can you join the columns into a temp column? Then use a regex like ^(\d+)[ ]\1 to select. Commented Sep 8, 2016 at 19:38

2 Answers


How about something like this....

declare @a bigint = 95792313
declare @b bigint = 9579231300

select 1 where @a like left(@b,len(@a))

I'm just truncating B.Col1 (@b in this case) to the length of A.Col1 (@a in this case).

So for you, something like

WHERE A.COL1 LIKE LEFT(B.Col1,LEN(A.Col1))
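Put together as a full query, that could look roughly like the sketch below. The table variables @A and @B and their sample values are made up for illustration, and the bigint column type is an assumption. The explicit CAST only makes visible the string conversion that LEFT and LEN would otherwise do implicitly.

```sql
-- Hypothetical stand-ins for Table A and Table B (bigint is an assumption).
DECLARE @A TABLE (Col1 bigint);
DECLARE @B TABLE (Col1 bigint);

INSERT INTO @A VALUES (95792313), (123);
INSERT INTO @B VALUES (9579231300), (4567);

-- Keep only the pairs where B.Col1 starts with the digits of A.Col1.
SELECT A.Col1 AS A_Col1,
       B.Col1 AS B_Col1
FROM @A AS A
INNER JOIN @B AS B
    ON LEFT(CAST(B.Col1 AS varchar(20)), LEN(CAST(A.Col1 AS varchar(20))))
       = CAST(A.Col1 AS varchar(20));
```

With this sample data only the pair 95792313 / 9579231300 comes back.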


6 Comments

Ah, I just saw that you posted the same idea earlier... +1 from my side!
Thank you so much, the solution works but it is still too slow for the amount of data I have.
@WT86 with 3 million rows nothing is going to be fast since you have to conduct row level comparison on each table.
@WT86 is columnA a fixed length all the way down?
@scsimon Neither column has a fixed length.

As your approach to compare these values is not a numerical one, but rather character based, the easiest would be to compare just A.Col1 with a snippet of the same length cut from the beginning of B.Col1.

Try this:

DECLARE @tblA TABLE(Col1 BIGINT);
DECLARE @tblB TABLE(Col1 BIGINT);

INSERT INTO @tblA VALUES(123),(1234),(12345000);
INSERT INTO @tblB VALUES(12300),(12340000),(1345);

SELECT A.Col1
      ,B.Col1
      ,LEN(A.Col1)
      ,CASE WHEN A.Col1=LEFT(B.Col1,LEN(A.Col1)) THEN 'Start with the same digits' ELSE '' END
FROM @tblA AS A
CROSS JOIN @tblB AS B

The result

+----------+----------+--------------------+----------------------------+
| Col1     | Col1     | (No column name)   | (No column name)           |
+----------+----------+--------------------+----------------------------+
| 123      | 12300    | 3                  | Start with the same digits |
+----------+----------+--------------------+----------------------------+
| 1234     | 12300    | 4                  |                            |
+----------+----------+--------------------+----------------------------+
| 12345000 | 12300    | 8                  |                            |
+----------+----------+--------------------+----------------------------+
| 123      | 12340000 | 3                  | Start with the same digits |
+----------+----------+--------------------+----------------------------+
| 1234     | 12340000 | 4                  | Start with the same digits |
+----------+----------+--------------------+----------------------------+
| 12345000 | 12340000 | 8                  |                            |
+----------+----------+--------------------+----------------------------+
| 123      | 1345     | 3                  |                            |
+----------+----------+--------------------+----------------------------+
| 1234     | 1345     | 4                  |                            |
+----------+----------+--------------------+----------------------------+
| 12345000 | 1345     | 8                  |                            |
+----------+----------+--------------------+----------------------------+

UPDATE

A CROSS JOIN with millions of rows in both tables is not a good idea. It was only meant to illustrate the approach. You might use an INNER JOIN instead and put the comparison into the join condition:

SELECT A.Col1
      ,B.Col1
      ,LEN(A.Col1)
      ,CASE WHEN A.Col1=LEFT(B.Col1,LEN(A.Col1)) THEN 'Start with the same digits' ELSE '' END
FROM @tblA AS A
INNER JOIN @tblB AS B ON A.Col1=LEFT(B.Col1,LEN(A.Col1))
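Since the join condition above wraps B.Col1 in LEFT(), an index on B.Col1 cannot be used for it. One thing that might help, offered as a sketch under assumptions rather than a tested fix for the 3 million / 15 million row case: store the string form of B.Col1 in a persisted computed column, index it, and write the prefix test as LIKE ... + '%', which gives the optimizer the chance to seek into that index for each row of A. The temp tables #A and #B, the column Col1Str and the index name are all hypothetical, and whether the plan really uses a seek has to be checked on the real data.

```sql
-- Hypothetical stand-ins for Table A and Table B (bigint is an assumption).
CREATE TABLE #A (Col1 bigint NOT NULL);
CREATE TABLE #B (
    Col1    bigint NOT NULL,
    -- String form of Col1, stored once so it can be indexed.
    Col1Str AS CAST(Col1 AS varchar(20)) PERSISTED
);

INSERT INTO #A (Col1) VALUES (123), (1234), (12345000);
INSERT INTO #B (Col1) VALUES (12300), (12340000), (1345);

CREATE INDEX IX_B_Col1Str ON #B (Col1Str);

-- Prefix match written as LIKE 'digits%' against the indexed string column.
SELECT A.Col1 AS A_Col1,
       B.Col1 AS B_Col1
FROM #A AS A
INNER JOIN #B AS B
    ON B.Col1Str LIKE CAST(A.Col1 AS varchar(20)) + '%';

DROP TABLE #A;
DROP TABLE #B;
```

On the sample rows this returns the same three matching pairs as the result table above (123 / 12300, 123 / 12340000 and 1234 / 12340000).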

4 Comments

I like what you are proposing, but how can I print out only the matched values, without the non-matched ones? The result in my case would be very messy.
@WT86, just add WHERE A.Col1=LEFT(B.Col1,LEN(A.Col1)) at the end of the query to put a filter on the set.
@WT86, and be aware that a CROSS JOIN with multi-million rows is not a good idea. See my update in a minute!
Thank you, that is working, but it is taking forever: 6 hours have passed.
