
I have a string such as A18.30, and I want to check whether it falls between a minimum and a maximum value, here A17.0 and A19.0.

  1. A18.30 => between A17.0 and A19.0 [correct]
  2. P1A.5 => between P1.0 and A2.0 [incorrect]

The expectation is that the first letter is compared first, then the remaining numeric part, to decide whether the value falls within the range.

I have tried SUBSTRING, but I would like to know whether there is a better way to do the comparison with a regex. I am using Snowflake.

  • This is a perfect example of poor table design. If you need to do numeric comparisons, use numeric types. What you're attempting to do now is string comparison, and it will not work as you expect without jumping through large hoops several times. Commented Apr 15 at 4:58
  • There are no numeric values in the question. P1A.5 is just a string, and yes, it's larger than the string P1.0, because "A" comes after ".". If you have a composite value, split it into separate text and numeric fields. Commented Apr 15 at 7:16

2 Answers


The values being compared look like "version" strings, which call for a "natural sort". That can be done in SQL:

SELECT value, lower_limit, upper_limit,
  TRANSFORM(REGEXP_EXTRACT_ALL(value,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e)) 
  BETWEEN TRANSFORM(REGEXP_EXTRACT_ALL(lower_limit,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e))
      AND TRANSFORM(REGEXP_EXTRACT_ALL(upper_limit,'(\\d+|[^\\d]+)'),e->IFF(e RLIKE '\\d+',e::INT,e)) 
  AS is_value_between_lower_upper_limit
FROM VALUES ('A18.30','A17.0','A19.0'),
            ('P1A.5', 'P1.0', 'A2.0') AS s(value, lower_limit, upper_limit);


More at: How to sort "version" strings with SQL in Snowflake?
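
To make the idea behind the query easier to follow, here is a rough Python sketch of the same tokenising step: split the string into runs of digits and non-digits, turn the digit runs into integers, and compare the resulting lists element by element. The names natural_key and is_between are made up for this illustration, and tagging each token so that numbers sort before text is an assumption of the sketch; it is not a claim about how Snowflake orders mixed-type arrays.

```python
import re

def natural_key(s):
    # Split into alternating runs of digits and non-digits, mirroring
    # the (\d+|[^\d]+) pattern used in the SQL above.
    pairs = re.findall(r"(\d+)|(\D+)", s)
    # Tag each token so an int is never compared directly with a str
    # (Python would raise TypeError). Assumption: numbers sort before text.
    return [(0, int(num), "") if num else (1, 0, text) for num, text in pairs]

def is_between(value, lower, upper):
    # Inclusive range check on the natural-sort keys.
    return natural_key(lower) <= natural_key(value) <= natural_key(upper)

print(is_between("A18.30", "A17.0", "A19.0"))  # True
print(is_between("P1A.5", "P1.0", "A2.0"))     # False
```

The second call returns False simply because the lower limit P1.0 already sorts after the upper limit A2.0, so nothing can fall between them.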


Alternatively, use a Python UDF:

create or replace function natural_sort(str TEXT)
returns array
language python
runtime_version = '3.11'
packages = ('natsort')
handler = 'sort'
as
$$
import natsort
def sort(str):
  return list(natsort.natsort_key(str))
$$;

SELECT value, lower_limit, upper_limit,
 natural_sort(value) BETWEEN natural_sort(lower_limit) 
                     AND natural_sort(upper_limit) AS is_value_between_lower_upper_limit
FROM VALUES ('A18.30','A17.0','A19.0'),
            ('P1A.5', 'P1.0', 'A2.0') AS s(value, lower_limit, upper_limit);




As Ken White mentioned in his comment above, your table design is not optimal. Ideally, you should store each numeric component in a bona fide numeric column. The comparison would then be fairly trivial. However, we could work around it as follows:

SELECT *
FROM yourTable
WHERE CAST(REGEXP_REPLACE(str, '^[A-Z]+', '') AS NUMBER(12,2)) BETWEEN 17.0 AND 19.0;

Here we are stripping off the leading letter(s) from the string column, then casting to numeric, and finally doing a BETWEEN range comparison.
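
The query above drops the letter prefix before comparing, while the question asks for the letter to be compared as well. As a minimal sketch of one way to handle both parts, here is the logic in Python. The helper names (parse_code, in_range), the pattern, and the rule that the prefix must be identical to both limits are assumptions for illustration, not something stated in the question.

```python
import re
from decimal import Decimal

# Assumed shape: one or more uppercase letters followed by a plain decimal number.
CODE_RE = re.compile(r"([A-Z]+)(\d+(?:\.\d+)?)")

def parse_code(code):
    # Return (prefix, number), or None when the string does not fit the assumed shape.
    m = CODE_RE.fullmatch(code)
    if m is None:
        return None
    return m.group(1), Decimal(m.group(2))

def in_range(code, lower, upper):
    parsed = [parse_code(c) for c in (code, lower, upper)]
    # A malformed value such as P1A.5 is treated as "not in range" instead of raising.
    if any(p is None for p in parsed):
        return False
    (prefix, num), (lo_prefix, lo_num), (hi_prefix, hi_num) = parsed
    # Assumption: the prefix must match both limits exactly; the numeric part
    # is then compared inclusively, like BETWEEN.
    return prefix == lo_prefix == hi_prefix and lo_num <= num <= hi_num

print(in_range("A18.30", "A17.0", "A19.0"))  # True
print(in_range("P1A.5", "P1.0", "A2.0"))     # False
```

If a value like P1A.5 can appear in the column, the same "reject instead of fail" behaviour would also be needed on the SQL side, since a plain cast of a non-numeric remainder is likely to error out.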
