0

Helllo there, I've been playing around with a materialised view, and my goal is to pre-process as much information as I can, aggregating the info into this view that would speed up the data analysis.

As a such, I'd like to use a CASE expression with an input condition and two assignments if the condition is satisfied.

Something like

SELECT 
    CASE 
        WHEN t."page" ~* '...' THEN t.type = 'Product', t.something = 'something else here'
END AS "page_type"

Reading through the documentation, but also checking for several examples I couldn't see this doable. Am I missing something?

Assuming is not, the only thing I can think of is using a function that returns an array or something similar. Would that be an a possible approach? If so, any example I can be pointed out to learn a decent DDL pattern?

Thanks

UPDATE To support @a_horse_with_no_name answer

SELECT * 
FROM "V_2019-06-10_to_2019-06-20_GSC" t,
            (
                SELECT
                    Regexp_replace(t."page", '...', '\1') as node
                    FROM "V_2019-06-10_to_2019-06-20_GSC" i
                    WHERE i.page = t.page
            ) AS "IFs"

The above generates an error in referencing the t table. If I go for a CROSS JOIN LATERAL, the query works, but then the GROUP BY I have at the end of the outer query cannot dedup anymore the results

UPDATE 2

Here's a simplification of the query. It's of course much more complex than this, but hopefully it suffice.

SELECT -- t."page", 
    "URL".node, t.area, "URL".page_type,
    CASE
        WHEN "URL".page_type = 'ABC' THEN <here I need my value>
        ELSE ''
    END AS "Node_Type",
    FROM "V_2019-06-10_to_2019-06-20_GSC" t
    CROSS JOIN LATERAL (
        SELECT 
            Regexp_replace(t."page", '...', '\1\2\3\4\5\6') as node, 
            CASE 
                WHEN t."page" ~* '...' THEN 'ABC'
                WHEN t."page" ~* 'filter' THEN 'DEF' 
                ELSE 'Other'
            END AS "page_type" 
    ) AS "URL"
    GROUP BY "URL".node, t.area, "URL".page_type
;

A URL processed by this query could look like this /level1/level2/level3/?filter=123

Idealy I'd like to:

  • capture level1/level2/level3 with the Regexp_replace, and I have the regexpr working for this.
  • I need to identify once again whether the filter is included in the page to populate the page_type, and the case above should be ok.
  • capturing 123 as part of the outer CASE "Node_Type", but I cannot get this done, hence my first thought of a CASE processing two outputs.

Is there an effective way to get this out?

0

1 Answer 1

1

A single case expression can only return a single value (=expression). So yes, you are right you can't do that. If you want to have two output columns, you need two case expressions.

SELECT CASE 
         WHEN t.page ~* '...' THEN 'Product here'
         ... other outputs ....
        END AS page_type, 
        CASE 
         WHEN t.page ~* '...' THEN 'something else'
         ... other outputs ....
        END AS something _else
FROM ...

If you don't want to repeat the condition, you could wrap that into a derived table:

SELECT node, area, page_type, 
       CASE
          WHEN page_type = 'ABC' THEN 'Product'
          WHEN page_type = 'DEF' THEN 'Something else'
          ELSE ''
       END AS "Node_Type"
from (
    SELECT area, 
           Regexp_replace(t."page", '...', '\1\2\3\4\5\6') as node, 
           CASE 
             WHEN t.page ~* '...' THEN 'ABC'
             WHEN t.page ~* 'filter' THEN 'DEF' 
             ELSE 'Other'
           END AS page_type, 
           string_to_array(trim(substr(page, 1, strpos(page, '?')-1),'/'), '/') as levels,
           regexp_replace(page, '(^.*\?filter=)([0-9]+)', '\2', 'g') as filter_value
           length(t.page) as string_length
    FROM "V_2019-06-10_to_2019-06-20_GSC" 
) t;

The expression string_to_array(trim(substr(page, 1, strpos(page, '?')-1),'/'), '/') will return the individual elements (levels) of the URL as array elements. '/level1/level2/level3/?filter=123' will yield and array {level1,level2,level3) and '/level1/level2/level3/level4/?filter=123' will return 4 elements. If you need to access the number of levels in the outer select, you can use e.g. cardinality(levels)

The expression regexp_replace(page, '(^.*\?filter=)([0-9]+)', '\2', 'g') will return the value of the parameter filter as a single expression (It's probably not 100% fool-proof, but URL parsing with regex is not a simple task).

6
  • please see above. I've added a simplified query to help understanding the context. Is this enough? Commented Jul 8, 2019 at 14:24
  • I have done that because I have in reality two cross lateral, one of them to calculate the string length, and which is also used by the first case. I've stripped that out to concentrate on this problem specifically. Commented Jul 8, 2019 at 14:47
  • @AndreaMoro: I don't see how you need a cross join at all. Just do all that in the derived table. Commented Jul 8, 2019 at 14:49
  • a_horse_with_no_name, I need that because the length is in reality calculated on the "node". This is because I do some string manipulation that is affecting the length. In fact, I am after counting the number of "/" which in the t.page is much longer and it will not portrait the right figure. With the regexpr replacement, I'm factoring some changes, that's why I opted that way. Commented Jul 8, 2019 at 14:57
  • @AndreaMoro: if that is used to get the number of elements in the URL, see my edit. This can be done simpler by parsing the URL into an array. Commented Jul 8, 2019 at 15:01

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.