
I am trying to filter some reporting results (Google Analytics, JavaScript regex support) so that only rows containing the pattern "OA" are included, with the restriction that "OA" cannot be the last characters in the string. My regex below handles the "last characters in the string" requirement, but it doesn't restrict the match to rows that contain some instance of "OA". Should I add another OR alternative to capture that, or should I update my current regex to account for it?

Example text (expected results):

OA > OA //No Match
Paid Search > OA //No Match
Paid Search > (none) > Social //No Match
OA > Paid Search //Match
Social > OA > (none) > (none) //Match

Regex:

.{,2}$|.*[^OA]$
2
  • In GA, the RE2 library is used. Otherwise, you could use OA(?!$). @HuStmpHrrr suggested the pattern OA. (in a deleted answer), which should actually work for you. Commented Feb 22, 2018 at 19:09
  • @WiktorStribiżew I deleted it because that regex also matches the first example; it's necessary to explicitly reject OA at the end. Your regex seems simpler than mine. Update: actually no, it behaves the same as what I proposed and will also match the first case. Commented Feb 22, 2018 at 19:15

2 Answers

1

What about the following:

OA.(?!.*OA$)

It additionally requires one more character after OA, which guarantees that the match cannot be the final OA in the string; the negative lookahead then explicitly checks that the remainder of the string does not end with OA.

I do not program in JavaScript, so I don't know whether your engine supports that. Locally I tested with grep -P 'OA.(?!.*OA$)' and it works for your examples.
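As a quick sanity check, the pattern can be run against the five example rows from the question. This is only a sketch: it uses Python's re module as a stand-in for grep -P (an assumption made for testing; Google Analytics uses RE2, which does not accept lookahead), and the names ROWS, LOOKAHEAD and matched are illustrative.

```python
import re

# Example rows from the question, paired with the expected result.
ROWS = {
    "OA > OA": False,
    "Paid Search > OA": False,
    "Paid Search > (none) > Social": False,
    "OA > Paid Search": True,
    "Social > OA > (none) > (none)": True,
}

# Pattern from this answer; it needs an engine with lookahead support.
LOOKAHEAD = re.compile(r"OA.(?!.*OA$)")

# Map each row to whether the pattern is found anywhere in it.
matched = {row: bool(LOOKAHEAD.search(row)) for row in ROWS}

for row, hit in matched.items():
    print(f"{row!r}: {'Match' if hit else 'No Match'}")
```

One edge case to be aware of: a string such as OAOA still matches, because the character consumed by the dot is the O of the trailing OA, so the lookahead only sees the final A.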


If negative lookahead is not available, you can spell out what the negative lookahead would actually do:

(OA.*(O[^A]|[^O].)|OA.)$

The trick here is to come up with an automaton that rejects only strings ending in OA. If the second-to-last character is O, the last character must be anything other than A; otherwise, any last character is acceptable. Writing that out explicitly as a regex gives the first alternative of the expression above.

The second alternative fills a gap. The first alternative only matches strings of length >= 4 (OA plus at least two more characters), so OA. is added to cover the corner case of length 3. Together they are meant to accept the same set of strings as the negative lookahead implementation.
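The lookahead-free pattern can be checked the same way. The sketch below again uses Python's re module purely as a test harness (an assumption; the pattern itself uses only syntax that RE2 also accepts). The strings OAB and OAOA are hypothetical extra cases added here to probe the length boundary described above.

```python
import re

# Lookahead-free pattern from this answer.
NO_LOOKAHEAD = re.compile(r"(OA.*(O[^A]|[^O].)|OA.)$")

# Rows from the question plus two hypothetical edge cases.
CASES = {
    "OA > OA": False,
    "Paid Search > OA": False,
    "Paid Search > (none) > Social": False,
    "OA > Paid Search": True,
    "Social > OA > (none) > (none)": True,
    "OAB": True,    # length 3: only the second alternative, OA., applies
    "OAOA": False,  # contains OA but also ends with OA
}

results = {text: bool(NO_LOOKAHEAD.search(text)) for text in CASES}

for text, hit in results.items():
    print(f"{text!r}: {'Match' if hit else 'No Match'}")
```

Note the OAOA case: this pattern rejects it, whereas OA.(?!.*OA$) accepts it, so the two patterns agree on the question's examples but differ when the character after one OA is the start of the final OA.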


5 Comments

Unfortunately, negative lookahead is not supported by the analytics platform's engine.
The second implementation should be a proper workaround for the absence of lookahead support.
This is great. Thank you so much for your explanation!
If there is another scenario where I have Paid Search, would I do something like (P[^aid Search]|[^P].)|Paid Search.)?
@cphill Wow, then it gets complex. [] denotes a character set in a regular expression: it matches a single character, not a string. You can definitely do it, but the length of the regex needed to simulate negative lookahead grows exponentially with the length of the string you want to reject. Given that, I think you are better off selecting all the strings that contain your target string and then filtering out those that end with it, instead of hoping a regex will do all the work for you.
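The two-step approach suggested in the last comment can be sketched outside the regex engine. The helper name contains_but_not_at_end is made up for illustration, and the sketch assumes the rows can be exported and post-processed in Python, which the thread does not confirm for Google Analytics itself.

```python
def contains_but_not_at_end(row: str, target: str) -> bool:
    """Return True if target occurs in row and row does not end with it."""
    return target in row and not row.endswith(target)


rows = [
    "OA > OA",
    "Paid Search > OA",
    "Paid Search > (none) > Social",
    "OA > Paid Search",
    "Social > OA > (none) > (none)",
]

# The same helper handles a target of any length, with no character classes.
kept_oa = [r for r in rows if contains_but_not_at_end(r, "OA")]
kept_paid = [r for r in rows if contains_but_not_at_end(r, "Paid Search")]

print(kept_oa)
print(kept_paid)
```

Because the check is plain string logic, moving from OA to Paid Search only changes the argument, not the complexity of the filter.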
0

You could match OA and then make sure that the string does not end with OA:

^.*OA.*(?:[^O]A|O[^A]|[^O][^A])$

That would match

^          # Begin of the string
.*OA       # match any character zero or more times and match OA
.*         # Match any characters zero or more times
(?:        # Non capturing group
  [^O]A    # Match not O and A
  |        # or
  O[^A]    # Match O and not A
  |        # or 
  [^O][^A] # Match not O not A
)          # Close non capturing group
$          # End of the string
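The commented breakdown above maps almost directly onto a verbose-mode regex, which makes it easy to try out. This is a minimal sketch using Python's re.VERBOSE flag (an assumption made for testing only; Google Analytics takes the pattern on a single line), and OAB is a hypothetical extra row added to show a boundary case.

```python
import re

PATTERN = re.compile(
    r"""
    ^          # beginning of the string
    .*OA       # any characters, then a literal OA
    .*         # any characters
    (?:        # non-capturing group: the last two characters are not OA
      [^O]A    #   not O, then A
      |        #   or
      O[^A]    #   O, then not A
      |        #   or
      [^O][^A] #   not O, then not A
    )
    $          # end of the string
    """,
    re.VERBOSE,
)

CASES = {
    "OA > OA": False,
    "Paid Search > OA": False,
    "Paid Search > (none) > Social": False,
    "OA > Paid Search": True,
    "Social > OA > (none) > (none)": True,
    "OAB": False,  # only one character follows OA; the group needs two
}

results = {text: bool(PATTERN.search(text)) for text in CASES}

for text, hit in results.items():
    print(f"{text!r}: {'Match' if hit else 'No Match'}")
```

The pattern reproduces the expected results for the question's rows. The OAB row shows one limitation: because the final group consumes two characters after OA, a row with exactly one character after OA is not matched.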

1 Comment

Negative lookahead is unfortunately not supported as a regex expression value in Google Analytics
