Your task
Given a simple regular expression, you have to count how many strings of length n have a match of length n with the given simple regex. This will just be a subset of regexs. Like, no lookaheads or named groups or recursion or whatever weird things regexs have.
Simple regular expression
For the purposes of this challenge, a regex is said to be simple if, for one, it only contains characters in the ASCII range 32-126. Furthermore, it should only use the following functionalities:
- match literal characters, much like the regex
abcwould only match the string "abc"; - match options, like
abc|defwould match "abc" and "def"; - match exactly 0 or 1 occurrence of something, e.g.
https?matches "http" and "https"; - match 1 or more occurrences of something, e.g.
ah+would match "ah", "ahh", "ahhh", "ahhhh", etc; - match any amount of occurrences of something, e.g.
1*matches "", "1", "11", "111", "1111", etc; - match between
nandmoccurrences of something, e.g.lo{1,4}lmatches only "lol", "lool", "loool" and "looool". Ifnis ommited, it matches up tomoccurrences. Ifmis ommited, it matches at leastnoccurrences. Assume at least one ofnormis present; - use
()to group, e.g.ab(c|d)efwould match "abcef" and "abdef" (c.f. 2nd item in this list) or(10)+would match "10", "1010", "101010", "10101010", etc; - use
.to match any character (in the ASCII range[32, 126]), soab.would match "abc", "ab9", "ab)", etc; - use
\to escape the special meaning of a character, e.g.ab?would match "a" and "ab", whileab\?only matches "ab?"; - use
[]as a group of possible characters. Inside the brackets, all characters lose their special behaviours, except for-and\. This means that, for one,ab[cde]is shorthand forab(c|d|e)and secondly,ab[?+*]matches "ab?", "ab+" and "ab*"; also related to[]: - use
-to specify a character range within brackets. The ranges you have to support area-z,A-Zand0-9, as well as their subsets, likeh-zor3-8. E.g., the regexab[c-g]matches "abc", "abd", "abe", "abf" and "abg"; Note that-has no special meaning outside of[]soa-zwould only match "a-z".
Input
The input for your program/function/routine/etc should be a string representing the regex and an integer n. For the regex, you can further assume:
- all characters that show up are in the ASCII range
[32, 126] - if
{n,m}is used, then \$n \leq m \$ - if
-is used inside[]then the specified range is well-formed
Output
The number of strings of length n that match the given regex. You only have to account for characters in the ASCII range [32, 126].
Test cases
".*", 0 -> 1
".*", 1 -> 95
".*", 2 -> 9025
".*", 3 -> 857375
".*", 4 -> 81450625
"abc", 2 -> 0
"abc", 4 -> 0
"ab|ac|ad", 2 -> 3
"a(b|c)", 2 -> 2
"hell(o|oo)", 5 -> 1
"https?", 5 -> 1
"ho{1,4}ly", 6 -> 1
"ho{3,}ly", 137 -> 1
"[abcde]{,2}", 2 -> 25
"(10)+", 7 -> 0
"(10)+", 8 -> 1
"ab\?", 3 -> 1
"[t7]h[i1][s5] is c[0o]d[Ee3] g[0oO][l1L]f", 17 -> 432
"\+351 9[1236] [0-9]{3,3} [0-9]{2,2} [0-9][0-9]", 17 -> 40000000
"-", 1 -> 1
"\\", 1 -> 1
"[+?*]", 1 -> 3
"Abc([d-z]*|(.H)+)", 11 -> 5132188812826241
"ab|ab", 2 -> 1
".(.(.(.(.|a))))|hello", 5 -> 7737809375
This is code code-golf so shortest solution in bytes, wins. If you like this challenge, consider upvoting it... And happy golfing!
lambda s,n:4\$\endgroup\$4. :P \$\endgroup\${n:m}withoutn(i.e."[abcde]{:2}", 2 -> 25) \$\endgroup\$