Unable to find a pattern in a string using regular expression

Question

I am trying to extract all valid hexadecimal values that represent colors in CSS code.

Specifications of HEX Color Code

It must start with a '#' symbol.
It can have 3 or 6 digits.
Each digit is in the range 0-9, A-F, or a-f.

Here is the sample input

#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}

Sample output

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

Explanation

#BED and #Cab satisfy the Hex Color Code criteria, but they are used as selectors and not as color codes in the given CSS. So the actual color codes are

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

What I tried in Python

import re
pattern = r'^#([A-Fa-f0-9]{3}){1,2}$'
n = int(input())
hexNum = []
for _ in range(n):
    s = input()
    if ':' in s and '#' in s:
        result = re.findall(pattern, s)
        if result:
            hexNum.extend(result)
for num in hexNum:
    print(num)

When I run the above code on the sample input, it prints nothing. What am I doing wrong? Is it the matching pattern, or the logic I'm applying?

Could somebody please explain?

Your pattern is anchored with ^ and $, so it will only match if the entire string is a single hex number; it won't match in the middle of the string.
– Barmar, Commented May 24, 2021 at 16:20
Thanks @Barmar, I have changed the pattern to r'(#([A-Fa-f0-9]{3}){1,2})'. Now it matches #FfFdF8 but also dF8. What do I have to do so that it matches only #FfFdF8?
– ClassHacker, Commented May 24, 2021 at 16:30
I don't see how it can match dF8 when it doesn't begin with #.
– Barmar, Commented May 24, 2021 at 16:33
Get rid of the capture groups; you don't need them with findall(). It's returning the groups instead of the entire match.
– Barmar, Commented May 24, 2021 at 16:34
So, what should I use instead of findall() to get the desired output?
– ClassHacker, Commented May 24, 2021 at 16:40

Barmar · Accepted Answer · 2021-05-24 16:45:56Z


Get rid of the anchors ^ and $, since they make it only match the entire input line.

Get rid of the capture groups, so that re.findall() will just return whole matches, not the group matches. Use (?:...) to create a non-capturing group so you can use the {1,2} quantifier.

pattern = r'#(?:[A-Fa-f0-9]{3}){1,2}'
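
To see why the groups matter, here is a small sketch that runs the three patterns from this thread against one line of the sample input. The variable names are only for illustration. With capture groups, re.findall() returns a tuple of the groups for each match, and a repeated group keeps only its last repetition, which is where the stray dF8 comes from.

```python
import re

line = "    color: #FfFdF8; background-color:#aef;"

anchored = r'^#([A-Fa-f0-9]{3}){1,2}$'      # original pattern from the question
grouped = r'(#([A-Fa-f0-9]{3}){1,2})'       # pattern from the follow-up comment
non_capturing = r'#(?:[A-Fa-f0-9]{3}){1,2}'  # pattern from this answer

# Anchors force the whole line to be a single color code, so nothing matches.
print(re.findall(anchored, line))       # []

# Two capture groups: findall returns (outer group, last repetition of inner group).
print(re.findall(grouped, line))        # [('#FfFdF8', 'dF8'), ('#aef', 'aef')]

# No capture groups: findall returns the whole matches.
print(re.findall(non_capturing, line))  # ['#FfFdF8', '#aef']
```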


Pi Marillion · Accepted Answer · 2021-05-25 02:51:57Z

You have a two- or three-part problem:

1. Remove CSS comments, which often contain code-looking stuff (optional, but recommended).
   - Regex matching comments: /\*.*?\*/
2. Only look inside curly braces (i.e. not at selectors).
   - Regex matching curly-brace blocks: \{.*?\}
3. Find color codes.
   - Regex for color codes: #(?:[A-Fa-f0-9]{3}){1,2}

Bringing it all together:

import re
def color_codes(css_text):
    codes = []
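    # remove comments (flags must be passed by keyword here:
    # the 4th positional argument of re.sub is count, not flags)
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)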
    # consider only {} blocks
    for block in re.finditer(r'\{.*?\}', css_text, re.S):
        # find color codes
        codes.extend(re.findall(r'#(?:[A-Fa-f0-9]{3}){1,2}', block.group(0)))
    return codes
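
As a quick check, here is a self-contained sketch that applies the same three steps to the sample input from the question. It restates the function so that it runs on its own and passes the flags by keyword; the sample variable and the precompiled HEX_COLOR pattern are only illustrative names.

```python
import re

HEX_COLOR = re.compile(r'#(?:[A-Fa-f0-9]{3}){1,2}')

def color_codes(css_text):
    # Step 1: strip /* ... */ comments (re.S lets . span newlines)
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)
    codes = []
    # Step 2: look only inside { ... } blocks, so selectors are skipped
    for block in re.finditer(r'\{.*?\}', css_text, flags=re.S):
        # Step 3: collect the color codes in this block
        codes.extend(HEX_COLOR.findall(block.group(0)))
    return codes

sample = """#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}
"""

print(color_codes(sample))
# ['#FfFdF8', '#aef', '#f9f9f9', '#fff', '#ABC', '#fff']
```

The selectors #BED and #Cab are skipped because they sit outside the curly braces.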

Note: This is probably not a fool-proof solution. For that, you'd want to switch from simple regex to a full parser. But it's close enough if you just need something quick and don't mind some edge cases.
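
One such edge case, as a hedged sketch: the pattern also matches the first 3 or 6 digits of a longer run of hex digits, so #abcd would yield #abc. If you want to enforce the question's "3 or 6 digits" rule strictly, one option (my assumption, not part of the original task) is a negative lookahead that rejects a match followed by another hex digit. The name STRICT_HEX is made up for this example.

```python
import re

# Same pattern, plus a lookahead: the match must not be followed by a hex digit.
STRICT_HEX = re.compile(r'#(?:[A-Fa-f0-9]{3}){1,2}(?![A-Fa-f0-9])')

text = "a: #abcd; b: #fff; c: #12345678; d: #A1B2C3"

print(STRICT_HEX.findall(text))
# ['#fff', '#A1B2C3']
```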
