
I am trying to extract all valid hexadecimal values that represent colors in CSS code.

Specifications of HEX Color Code

  1. It must start with a '#' symbol.
  2. It can have 3 or 6 digits.
  3. Each digit is in the range 0-9, A-F, or a-f.

Here is the sample input

#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}

Sample output

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

Explanation

#BED and #Cab satisfy the Hex Color Code criteria, but they are used as selectors and not as color codes in the given CSS. So the actual color codes are

#FfFdF8
#aef
#f9f9f9
#fff
#ABC
#fff

What I tried in python

import re
pattern = r'^#([A-Fa-f0-9]{3}){1,2}$'
n = int(input())
hexNum = []
for _ in range(n):
   s = input()
   if ':' in s and '#' in s:
       result = re.findall(pattern,s)
       if result:
           hexNum.extend(result)
for num in hexNum:
    print(num)

When I run the above code on the sample input, it prints nothing. What am I doing wrong? Is it the matching pattern, or the logic I'm applying?

Could somebody please explain?

  • Your pattern is anchored with ^ and $. So it will only match if the entire string is a single hex number, it won't match in the middle of the string. Commented May 24, 2021 at 16:20
  • Thanks @Barmar, I have changed the pattern to r'(#([A-Fa-f0-9]{3}){1,2})'. Now it matches #FfFdF8 but also dF8. What do I have to do so that it matches only #FfFdF8? Commented May 24, 2021 at 16:30
  • I don't see how it can match df8 when it doesn't begin with #. Commented May 24, 2021 at 16:33
  • Get rid of the capture groups, you don't need that with findall(). It's returning the groups instead of the entire match. Commented May 24, 2021 at 16:34
  • So, what should I use instead of findall() to get the desired output? Commented May 24, 2021 at 16:40

2 Answers

1

Get rid of the anchors ^ and $, since they make it only match the entire input line.

Get rid of the capture groups, so that re.findall() will just return whole matches, not the group matches. Use (?:...) to create a non-capturing group so you can use the {1,2} quantifier.

pattern = r'#(?:[A-Fa-f0-9]{3}){1,2}'
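To see the difference, here is a small sketch that runs the three patterns from this thread against one line of the sample input. The variable names are only illustrative.

```python
import re

# One line taken from the sample input in the question.
line = "    color: #FfFdF8; background-color:#aef;"

anchored = r'^#([A-Fa-f0-9]{3}){1,2}$'       # original pattern from the question
capturing = r'(#([A-Fa-f0-9]{3}){1,2})'      # pattern from the asker's comment
non_capturing = r'#(?:[A-Fa-f0-9]{3}){1,2}'  # pattern suggested in this answer

# Anchored: the line is not just a single hex code, so nothing matches.
anchored_result = re.findall(anchored, line)

# Capturing: findall() returns one tuple of groups per match; the inner
# group only keeps its last repetition, which is where 'dF8' comes from.
capturing_result = re.findall(capturing, line)

# Non-capturing: with no groups, findall() returns the whole matches.
non_capturing_result = re.findall(non_capturing, line)

print(anchored_result)
print(capturing_result)
print(non_capturing_result)
```

The second result explains the dF8 seen in the comments: it is the last repetition of the inner group, reported alongside the full match rather than as a separate match.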

1

You have a two or three part problem:

  1. Remove CSS comments, which often contain code-looking stuff (optional, but recommended)
    • Regex matching comments is /\*.*?\*/
  2. Only look inside curly braces (e.g. not at selectors)
    • Regex matching curly braces is \{.*?\}
  3. Find color codes
    • Regex for color codes is #(?:[A-Fa-f0-9]{3}){1,2}

Bringing it all together:

import re
def color_codes(css_text):
    codes = []
    # remove comments (pass flags by keyword: the 4th positional argument of re.sub is count)
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)
    # consider only {} blocks
    for block in re.finditer(r'\{.*?\}', css_text, re.S):
        # find color codes
        codes.extend(re.findall(r'#(?:[A-Fa-f0-9]{3}){1,2}', block.group(0)))
    return codes
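As a quick check, the function can be run on the sample input from the question. This sketch repeats the function so that it runs on its own, passing flags by keyword; the names sample_css and result are just for illustration.

```python
import re

def color_codes(css_text):
    codes = []
    # remove comments
    css_text = re.sub(r'/\*.*?\*/', '', css_text, flags=re.S)
    # consider only {} blocks
    for block in re.finditer(r'\{.*?\}', css_text, flags=re.S):
        # find color codes
        codes.extend(re.findall(r'#(?:[A-Fa-f0-9]{3}){1,2}', block.group(0)))
    return codes

# Sample input from the question, as a single string.
sample_css = """#BED
{
    color: #FfFdF8; background-color:#aef;
    font-size: 123px;
    background: -webkit-linear-gradient(top, #f9f9f9, #fff);
}
#Cab
{
    background-color: #ABC;
    border: 2px dashed #fff;
}
"""

result = color_codes(sample_css)
print("\n".join(result))
```

The selectors #BED and #Cab sit outside the curly braces, so they are skipped, and the printed lines match the sample output in the question.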

Note: This is probably not a fool-proof solution. For that, you'd want to switch from simple regex to a full parser. But it's close enough if you just need something quick and don't mind some edge cases.
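One such edge case: under the question's 3-or-6-digit rule, a value like #abcd is invalid, yet the pattern above still matches its first three digits. A possible tightening, which is my assumption and not part of the original pattern, is a negative lookahead that rejects a match when another letter or digit follows.

```python
import re

# Pattern used above: may match a prefix of a longer, invalid value.
loose = r'#(?:[A-Fa-f0-9]{3}){1,2}'

# Hypothetical stricter variant: the match must not be followed
# by another letter or digit.
strict = r'#(?:[A-Fa-f0-9]{3}){1,2}(?![A-Za-z0-9])'

# Illustrative declaration block with two invalid and two valid values.
text = "{ color: #abcd; border: 1px solid #12345g; background: #fff; fill: #A1B2C3 }"

loose_result = re.findall(loose, text)
strict_result = re.findall(strict, text)

print(loose_result)
print(strict_result)
```

Here the loose pattern picks up #abc and #123 from the invalid values, while the stricter one keeps only #fff and #A1B2C3. Whether that is the behaviour you want depends on how strictly you read the specification.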
