1

My function finds six-digit hexadecimal CSS colors in a string and replaces them with the short notation.
For example, #000000 can be written as #000.

import re

def to_short_hex (string):
    match = re.findall(r'#[\w\d]{6}\b', string)

    for i in match:
        if not re.findall(r'#' + i[1] + '{6}', i):
            match.pop(match.index(i))

    for i in match:
        string = string.replace(i, i[:-3])

    return string;

to_short_hex('text #FFFFFF text #000000 #08088')

Out:

text #FFF text #000 #08088

Is there any way to optimize my code, for example with a list comprehension?

3
  • There is a recipe at ActiveState using a slightly longer regex. code.activestate.com/recipes/… Commented Feb 11, 2012 at 14:25
  • @John P, many thanks for the link! Commented Feb 11, 2012 at 14:31
  • @JohnP I feel you should really post that as an answer. Commented Feb 11, 2012 at 14:39

3 Answers

3

How about this? You could speed it up by inlining is6hexdigit into to_short_hex, but I wanted it to be more readable.

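hexdigits = "0123456789abcdef"

def is6hexdigit(sub):
    # True when sub is the same hex digit repeated six times, e.g. "ffffff".
    l = sub.lower()
    return len(l) == 6 and l[0] in hexdigits and l.count(l[0]) == 6

def to_short_hex(may_have_hexes):
    # The first piece precedes any '#', so it can never be a color.
    parts = may_have_hexes.split('#')
    replaced = [(sub[3:] if is6hexdigit(sub[:6]) else sub)
                for sub in parts[1:]]
    return '#'.join([parts[0]] + replaced)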
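
For comparison, here is a rough sketch of what the inlined variant might look like. The names to_short_hex_inlined and HEXDIGITS are my own, not part of the answer, and I am assuming that the text before the first '#' should be left untouched.

```python
# Hypothetical names for illustration. A set is used so that an empty
# string is never treated as a hex digit.
HEXDIGITS = set("0123456789abcdef")

def to_short_hex_inlined(text):
    parts = text.split('#')
    # parts[0] is the text before the first '#', so it is never a color.
    shortened = [
        sub[3:]
        if sub[:1].lower() in HEXDIGITS and sub[:6].lower() == sub[0].lower() * 6
        else sub
        for sub in parts[1:]
    ]
    return '#'.join([parts[0]] + shortened)

print(to_short_hex_inlined('text #FFFFFF text #000000 #08088'))
```

Like the two-function version, this does not look at what follows the six digits, so it is arguably harder to read without gaining any accuracy.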

6 Comments

@GoingTham, it is a module name, and probably should be replaced for that reason.
@Ricardo Cárdenes, thanks. I may be mistaken, but it is less readable for me :)
Right, early in the morning here.
I used string just because OP did, and the string module is not used in the functions anyway, but yeah, I agree that it's not ideal.
@AlexanderGuiness: whenever you start using comprehensions, readability tends to suffer ;), but try substituting the whole is6hexdigit into the (... if ... else ...) and you'll get my point :D
2

This is what re.sub is for! It's not a great idea to use a regex to find something and then do a further sequence of search-and-replace operations to change it. For one thing, it's easy to accidentally replace things you didn't mean to, and for another it does a lot of redundant work.
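
A small sketch of the "accidental replacement" problem, under my own assumptions: the backreference pattern and both function names are made up for this illustration and are not taken from the question.

```python
import re

# Illustrative pattern: '#', one hex digit, the same digit five more times,
# then a word boundary.
PATTERN = r'#([0-9a-fA-F])\1{5}\b'

def shorten_with_replace(text):
    # Find first, then replace every occurrence of each matched string.
    matches = [m.group(0) for m in re.finditer(PATTERN, text)]
    for m in matches:
        text = text.replace(m, m[:4])
    return text

def shorten_with_sub(text):
    # Replace only the exact spans the regex matched.
    return re.sub(PATTERN, lambda m: m.group(0)[:4], text)

sample = 'a #000000 b #0000001'
print(shorten_with_replace(sample))  # a #000 b #0001
print(shorten_with_sub(sample))      # a #000 b #0000001
```

The regex never matches '#0000001', but str.replace still rewrites it because it contains the matched text '#000000'.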

Also, you might want to shorten '#aaccee' to '#ace'. This example does that too:

import re

def to_short_hex(s):
    def shorten_match(match):
        hex_string = match.group(0)
        # Shorten only when every pair of digits repeats, e.g. '#aaccee'.
        if hex_string[1::2] == hex_string[2::2]:
            return '#' + hex_string[1::2]
        return hex_string
    return re.sub(r"#[\da-fA-F]{6}\b", shorten_match, s)

Explanation

re.sub can take a function to apply to each match. It receives the match object and returns the string to substitute at that point.

Slice notation allows you to apply a stride. hex_string[1::2] takes every second character from the string, starting at index 1 and running to the end of the string. hex_string[2::2] takes every second character from the string, starting at index 2 and running to the end. So for the string "#aaccee", we get "ace" and "ace", which match. For the string "#123456", we get "135" and "246", which don't match.
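
To see the two strided slices side by side, here is a tiny sketch. The helper name stride_halves is hypothetical and exists only for this demonstration.

```python
def stride_halves(hex_string):
    # Characters at indices 1, 3, 5 and at indices 2, 4, 6 of '#rrggbb'.
    return hex_string[1::2], hex_string[2::2]

print(stride_halves('#aaccee'))  # ('ace', 'ace')
print(stride_halves('#123456'))  # ('135', '246')
```

The color can be shortened exactly when the two halves are equal.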

1 Comment

You are right. I forgot about this notation: '#aaccee' to '#ace'
1

Using pop on a list while iterating over it is always a bad idea, so this isn't an optimization but a bug fix. I also edited the regex so that strings like '#34j342' are no longer accepted:

>>> def to_short_hex(s):
...     matches = re.findall(r'#[\dabcdefABCDEF]{6}\b', s)
...     filtered = [m for m in matches if re.findall(r'#' + m[1] + '{6}', m)]
...     for m in filtered:
...         s = s.replace(m, m[:-3])
...     return s
... 
>>> to_short_hex('text #FFFFFF text #000000 #08088')
'text #FFF text #000 #08088'

Also, I think re.search is sufficient in the second re.
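
To make the pop problem concrete, here is a sketch that isolates the filtering step and uses re.search as suggested. The function names and the sample list are my own and only serve as an illustration.

```python
import re

def filter_with_pop(matches):
    # Mimics the question's loop: removing items while iterating skips
    # the element that slides into the removed slot.
    matches = list(matches)
    for m in matches:
        if not re.search(r'#' + m[1] + '{6}', m):
            matches.pop(matches.index(m))
    return matches

def filter_with_comprehension(matches):
    # Builds a new list instead of mutating the one being iterated.
    return [m for m in matches if re.search(r'#' + m[1] + '{6}', m)]

candidates = ['#123456', '#abcdef', '#FFFFFF']
print(filter_with_pop(candidates))            # ['#abcdef', '#FFFFFF']
print(filter_with_comprehension(candidates))  # ['#FFFFFF']
```

With pop, '#abcdef' is never examined, so it survives the filter and would later be cut down to '#abc'.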
