0

I currently use this piece of code to reduce a given text to a valid "tagging" format (only lowercase, a-z and minus allowed) by removing/replacing invalid characters

        $zip_filename = strtolower($original);
        $zip_filename = preg_replace("/[^a-zA-Z\-]/g", '-', $zip_filename); //replace invalid chars
        $zip_filename = preg_replace("/-+/g", '-', $zip_filename); // reduce consecutive minus to only one
        $zip_filename = preg_replace("/^-/g", '', $zip_filename); // removing leading minus
        $zip_filename = preg_replace("/-$/g", '', $zip_filename); // remove trailing minus

Any hints on how to put at least the regex into a single one?

Thanks for any advice!

1
  • Why wouldn't $zip_filename = preg_replace("/[^a-zA-Z\-]/g", '', $zip_filename); //replace invalid chars with nothing work? Commented Jul 7, 2010 at 20:57

3 Answers 3

3
$zip_filename = trim(preg_replace("/[^a-z]+/", '-', $zip_filename),'-');

Explanation:

  1. A-Z is useless since it should be lower case
  2. Adding + after right bracket will replace one or more consecutive invalid chars
  3. Using trim with second parameter - character to trim form beginning and end will speed up the code
  4. Removing \- from preg_replace will also take car of hyphens between invalid chars / multiple consecutive hyphens, replacing them to single one.
Sign up to request clarification or add additional context in comments.

5 Comments

That doesn't take care of multiple consecutive hyphens, I don't think.
@Platinum - You're right. It also doesn't take care of consecutive spaces, or leading or trailing dashes.
Platinum is right. 0ßDie--s §$"%& idsfg sdg$ will result in die--s-idsfg-sdg (one hyphen too much)
Ah, I like this better than mine now, especially since I'm not a php guy and didn't know about the optional parameter for the trim function. I do think you need to perform one more trim for spaces. The original code effectively eliminated leading and trailing spaces as well... I think that's the only thing yours doesn't do now.
Nevermind, you can disregard that. You're replacing spaces with dashes, and then trimming the dashes. It's a little bit redundant processing but not bad at all, and very easy to look at.
0

You'll only make the code more difficult to understand/maintain by combining these four operations into one.

You also don't need the complexity and performance hit of regular expression-based operations to achieve what you need.

The reduction of double to single minus signs may be more readily achieved with a looped call to str_replace:

while (substr_count($zip_filename, '--')) {
    $zip_filename = str_replace('--', '-', $zip_filename);
}

Wrapping this up in a well-named class method will abstract away any apparent complexity and aid the readability of the code.

The last two operations can be handled by the trim() function:

$zip_filename = trim($zip_filename, '-');

You can then replace your regular expression-based operations with something less cpu aggressive and arguably easier for others to understand:

//replace invalid chars
$zip_filename = preg_replace("/[^a-zA-Z\-]/g", '-', strtolower($original)); 

// reduce consecutive minus to only one
while (substr_count($zip_filename, '--')) {
    $zip_filename = str_replace('--', '-', $zip_filename);
}

// remove leading and trailing minus
$zip_filename = trim($zip_filename, '-'); 

Comments

0

This should simplify it considerably...

$zip_filename = trim(strtolower($original));
$zip_filename = preg_replace("/\s\s+|--+|[^a-zA-Z-]/g", '-', $zip_filename);

The trim will take care of the spaces before and after the string. Also note the \s\s+ and --+. These are more efficient at finding duplicates. They'll only match those characters if there are 2 or more in succession, therefore avoiding unnecessary replacement operations.

But technically it'd still be possible to have leading or trailing dashes. And for that you'd still need this...

$zip_filename = preg_replace("/^-|-$/g", '', $zip_filename);

(This last operation couldn't really be combined with the other since you're using a different replacement string.)

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.