The common way to collapse runs of whitespace into a single space is a regular expression like this:

preg_replace('/\s+/',' ',$str);

However, regex tends to be slow because it has to load the regular expression engine. Are there non-regex methods to do this?

  • Regexes aren't that slow. I doubt you can find a faster method, especially if the whitespace characters differ (space, newline, tab, etc.). Commented Feb 24, 2012 at 14:49
  • Anything besides a regular expression would be a hack, and therefore likely slower. Using the tools that come with the language is probably the best idea. You said regex tends to be slow. Have you benchmarked it? Unless you're dealing with huge amounts of data, you probably won't notice it. Commented Feb 24, 2012 at 14:53
  • You can iterate over all the characters and, once you find a whitespace character, remove any that follow it consecutively, but I think this method is slower than regex. Commented Feb 24, 2012 at 14:56
  • @TecBrat wouldn't a larger data set mean it'd be faster, as the question's main concern is the time to load the regex engine? I realize that it's an assumption. But does it seem like the concern here is a large number of small requests, each potentially having to load additional code for a small data set? Commented Feb 24, 2012 at 14:58
  • @Chris, I think I misunderstood the point. I think the point you are making is analogous to a photocopier's first-copy speed versus its pages per minute. My point still stands, though: benchmarking is a good way to know whether a workaround is even needed. If the regex is already written, you might as well try it out. Commented Feb 25, 2012 at 3:57
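
The benchmarking suggested in the comments could be sketched roughly as follows. This is a minimal sketch, not a rigorous benchmark: the helper name timeIt, the sample string, and the iteration count are all made up for illustration, and timings will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: calls $fn $iterations times and returns elapsed seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Made-up sample input; swap in data that resembles your real workload.
$sample = str_repeat("foo   bar \t\n baz  ", 1000);

// The regex from the question, wrapped so it can be timed.
$regex = function () use ($sample) {
    return preg_replace('/\s+/', ' ', $sample);
};

$seconds = timeIt($regex, 100);
printf("preg_replace: %.4f s for 100 runs\n", $seconds);
```

To compare against a non-regex candidate, wrap it in a closure the same way and time it with the same sample and iteration count.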

3 Answers

Try:

while(false !== strpos($string, '  ')) {
    $string = str_replace('  ', ' ', $string);
}

1 Comment

@Bill nor any other type of white space for that matter.
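
As the comment notes, the loop above only collapses the space character. One possible way to extend it, sketched here under the assumption that tab, carriage return, and newline are the only other whitespace characters that matter, is to map those to spaces with strtr first. The function name collapseWhitespace is hypothetical, and the regex class \s can match additional characters (such as form feed) that this sketch ignores.

```php
<?php
// Hypothetical helper, not part of the original answer.
function collapseWhitespace(string $string): string
{
    // Assumption: besides the space, only tab, CR and LF need handling.
    // strtr maps each of the three characters to a space.
    $string = strtr($string, "\t\r\n", '   ');

    // Same loop as in the answer: halve runs of spaces until none remain.
    while (false !== strpos($string, '  ')) {
        $string = str_replace('  ', ' ', $string);
    }

    return $string;
}

echo collapseWhitespace("a \t\n b\r\n\r\nc"), "\n"; // prints "a b c"
```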

Update

function replaceWhitespace($str) {
  $result = $str;

  // Every ordered pair of whitespace characters (space, tab, CR, LF).
  foreach (array(
      "  ", " \t",  " \r",  " \n",
    "\t\t", "\t ", "\t\r", "\t\n",
    "\r\r", "\r ", "\r\t", "\r\n",
    "\n\n", "\n ", "\n\t", "\n\r",
  ) as $replacement) {
    // Collapse each pair to its first character.
    $result = str_replace($replacement, $replacement[0], $result);
  }

  // Repeat until a pass changes nothing.
  return $str !== $result ? replaceWhitespace($result) : $result;
}

compared to:

preg_replace('/(\s)\s+/', '$1', $str);

The handmade function runs roughly 15% faster on very long (300 KB+) strings, on my machine at least.

4 Comments

Anyone got some time to test this out for performance comparison? Assuming purely spaces and not worrying about other whitespace.
Interesting. If I'm reading that right, the regex takes up to 6 times longer. Loading the regex engine is less of an issue than preg_replace being slower overall (again, specifically in the case of just spaces).
@Chris Not quite, the preg_replace already takes care of different types of whitespace. If you replace the pattern with / +/, it runs as fast as the str_replace construct.
@Yoshi your code is a nice test, but if you change the sample, the timing ratio will change: codepad.org/OP5n6Lca
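
For a like-for-like comparison, both approaches have to do the same job. A small sketch, using a made-up sample string that contains only spaces as whitespace, shows that the space-only pattern / +/ and the str_replace loop from the first answer produce the same output, so timing those two against each other is a fair test:

```php
<?php
// Made-up sample containing only space characters as whitespace.
$sample = 'a  b     c d   e';

// Space-only regex, as suggested in the comment.
$viaRegex = preg_replace('/ +/', ' ', $sample);

// Space-only loop from the first answer.
$viaLoop = $sample;
while (false !== strpos($viaLoop, '  ')) {
    $viaLoop = str_replace('  ', ' ', $viaLoop);
}

var_dump($viaRegex === $viaLoop); // bool(true)
```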

Well, you could use the trim or str_replace functions provided by PHP.
