4

I need to use PHP to add a space between a period and the next word/letter when there is none.

For example, "This is a sentence.This is the next one." needs to become "This is a sentence. This is the next one." Notice the added space after the first period.

My problem is that even if I'm able to make a regular expression that finds every dot followed by a letter, how do I then replace that dot with a "dot + space" and keep the letter?

Also it needs to keep the case of the letter, lower or upper.

Thanks for your input.

2
  • Doing this via regex will lead to false results with things like `This is a sentence."And is a quote."` and `And this contains three dots...`, and in all other kinds of situations where a dot is valid but a following space is wrong. You can't, in fact, parse natural language correctly with regular expressions, and even more sophisticated tools have a very hard time with that. Commented May 19, 2010 at 14:42
  • 1
    While this is true, it will at least allow me to correct some of the most obvious mistakes. Commented May 20, 2010 at 0:29

3 Answers

9
$regex = '#\.(\w)#';
$string = preg_replace($regex, '. \1', $string);

If you want to capture more than just periods, you can do:

preg_replace('#(\.|,|\?|!)(\w)#', '\1 \2', $string);

Simply add the characters you want handled to the first () group. Don't forget to escape special characters (see http://us.php.net/manual/en/regexp.reference.meta.php).


4 Comments

Thank you very much! So basically the \1 is the variable containing the letter. Do you mind explaining this or pointing me somewhere I can better understand it?
\1 is what is captured in the first group (regex contained within the first group of parentheses). It's called a back reference.
That fails if you have something other than a word character after the dot, as in the string "Foo.-Bar".
What if I want to do this for periods or commas? Now I need two back references, right? How would that work? Thanks.
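To make the back-reference discussion above concrete, here is a minimal sketch using the two-group pattern from this answer. The input string and variable names are made up for illustration. It shows that `\1`/`\2` and `$1`/`$2` are equivalent spellings in the replacement string, and that the captured letter keeps its case because it is re-inserted unchanged.

```php
<?php
// Illustrative input; not from the original question.
$input = 'One.Two,three?Four!five';

// Group 1 is the punctuation mark, group 2 is the word character right after it.
$pattern = '#(\.|,|\?|!)(\w)#';

// \1 and \2 in the replacement re-insert whatever each group captured.
$withBackslash = preg_replace($pattern, '\1 \2', $input);

// $1 and $2 are an equivalent way to write the same references.
$withDollar = preg_replace($pattern, '$1 $2', $input);

echo $withBackslash, "\n"; // One. Two, three? Four! five
echo $withDollar, "\n";    // same result
```

Note that `\w` also matches digits and underscores, so a number such as "3.14" would become "3. 14" with this pattern.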
1
$str = "Will you please slow down?You're gonna kill someone.Seriously!";
echo preg_replace('/(!|\?|\.)([^\s\.\?!])/', '\1 \2', $str);

3 Comments

FYI, (!|\?|\.) can be written as [!?.], making it more efficient as well as more readable.
@Alan, you'd need to wrap the [] in () as well for the backreference to work ([!?.])...
@Alan Moore: True. But I wanted to see if ircmaxell would copy it verbatim (as you can see, I do use brackets in the second group, albeit with superfluous escapes). lol jk
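Putting the comments above together, the same replacement could be written with character classes, roughly as follows. This is only a sketch of the suggested simplification, and the `$fixed` variable is introduced here for illustration.

```php
<?php
$str = "Will you please slow down?You're gonna kill someone.Seriously!";

// [!?.] matches any one of the three marks; inside a character class
// none of them needs a backslash. The parentheses keep it as group 1.
// [^\s!?.] matches one character that is neither whitespace nor one of the marks.
$fixed = preg_replace('/([!?.])([^\s!?.])/', '\1 \2', $str);

echo $fixed, "\n"; // Will you please slow down? You're gonna kill someone. Seriously!
```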
0
$str = "This is a sentence.This is the next one.";
echo preg_replace("#\.(\S)#", '. \1', $str);

2 Comments

That will erroneously turn "this..." into "this. .."
@ircmaxell, it's just what the OP asked for: add a space after a period if the following character isn't a space (I used a non-whitespace match, though). There are probably a lot of other situations where it won't work either, URIs for example.
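One possible way to avoid the "this..." case raised above is to use lookarounds instead of a capture group. The sketch below is not from any of the answers, and the function name `space_after_lone_period` is hypothetical. It only touches a period that is not preceded by another period and is directly followed by a letter.

```php
<?php
// Hypothetical helper: only a lone period directly followed by a letter gets a space.
function space_after_lone_period(string $text): string
{
    // (?<!\.) : the period is not preceded by another period (skips "...")
    // (?=\pL) : the next character is a letter; the lookahead does not consume it,
    //           so the letter and its case are left untouched
    // The /u modifier treats the subject as UTF-8; preg_replace returns null
    // on invalid input, in which case the original text is returned unchanged.
    return preg_replace('/(?<!\.)\.(?=\pL)/u', '. ', $text) ?? $text;
}

echo space_after_lone_period('This is a sentence.This is the next one.'), "\n";
echo space_after_lone_period('this...'), "\n";
```

This still mangles URIs and abbreviations ("example.com" becomes "example. com"), so it narrows the false positives mentioned in the comments without eliminating them.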
