
I've got very basic regex skills and have only used regex a couple of times for simple things. This has probably been asked before, for which I apologize, but I couldn't find an answer. I found similar questions and tried to adapt them, but to no avail. OK, to the question: how do I replace a space only between certain characters (double quotes in this case)?

Say I have the following string:

"mission podcast" modcast A B C "D E F"

I want to replace the spaces between mission and podcast as well as the ones between D, E & F whilst leaving the other ones untouched.

P.S. What if, instead of a space, I wanted to replace a string? An example of that is welcome as well.

Edit: I reworded this a bit; I hope it's clearer now. Edit 2: I need to do this on a string in PHP and execute it in the shell. Edit 3: I'm sorry I changed the whole question three times; I'm getting quite confused myself. Cheers!

  • So what have you tried so far? Commented Jul 2, 2013 at 17:23
  • Can you provide more examples other than find /vol_stor/8s8a912hj1 | grep ""mission\|podcast"" | grep "modcast"? At least two variants would help a lot. Commented Jul 2, 2013 at 17:27
  • Well, I haven't tried anything, since I don't know how to preserve part of the matched string and only replace what's within the words which are between the brackets. So, I am open to all kinds of suggestions :) Commented Jul 2, 2013 at 17:29
  • replace this regex (")("\w+)(.*?)(\w+")(") with $2 $4 Commented Jul 2, 2013 at 17:35
  • Why is PHP one of the tags? Commented Jul 2, 2013 at 18:07

3 Answers

Description

I would attack this problem by first splitting the string into groups of either quoted or not quoted strings.

Then iterate through the matches. If Capture Group 1 is populated, that string is quoted, so do a simple replace on Capture Group 0 (the whole match). If Capture Group 1 is not populated, the string is unquoted: keep it unchanged and move on to the next match.

On each iteration, you'd want to simply build up a new string.

Since splitting the string is the difficult part, I'd use this regex:

("[^"]*")|[^"]*
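The steps described above could be sketched roughly as follows. This is only an illustration of the split-then-rebuild idea: the helper name `replace_in_quotes` and its parameters are made up for this example and are not part of the original answer.

```php
<?php
// Hypothetical helper: name and signature are assumptions for illustration.
function replace_in_quotes(string $text, string $replacement): string
{
    // PREG_SET_ORDER yields one array per match:
    // [0] is the whole match, [1] is set only when the quoted alternative matched.
    preg_match_all('/("[^"]*")|[^"]*/', $text, $matches, PREG_SET_ORDER);

    $result = '';
    foreach ($matches as $match) {
        if (isset($match[1]) && $match[1] !== '') {
            // Capture Group 1 is populated: this chunk is quoted, so replace inside it.
            $result .= str_replace(' ', $replacement, $match[0]);
        } else {
            // Unquoted chunk (or the empty match at the end): copy it unchanged.
            $result .= $match[0];
        }
    }

    return $result;
}

echo replace_in_quotes('"mission podcast" modcast A B C "D E F"', '_'), "\n";
// prints: "mission_podcast" modcast A B C "D_E_F"
```

Note that a stray quote without a partner is matched by neither alternative, so this sketch assumes the quotes in the input are balanced.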


Example

Sample Text

"mission podcast" modcast A B C "D E F"

Code

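PHP Code Example:
<?php
// Sample text from above; the pattern splits it into quoted and unquoted chunks.
$sourcestring = '"mission podcast" modcast A B C "D E F"';
preg_match_all('/("[^"]*")|[^"]*/', $sourcestring, $matches);
echo "<pre>" . print_r($matches, true) . "</pre>";
?>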

Capture Groups

$matches Array:
(
    [0] => Array
        (
            [0] => "mission podcast"
            [1] =>  modcast A B C 
            [2] => "D E F"
            [3] => 
        )

    [1] => Array
        (
            [0] => "mission podcast"
            [1] => 
            [2] => "D E F"
            [3] => 
        )

)

PHP Example

This php script will replace only the spaces inside quoted strings.

Working example: http://ideone.com/jBytL3

Code

<?php

$text ='"mission podcast" modcast A B C "D E F"';

preg_match_all('/("[^"]*")|[^"]*/',$text,$matches);

foreach ($matches[0] as $entry) {
    echo preg_replace('/\s(?=.*?")/ims','~~new~~',$entry);
}

Output

"mission~~new~~podcast" modcast A B C "D~~new~~E~~new~~F"
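For readers who would rather avoid the explicit loop, the same idea can probably be expressed with `preg_replace_callback`, a standard PCRE function in PHP. The sketch below is not from the original answer; the helper name `replace_inside_quotes` and the `$search` / `$replacement` parameters are hypothetical. Because the callback uses `str_replace`, the thing being replaced can be any literal string, not just a space.

```php
<?php
// Hypothetical helper: name and parameters are assumptions for illustration.
function replace_inside_quotes(string $text, string $search, string $replacement): string
{
    // Match each double-quoted chunk and rewrite only that chunk.
    $out = preg_replace_callback(
        '/"[^"]*"/',
        function (array $m) use ($search, $replacement): string {
            return str_replace($search, $replacement, $m[0]);
        },
        $text
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $out ?? $text;
}

$text = '"mission podcast" modcast A B C "D E F"';
echo replace_inside_quotes($text, ' ', '~~new~~'), "\n";
// prints: "mission~~new~~podcast" modcast A B C "D~~new~~E~~new~~F"
```

Text after an unmatched opening quote is left untouched here, since the pattern only matches complete quoted chunks.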

8 Comments

Thank you very much for the answer and for taking the time to illustrate it! While this would do perfectly, I was hoping I could avoid splitting into arrays. Do you have any suggestion using preg_replace instead?
Actually, it was the foreach statement I was trying to avoid, but anyway, this works perfectly. Thank you very much!
What if the space was a string? Can I match a whole string?
Never mind, I figured it out. If any noob like me is wondering, the \s in echo preg_replace('/\s(?=.*?")/ims','~~new~~',$entry); should be replaced with the string you want to replace. Thanks very much, everyone!
Could you explain the (?=.*?") part in the last regex?

If you don't need to use regular expressions, here is an iterative version that works:

<?php
    function remove_quoted_whitespace($str) {
        $result = '';
        $length = strlen($str);
        $index = 0;
        $in_quotes = false;

        while ($index < $length) {
            $c = $str[$index++];

            if ($c == '"') {
                $in_quotes = !$in_quotes;
            } else if ($c == ' ') {
                if ($in_quotes) {
                    continue;
                }
            }

            $result .= $c;
        }

        return $result;
    }

    $input = '"mission podcast" modcast A B C "D E F"';
    $output = remove_quoted_whitespace($input);

    echo $input . "\n";
    echo $output . "\n";
?>
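The function above removes the spaces inside quotes rather than replacing them. If a replacement is wanted, as in the question, the same loop could take the replacement as a parameter. A minimal sketch, where `replace_quoted_spaces` and `$replacement` are names invented for this example:

```php
<?php
// Hypothetical variant of the loop above: name and parameter are assumptions.
function replace_quoted_spaces(string $str, string $replacement): string
{
    $result = '';
    $in_quotes = false;
    $length = strlen($str);

    for ($i = 0; $i < $length; $i++) {
        $c = $str[$i];

        if ($c === '"') {
            // Every double quote toggles the "inside quotes" state.
            $in_quotes = !$in_quotes;
        }

        // Swap spaces only while inside quotes; copy everything else as-is.
        $result .= ($in_quotes && $c === ' ') ? $replacement : $c;
    }

    return $result;
}

echo replace_quoted_spaces('"mission podcast" modcast A B C "D E F"', '-'), "\n";
// prints: "mission-podcast" modcast A B C "D-E-F"
```

Passing an empty string as the replacement gives the same result as the removal-only version.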

2 Comments

Yes, but isn't iterating more resource intensive? Doesn't it take longer?
Just ran a head-to-head test and (against my intuition) the regular expression implementation is indeed faster. I'm chalking it up to the difference between native code (the PCRE extension is implemented in C) vs. interpreted PHP code.

The entire foreach is not needed at all! It is possible to use a one-liner for this.

Here is the code that replaces spaces in quoted strings. The idea is that a space inside quotes is followed by an odd number of quotes. This can be checked with a regex look-ahead.

echo preg_replace('{\s+(?!([^"]*"[^"]*")*[^"]*$)}',"x",$str);

That's all! How does it work? It matches every run of \s characters that is not followed by an even number of quotes, which means an odd number of quotes remains before the end of the string. Each matching run is replaced by x. You can of course change it to any desired value or leave it empty.
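To see the look-ahead in action, here is a small sketch that runs the pattern on the sample string from the question and on a string with an unbalanced quote. The variable names are arbitrary; the pattern is copied unchanged from above.

```php
<?php
// Pattern copied from the one-liner above; {...} are the regex delimiters.
$pattern = '{\s+(?!([^"]*"[^"]*")*[^"]*$)}';

$str = '"mission podcast" modcast A B C "D E F"';
echo preg_replace($pattern, 'x', $str), "\n";
// prints: "missionxpodcast" modcast A B C "DxExF"

// With an unbalanced quote the odd/even count flips for everything before it,
// so the spaces outside the quote get replaced instead.
$broken = 'A B "C D';
echo preg_replace($pattern, 'x', $broken), "\n";
// prints: AxBx"C D
```

The second call illustrates why the approach relies on balanced quotes: the parity test counts quotes up to the end of the string, so one missing quote changes the result for every space before it.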

1 Comment

This stops working if a closing quote is missing somewhere. Imagine a string with multiple quoted segments where one closing quote is left off: the results go haywire.
