4

I have a string like this:

a = "\"java jobs in delhi\" delhi"

I want to replace delhi with "", but only the delhi that lies outside the double quotes. The output should look like this:

"\"java jobs in delhi\""

This is only a sample string. The substring will not necessarily be "delhi", and the substring to replace can occur anywhere in the input string. The order and number of quoted and unquoted parts in the string are not fixed.

.replace() replaces both delhi substrings. I can't use rstrip either, as the substring won't necessarily appear at the end of the string. How can I do this?

7
  • Do you want to do this multiple times, or just once? Because you could select the substring by doing a[0:-6]. Commented Jul 6, 2015 at 10:03
  • Have you considered a regular expression? Commented Jul 6, 2015 at 10:05
  • This is just a sample string. I feel regex is the way to go but I could not generate the regex for this. The string could be like "\"java jobs in pune\" pune" as well. So, I am basically looking for a generic solution. Commented Jul 6, 2015 at 10:12
  • Do you want to remove everything after the third "? Commented Jul 6, 2015 at 10:13
  • Will the names always occur at the end of the string/line or can they be before it too? Commented Jul 6, 2015 at 10:13

3 Answers

3

Use re.sub

>>> import re
>>> a = "\"java jobs in delhi\" delhi"
>>> re.sub(r'\bdelhi\b(?=(?:"[^"]*"|[^"])*$)', r'', a)
'"java jobs in delhi" '
>>> re.sub(r'\bdelhi\b(?=(?:"[^"]*"|[^"])*$)', r'', a).strip()
'"java jobs in delhi"'
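For readers unfamiliar with the lookahead, here is the same pattern spelled out with re.VERBOSE. This is only an illustrative sketch; the name OUTSIDE_QUOTES is made up for this example. The idea is that a match is accepted only when everything after it, up to the end of the string, is made of complete quoted sections and non-quote characters, which means the match itself cannot sit inside an open quote.

```python
import re

# OUTSIDE_QUOTES is a hypothetical name used only for this illustration.
OUTSIDE_QUOTES = re.compile(r'''
    \bdelhi\b          # the word to remove
    (?=                # accept it only if the rest of the string ...
        (?: "[^"]*"    # ... is made of complete quoted sections
          | [^"]       # ... and single non-quote characters
        )*
        $              # ... all the way to the end
    )
''', re.VERBOSE)

a = '"java jobs in delhi" delhi'
print(OUTSIDE_QUOTES.sub('', a).strip())   # prints: "java jobs in delhi"
```

This relies on the quotes in the input being balanced; with a stray unmatched quote the lookahead will misjudge what counts as "inside".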

OR

>>> re.sub(r'("[^"]*")|delhi', lambda m: m.group(1) if m.group(1) else "", a)
'"java jobs in delhi" '
>>> re.sub(r'("[^"]*")|delhi', lambda m: m.group(1) if m.group(1) else "", a).strip()
'"java jobs in delhi"'
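Since the question says the substring is not always "delhi", the alternation approach can be wrapped in a small function. This is a minimal sketch: remove_outside_quotes is a hypothetical helper name, and it assumes the word consists of ordinary word characters (so that \b behaves as expected) and that the quotes are balanced.

```python
import re

def remove_outside_quotes(text, word):
    # Hypothetical helper, not part of the original answer.
    # Alternative 1 captures a whole quoted section; alternative 2 is the word.
    pattern = re.compile(r'("[^"]*")|\b' + re.escape(word) + r'\b')
    # If group 1 matched, put the quoted section back unchanged;
    # otherwise the bare word matched and is replaced with nothing.
    return pattern.sub(lambda m: m.group(1) or "", text)

print(remove_outside_quotes('"java jobs in delhi" delhi', "delhi").strip())
# prints: "java jobs in delhi"
print(remove_outside_quotes('pune "java jobs in pune" pune', "pune").strip())
# prints: "java jobs in pune"
```

re.escape keeps any regex metacharacters in the word from being interpreted as part of the pattern.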

0

As a general way you can use re.split and a list comprehension :

>>> import re
>>> a = "\"java jobs in delhi\" delhi \"another text\" and this"
>>> sp = re.split(r'(\"[^"]*?\")', a)
>>> ''.join([i if '"' in i else i.replace('delhi', '') for i in sp])
'"java jobs in delhi"  "another text" and this'

The re.split() function splits the text on the substrings enclosed in double quotes. Because the pattern is a capturing group, the quoted parts are kept in the result:

['', '"java jobs in delhi"', ' delhi ', '"another text"', ' and this']

You can then replace delhi only in the pieces that are not enclosed in double quotes.
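The split-and-rejoin idea can also be written as a reusable function for any substring. This is a sketch under stated assumptions: replace_unquoted is a hypothetical name, and it relies on the fact that re.split with a single capturing group returns the captured (quoted) pieces at the odd indexes of the result.

```python
import re

def replace_unquoted(text, old, new=""):
    # Hypothetical helper illustrating the split approach.
    # With one capturing group, re.split alternates:
    # unquoted piece, quoted piece, unquoted piece, ...
    parts = re.split(r'("[^"]*")', text)
    return "".join(
        part if index % 2 else part.replace(old, new)  # odd index = quoted, keep as is
        for index, part in enumerate(parts)
    )

a = '"java jobs in delhi" delhi "another text" and this'
print(replace_unquoted(a, "delhi"))
# prints: "java jobs in delhi"  "another text" and this
```

Checking the index rather than testing for a quote character in each piece avoids treating a piece with a stray unmatched quote as if it were quoted.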


0

Here is another alternative. This is a generic solution to remove any unquoted text:

def only_quoted_text(text):
    output = []
    in_quotes = False  # True while we are between an opening and a closing quote

    for letter in text:
        if letter == '"':
            # Every quote toggles the state and is itself kept
            in_quotes = not in_quotes
            output.append(letter)
        elif in_quotes:
            output.append(letter)

    return "".join(output)


a = "list of \"java jobs in delhi\" delhi and \" python jobs in mumbai \" mumbai"

print(only_quoted_text(a))

The output would be:

"java jobs in delhi"" python jobs in mumbai "

If the final closing quote is missing, the text after the last opening quote is still included in the output.
