
I'm trying to insert a tab (\t) into a string just before each match of a regex: before "x days ago", where x is a number from 0 to 999.

The text I have looks like this:

Great product, fast shipping! 22 days ago anon
Fast shipping. Got an extra free! Thanks! 42 days ago anon

Desired output:

Great product, fast shipping! \t 22 days ago anon
Fast shipping. Got an extra free! Thanks! \t 42 days ago anon

I am still new to this, and I'm struggling. I've looked around for answers, and found some that are close, but none that are identical.

This is what I have so far:

import re
text = 'Great product, fast shipping! 22 days ago anon'
new_text = re.sub(r"\d+ days ago", "\t \d+", text)
print new_text

Output:

Great product, fast shipping!    \d+ anon

Again, what I need is (note the \t):

Great product, fast shipping!    22 days ago anon

4 Answers

3

You can use backreferences in your replacement string. Put parentheses around the \d+ days ago to make it a capturing group, and use \\1 inside your replacement to refer to the text that group matched:

>>> text = 'Great product, fast shipping! 22 days ago anon'
>>> new_text = re.sub(r"(\d+ days ago)", "\t\\1", text)
>>> print new_text
Great product, fast shipping!    22 days ago anon
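The capture-group approach can also be sketched in Python 3 syntax with a raw replacement string. This is only an illustration: the names DAYS_AGO and add_tab are hypothetical, and the \d{1,3} bound assumes x really is limited to 0-999 as the question states.

```python
import re

# Assumption: "x" is 1 to 3 digits (0-999), as the question states.
# \b keeps the pattern from matching the tail of a longer number.
DAYS_AGO = re.compile(r"\b(\d{1,3} days ago)")

def add_tab(text):
    # In the replacement, re.sub turns \t into a tab and \1 into group 1.
    return DAYS_AGO.sub(r"\t\1", text)

reviews = [
    "Great product, fast shipping! 22 days ago anon",
    "Fast shipping. Got an extra free! Thanks! 42 days ago anon",
]
for review in reviews:
    print(repr(add_tab(review)))
```

With a raw string the backreference is written \1 rather than \\1, which some people find easier to read.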

1

Your replacement string contained a regex pattern (\d+), but the replacement is not a pattern; all you needed was a \1 backreference.

In order to just insert a tab before n days ago, you can use a look-ahead, and replace the captured number with a \t\1:

import re
p = re.compile(ur'(\d+)(?=\s+days\s+ago)')
test_str = u"Great product, fast shipping! 22 days ago anon\nFast shipping. Got an extra free! Thanks! 42 days ago anon"
subst = u"\t\\1"
print re.sub(p, subst, test_str)

Result:

Great product, fast shipping!   22 days ago anon
Fast shipping. Got an extra free! Thanks!   42 days ago anon
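As a variation on the same lookahead idea, the replacement can be a function instead of a string, which sidesteps backslash escaping in the replacement entirely. This is a sketch in Python 3 syntax; the names NUMBER_BEFORE_DAYS_AGO and tab_before_count are made up for illustration.

```python
import re

# Capture only the number; the lookahead checks that "days ago" follows
# without consuming it, so that text is left in place untouched.
NUMBER_BEFORE_DAYS_AGO = re.compile(r"(\d+)(?=\s+days\s+ago)")

def tab_before_count(text):
    # The callable receives each match object and returns its replacement.
    return NUMBER_BEFORE_DAYS_AGO.sub(lambda m: "\t" + m.group(1), text)

test_str = (
    "Great product, fast shipping! 22 days ago anon\n"
    "Fast shipping. Got an extra free! Thanks! 42 days ago anon"
)
print(tab_before_count(test_str))
```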


1

You can use a lookahead to do a zero-width insertion, with a literal ' ' in the pattern to match the leading space:

>>> import re
>>> txt='''\
... Great product, fast shipping! 22 days ago anon
... Fast shipping. Got an extra free! Thanks! 42 days ago anon'''
>>> repr(re.sub(r' (?=\d+)', ' \t', txt))
"'Great product, fast shipping! \\t22 days ago anon\\nFast shipping. Got an extra free! Thanks! \\t42 days ago anon'"

Note that every match of ' \d+' becomes ' \t\d+', which is what I think you are after.

If you want to limit it to ' \d+ days ago', just add that to the lookahead:

>>> txt='''\
... Great product, fast shipping! 22 days ago anon
... Fast shipping. Got an extra free! Thanks! 42 weeks ago anon'''
>>> repr(re.sub(r' (?=\d+ days ago)', ' \t', txt))
"'Great product, fast shipping! \\t22 days ago anon\\nFast shipping. Got an extra free! Thanks! 42 weeks ago anon'"

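If the pattern should consume nothing at all, not even the space, a fully zero-width version is possible, with one catch: a bare (?=\d+ days ago) also succeeds between the two digits of "22", because "2 days ago" satisfies the lookahead too. A word boundary in front avoids that. The sketch below is in Python 3 syntax, and the names BEFORE_COUNT and insert_tab are hypothetical.

```python
import re

# \b stops the empty match from also firing between the digits of "22".
BEFORE_COUNT = re.compile(r"\b(?=\d+ days ago)")

def insert_tab(text):
    # Every match is empty, so the tab is inserted and nothing is removed.
    return BEFORE_COUNT.sub("\t", text)

txt = (
    "Great product, fast shipping! 22 days ago anon\n"
    "Fast shipping. Got an extra free! Thanks! 42 weeks ago anon"
)
print(repr(insert_tab(txt)))
```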

0

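You can find the index where the match starts and splice a tab in there. Note that this handles only the first match and raises AttributeError if there is no match:

# \d+ rather than \d, so the index lands on the first digit of a multi-digit count
Tabindex = re.search(r"\d+ days ago", text).start()
text = text[:Tabindex] + '\t' + text[Tabindex:]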
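A slightly more defensive take on the same index-splicing idea might look like the following. It is a sketch in Python 3 syntax: the function name and its pattern parameter are hypothetical, and it uses \d+ so that the index lands on the first digit of a multi-digit count.

```python
import re

def insert_tab_before_first(text, pattern=r"\d+ days ago"):
    # re.search returns None when nothing matches, so check before .start().
    match = re.search(pattern, text)
    if match is None:
        return text  # leave the text unchanged
    i = match.start()
    return text[:i] + "\t" + text[i:]

print(repr(insert_tab_before_first("Great product, fast shipping! 22 days ago anon")))
```

Like the original snippet, this only touches the first occurrence; for several occurrences in one string, re.sub is the simpler tool.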
