0

I have a list of URLs that each end with a varying number after an & sign. I can't get a regex to remove that number (including the &) from the URL: the string contains several & characters, and re.sub('&\d*',"",x) strips all of them, not just the final one I want to remove.

The url is: http://helloworld.com?p1=123&p2=987&hello=world&123456

The desired output I want is: http://helloworld.com?p1=123&p2=987&hello=world

1
  • Also as a side note I think you mean to be using forward slashes in your urls like this http:// not back slashes. Commented Nov 15, 2014 at 8:55

2 Answers

3

You can use an anchored pattern if you always want the last parameter:

re.sub(r'&\d+$',"",x)

The important piece is the dollar sign which says to only match at the end.

Also keep in mind that * can match the empty string, so &\d* matches a bare & with no digits after it. If you want to require at least one digit, use + instead.
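As a minimal sketch of the anchored pattern applied to a whole list (the second URL and the names urls and cleaned are made up for illustration), a URL without a trailing number is left untouched:

```python
import re

# Hypothetical sample list; the first entry is the URL from the question.
urls = [
    "http://helloworld.com?p1=123&p2=987&hello=world&123456",
    "http://helloworld.com?p1=123&p2=987&hello=world",  # no trailing number
]

# "&" followed by one or more digits, only at the very end of the string.
cleaned = [re.sub(r"&\d+$", "", u) for u in urls]

for u in cleaned:
    print(u)
```

Both entries should come out as http://helloworld.com?p1=123&p2=987&hello=world.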


1 Comment

@Kasra your answer is wrong because you still may match something in the middle of the string.
2

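You need + after \d so that at least one digit is required. With &\d*, the pattern also matches the & characters in the middle of the string, because \d* can match zero digits. You also need a $ to specify that your pattern must be at the end of the string: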

'http:\\helloworld.com?p1=123&p2=987&hello=world&123456'
                             ^

So use re.sub(r'(&\d+)$',"",x) instead of your pattern.
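To see where each pattern matches, here is a small sketch using re.findall on the URL from the question (the variable names are only for illustration):

```python
import re

x = "http://helloworld.com?p1=123&p2=987&hello=world&123456"

# Unanchored, zero-or-more digits: every "&" matches, with or without digits.
unanchored = re.findall(r"&\d*", x)

# So substituting with it removes all three "&" characters.
stripped = re.sub(r"&\d*", "", x)

# Anchored, one-or-more digits: only the trailing "&123456" matches.
anchored = re.findall(r"&\d+$", x)

print(unanchored)  # ['&', '&', '&123456']
print(stripped)    # http://helloworld.com?p1=123p2=987hello=world
print(anchored)    # ['&123456']
```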

1 Comment

In general it is best to use raw strings when specifying regexes using r''
