I need a function that recognizes all URLs inside a string, extracts each one for manipulation, and then rebuilds the original string with the modified URLs.

What I tried:

old_msg = 'This is an url https://ebay.to/3bxNNfj e this another one https://amzn.to/2QBsX7t'

def manipulate_url(url):
    # example of manipulation; the real code does query replacement, tags and other complex things
    if 'ebay' in url:
        new_url = url + "/another/path/"
    if 'amzn' in url:
        new_url = url + "/lalala/path/"
    return new_url

result = re.sub('http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', manipulate_url, old_msg)
print(result)

# expected result based on my example:
# This is an url https://ebay.to/3bxNNfj/another/path/ e this another one https://amzn.to/2QBsX7t/lalala/path/

But I get: TypeError: sequence item 1: expected str instance, re.Match found

  • Trying to run this code on a fresh Python 3 interpreter gives TypeError: argument of type '_sre.SRE_Match' is not iterable. (Commented Apr 6, 2020 at 9:50)

1 Answer

As the docs for re.sub say, the function you supply receives a match object, not a string.

To get the URL (the full match), call .group(0) on it, like this:

import re

old_msg = 'This is an url https://ebay.to/3bxNNfj e this another one https://amzn.to/2QBsX7t'

def manipulate_url(match):
    # re.sub passes a match object; group(0) is the full matched URL
    url = match.group(0)
    # example manipulation; the real code does query replacement, tags and more
    if 'ebay' in url:
        return url + "/another/path/"
    if 'amzn' in url:
        return url + "/lalala/path/"
    # return any other URL unchanged so there is always a string to return
    return url

# raw string so the backslashes in the pattern reach the regex engine untouched
result = re.sub(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', manipulate_url, old_msg)
print(result)

Output:

This is an url https://ebay.to/3bxNNfj/another/path/ e this another one https://amzn.to/2QBsX7t/lalala/path/
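
If manipulate_url is also called elsewhere with plain strings, one alternative is to keep it taking a str and unwrap the match at the call site instead. This is only a sketch under that assumption: the name URL_PATTERN and the choice to return URLs from other hosts unchanged are illustrative, not part of the original question.

```python
import re

# Same pattern as in the question, precompiled and written as a raw string.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)

def manipulate_url(url: str) -> str:
    # Takes and returns a plain string, so it can be reused outside re.sub.
    if 'ebay' in url:
        return url + "/another/path/"
    if 'amzn' in url:
        return url + "/lalala/path/"
    return url  # assumption: URLs from other hosts are left unchanged

old_msg = 'This is an url https://ebay.to/3bxNNfj e this another one https://amzn.to/2QBsX7t'

# The lambda receives the re.Match object and passes only the matched text along.
result = URL_PATTERN.sub(lambda m: manipulate_url(m.group(0)), old_msg)
print(result)
```

The lambda is the only place that knows about match objects, which keeps the URL logic easy to test on its own.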


2 Comments

Thank you for the explanation! It works as expected. By the way, my pattern for recognizing URLs works, but is there a better one, or can I keep using this? Thanks a lot.
Looks like the standard URL regex to me; I don't know of a better way.
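
One caveat with this pattern: the character class [$-_@.&+] contains the range from $ to _, which covers most ASCII punctuation, so a URL followed directly by a comma or a period will include that character in the match. A rough sketch of one way to handle it, assuming it is acceptable to peel off a fixed set of trailing characters; the names TRAILING and add_suffix and the '/suffix/' path are hypothetical.

```python
import re

# Same pattern as in the question, written as a raw string.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)

# Assumption: these characters never legitimately end a URL in your messages.
TRAILING = '.,;:!?)'

def add_suffix(match):
    raw = match.group(0)
    url = raw.rstrip(TRAILING)   # the URL without sentence punctuation
    tail = raw[len(url):]        # the punctuation that was stripped, if any
    return url + '/suffix/' + tail

msg = 'See https://ebay.to/3bxNNfj, or https://amzn.to/2QBsX7t.'

# The raw matches include the trailing ',' and '.'.
print(URL_PATTERN.findall(msg))
# The callback puts the punctuation back after the modified URL.
print(URL_PATTERN.sub(add_suffix, msg))
```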
