I have a string that may contain occurrences like:

http://site/image.jpg

What is the right way to replace such an occurrence, when it is present, with

<img src="http://site/image.jpg">

What really matters is that only occurrences beginning with http and ending with .jpg, .png or .gif are replaced with the <img> HTML tag.

So if the text contains a URL pointing to an image, that URL is wrapped in an HTML tag that displays the image.

  • You should look into regular expressions, which can be used to work with strings. (Commented Aug 9, 2017 at 12:44)

2 Answers

Pretty straightforward with regex:

import re

string = 'some other text, a URL http://site/image.jpg and other text'

print(re.sub(r'(https?.+?(?:jpg|png|gif))', r'<img src="\1">', string))

# some other text, a URL <img src="http://site/image.jpg"> and other text

(https?.+?(?:jpg|png|gif)) matches the shortest run of text that starts with http or https and ends with jpg, png or gif.

In the replacement '<img src="\1">', \1 refers to the first (and only) capture group of the regex, which contains the image URL.
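The pattern above is deliberately loose: it accepts any characters, including spaces, between http and the extension. As a hedged sketch of a stricter variant (the name embed_images and the exact pattern are my own assumptions, not part of the original answer), the version below requires "://", forbids whitespace inside the URL, and requires a literal dot before the extension. It is still not a full URL validator, and it does not HTML-escape the URL.

```python
import re

# Assumed stricter pattern (illustrative only):
#   https?://        scheme followed by "://"
#   \S+?             shortest run of non-whitespace characters
#   \.(?:jpg|png|gif)  a literal dot and one of the three extensions
#   \b               the extension must end at a word boundary
IMAGE_URL = re.compile(r'(https?://\S+?\.(?:jpg|png|gif))\b', re.IGNORECASE)


def embed_images(text):
    """Wrap every image URL found in text in an <img> tag."""
    return IMAGE_URL.sub(r'<img src="\1">', text)


text = 'first http://site/a.jpg then words and https://site/b.PNG end'
print(embed_images(text))
```

Because whitespace is excluded, two links separated by ordinary text are matched separately rather than as one long span.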

5 Comments

Unfortunately your regexp matches literally everything that starts with http and ends with .jpg/png/gif, so if there is more than one link separated by some text, that text will also be captured. Please check this example: regex101.com/r/eYVuXF/1
@erhesto Ok, fixed. Was only a matter of changing .+ to the lazy .+? regex101.com/r/eYVuXF/2
Still not a totally comfortable situation, as things like the example at regex101.com/r/eYVuXF/3 are captured, but maybe it will be enough for OP, so I'm removing my downvote.
@erhesto Well, OP did not ask for a url verifier.
Thanks everyone for your help. I think I will go for a solution using AJAX/jQuery.

This is a simple answer to your question:

def check_if_image(url, image_extensions):
    # Accept only http:// or https:// URLs that end with one of the extensions.
    if url.startswith(("http://", "https://")):
        return url.endswith(tuple(image_extensions))
    return False

def main():
    url_seed = ["http://somesite.com/img1.jpg", "https://somesite2.com/img2.gif",
                "http://somesite3.net/img3.png", "http://noimagesite.com/noimage"]
    image_extensions = [".jpg", ".png", ".gif"]

    # Build an <img> tag for every URL that points to an image.
    final_result = []
    for site in url_seed:
        if check_if_image(site, image_extensions):
            final_result.append('<img src="%s">' % site)
    print(final_result)

if __name__ == "__main__":
    main()

This checks for both "http" and "https" URLs and handles the three-character image extensions you asked about: jpg, gif and png.

Hope it helped. Feel free to ask if you have any questions.

Edit: I didn't notice that you don't already have the URLs in a data structure, so this doesn't apply to your situation.
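When the URLs sit inside free text rather than in a list, the same startswith/endswith check could be applied word by word. The sketch below is only an illustration under that assumption: is_image_url and embed_images_in_text are hypothetical names, the text is split on single spaces, and a URL immediately followed by punctuation (for example "image.jpg.") would not be recognised.

```python
def is_image_url(token, image_extensions=(".jpg", ".png", ".gif")):
    """Return True for an http(s) URL ending in one of the extensions."""
    return (token.startswith(("http://", "https://"))
            and token.lower().endswith(image_extensions))


def embed_images_in_text(text):
    """Replace every space-separated image URL in text with an <img> tag."""
    words = text.split(" ")
    return " ".join('<img src="%s">' % word if is_image_url(word) else word
                    for word in words)


print(embed_images_in_text("my avatar is http://site/image.jpg and that is all"))
```

This avoids regular expressions entirely, at the cost of being stricter about how URLs are delimited in the text.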

6 Comments

You can use startswith("http") or startswith("https") in check_if_image. It's faster than a regex and it's a good way to do it.
Thank you for the feedback. Since I'm switching languages a lot I don't always know the built-ins of them. Will take note :)
Python is a great language, but regex is a little slower than the built-in functions. Sometimes you can take a different route than regex :D
Already updated my answer according to your tip. Even if it's not the solution for this question, it might help someone else with a similar question when googling. Thank you
My URLs are in a string like "Hi, my name is Clement and my avatar is image.jpg and there is another picture: image2.png".