0

I'm trying to have Python extract some text from a URL string.

Here's an example URL: https://somewebsite/images/products/SkuName/genricFileName.jpg

The SkuName always comes after the 5th "/" and ends at the 6th "/".

I would like to extract 'SkuName'.

import urllib.request

images = input('please enter url list separated by ","')
names = input('please enter images names separated by ","')

images = images.split(',')
names =  names.split(',')

for index, image in enumerate(images):
    urllib.request.urlretrieve(image, "images/{}.jpg".format(names[index])) 
print('images downloaded successfully')   

As you can see, the user has to manually enter the SKU name (stored in the variable 'names').

I would like the user to enter only one input (the URL list) and have Python automatically extract the SkuName from each URL string.

Thanks!

4 Answers

1

If you're sure that the (absolute) position of the name in the URL won't change, then url.split('/')[5] should solve your problem.
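As a rough sketch of how this could slot into the download loop from the question: the helper names `sku_from_url` and `download_all` are made up for illustration, and the code assumes every URL has the SkuName at index 5 and that the target folder already exists.

```python
import urllib.request


def sku_from_url(url):
    # Splitting the example URL on "/" gives:
    # ['https:', '', 'somewebsite', 'images', 'products', 'SkuName', 'genricFileName.jpg']
    # so index 5 is the segment between the 5th and 6th "/".
    return url.strip().split('/')[5]


def download_all(urls, folder="images"):
    # Hypothetical helper: saves each image as <folder>/<SkuName>.jpg.
    # The folder must already exist; urlretrieve does not create it.
    for url in urls:
        url = url.strip()
        target = "{}/{}.jpg".format(folder, sku_from_url(url))
        urllib.request.urlretrieve(url, target)


# Example call (performs real downloads, so it is left commented out):
# download_all(input('please enter url list separated by ","').split(','))
```

Note that an index such as `[5]` returns a single string, whereas a slice such as `[:-2]` returns a list, which is not what the file name needs.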

1 Comment

I'm not sure how to implement that in my code: for index, image in enumerate(images): urllib.request.urlretrieve(image, "temp/{}.jpg".format(image.split('/')[:-2]))
1

You can do it using a Python regex. Note: adjust the pattern to match your URL.

import re
url = 'https://somewebsite/images/products/SkuName/genricFileName.jpg'
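# Lookbehind/lookahead anchor the match between the fixed prefix and the file name; the dot is escaped to match literally
pattern = re.compile(r'(?<=https://somewebsite/images/products/).*(?=/genricFileName\.jpg)', re.I)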
sku_name = pattern.search(url).group()
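If the file name differs from one URL to the next, a pattern tied to `genricFileName.jpg` will not match. One possible variation, assuming only that the SkuName is the segment right after `/products/`, is to capture that segment with a group. The names `SKU_PATTERN` and `sku_from_url` are hypothetical:

```python
import re

# Capture everything between "/products/" and the next "/".
SKU_PATTERN = re.compile(r'/products/([^/]+)/')


def sku_from_url(url):
    # Returns the captured segment, or None when the URL does not match.
    match = SKU_PATTERN.search(url)
    return match.group(1) if match else None


sku_name = sku_from_url('https://somewebsite/images/products/SkuName/genricFileName.jpg')
```

Returning None instead of raising lets the caller decide how to handle a URL that does not follow the expected layout.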

0

If that format is fixed, you can just split the URL and access the second-to-last element of the resulting list:

url = "https://somewebsite/images/products/SkuName/genricFileName.jpg"
skuName = url.split("/")[-2]
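A plain split can pick the wrong segment if a URL ever carries a query string containing "/". A slightly more defensive sketch, assuming the SkuName is always the folder directly above the file name, parses the URL first with the standard library's `urllib.parse.urlsplit` and splits only the path. The function name `sku_from_url` is made up for illustration:

```python
from urllib.parse import urlsplit


def sku_from_url(url):
    # urlsplit separates the path from the scheme, host, query and fragment,
    # so only '/images/products/SkuName/genricFileName.jpg' gets split.
    path = urlsplit(url.strip()).path
    return path.split('/')[-2]


sku_name = sku_from_url("https://somewebsite/images/products/SkuName/genricFileName.jpg")
```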

0

You seem to be aware of the split function already. You can use that, in combination with indexing, to get what you need.

skuName = input('url').split('/')[-2]

This will yield the second-to-last element in the list. You could also get the 6th element (index 5) by using:

skuName = input('url').split('/')[5]
