
I am having trouble finding some data on this page. I need the link of the main image and the links of the sub-images. I also need the text under the two tabs "Ingrédients et allergènes" and "Mode d'emploi et conservation". It seems to me that these are iframes (or the same iframe), but everything I have tried returns an error. Any help will be much appreciated.

Thanks in advance

Edit: Here is an example of code that does not work:

from selenium import webdriver

browser = webdriver.Firefox()
link = 'https://naturalia.fr/sardines-naturel-95g'
browser.get(link)

try:
    browser.find_element_by_xpath('''//*[@id="tab-label-ingredients_info-title"]''').click()
    descr = browser.find_element_by_class_name('cms-content')
    print('Description2: {}'.format(descr.text))
except Exception as e:
    print(e)

try:
    main_img = browser.find_element_by_xpath('''//*[@id="maincontent"]/div[2]/div[2]/div[2]/div/div[2]/div[2]/div[1]/div[3]/div[1]/img''').get_attribute('src')
    print(main_img)
except Exception as e:
    print(e)

  • Can you post what exactly you have tried so we can help going from there? Commented Aug 12, 2017 at 17:54
  • For example, I tried finding the image by class name: "fotorama__stage" or even "fotorama__stage__frame fotorama_vertical_ratio fotorama__loaded fotorama__loaded--img magnify-wheel-loaded fotorama__active". I also tried XPath, using Chrome's "Copy XPath" option, but all of these raise an exception. Commented Aug 12, 2017 at 18:03

1 Answer


You can find the image URL by locating the img tag with XPath and then reading its src attribute:

>>> driver.find_element_by_xpath('''//*[@id="maincontent"]/div[2]/div[2]/div[2]/div/div[2]/div[2]/div[1]/div[3]/div[1]/img''').get_attribute('src')
'https://naturalia.fr/media/catalog/product/cache/image/368x414/e9c3970ab036de70892d86c6d221abfe/3/2/3263670138016.1-0001.jpg'

For the text under the tabs, first click the tab and then extract the text from the element with the class "cms-content":

>>> driver.find_element_by_xpath('''//*[@id="tab-label-ingredients_info-title"]''').click()
>>> mytext = driver.find_element_by_class_name("cms-content").text
>>> print(mytext)

Sardines, eau, citron* (pulpe, zeste et jus), sel de mer, thym*, fenouil*, persil*, laurier*.
*3.5% des ingrédients d'origine agricole sont issus de l’agriculture biologique certifié par FR BIO 10
Valeurs nutritionnelles moyennes Pour 100g
Energie 136 Kcal / 572 KJ
Matières grasses 4,9 g
Dont acides gras
(......)
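
If the lookups above raise "Unable to locate element", one possible cause (an assumption, not confirmed in this thread) is that the page fills in its content with JavaScript after get() returns. A minimal sketch of the same click-then-read step with an explicit wait follows. It uses WebDriverWait and expected_conditions, which also work on recent Selenium 4 releases where the find_element_by_* helpers are no longer available. The helper name read_tab_text and the 10-second timeout are hypothetical choices for illustration.

```python
# Locators copied from the answer above. The plain strings are the values of
# Selenium's By.ID and By.CLASS_NAME constants.
TAB_LOCATOR = ("id", "tab-label-ingredients_info-title")
CONTENT_LOCATOR = ("class name", "cms-content")


def read_tab_text(driver, timeout=10):
    """Click the ingredients tab and return the panel text once it is visible.

    `timeout` (seconds) is an arbitrary assumption, not a value from the thread.
    """
    # Imported lazily so this helper can be defined without Selenium installed.
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    wait = WebDriverWait(driver, timeout)
    # Wait until the tab label can be clicked, then click it.
    wait.until(EC.element_to_be_clickable(TAB_LOCATOR)).click()
    # Assumes the first "cms-content" element is the panel that becomes visible.
    return wait.until(EC.visibility_of_element_located(CONTENT_LOCATOR)).text
```

With a running session this would be called as read_tab_text(driver). If it still ends in a TimeoutException, that would suggest the locator itself does not match the page, rather than a timing problem.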

You may also use the class name to extract all the image links:

>>> images = driver.find_elements_by_class_name("fotorama__img")
>>> links = [image.get_attribute('src') for image in images]

>>> links
['https://naturalia.fr/media/catalog/product/cache/image/368x414/e9c3970ab036de70892d86c6d221abfe/3/2/3263670138016.1-0001.jpg', 'https://naturalia.fr/media/catalog/product/cache/image/368x414/e9c3970ab036de70892d86c6d221abfe/3/2/3263670138016.8-0001.jpg', 'https://naturalia.fr/media/catalog/product/cache/thumbnail/84x84/beff4985b56e3afdbeabfc89641a4582/3/2/3263670138016.8-0001.jpg', 'https://naturalia.fr/media/catalog/product/cache/thumbnail/84x84/beff4985b56e3afdbeabfc89641a4582/3/2/3263670138016.1-0001.jpg']
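
The list above mixes full-size images and thumbnails. A small sketch of how they could be separated, assuming the "/cache/image/" and "/cache/thumbnail/" path segments seen in this output distinguish the two on other product pages as well (not verified), and that the first full-size link is the main image. The helper name split_image_links is hypothetical.

```python
def split_image_links(links):
    """Separate full-size image URLs from thumbnail URLs, dropping duplicates."""
    full_size, thumbnails = [], []
    for link in links:
        if not link:
            continue  # get_attribute('src') returns None when the attribute is missing
        bucket = thumbnails if "/cache/thumbnail/" in link else full_size
        if link not in bucket:
            bucket.append(link)
    return full_size, thumbnails


# The four links printed in the answer above.
links = [
    "https://naturalia.fr/media/catalog/product/cache/image/368x414/e9c3970ab036de70892d86c6d221abfe/3/2/3263670138016.1-0001.jpg",
    "https://naturalia.fr/media/catalog/product/cache/image/368x414/e9c3970ab036de70892d86c6d221abfe/3/2/3263670138016.8-0001.jpg",
    "https://naturalia.fr/media/catalog/product/cache/thumbnail/84x84/beff4985b56e3afdbeabfc89641a4582/3/2/3263670138016.8-0001.jpg",
    "https://naturalia.fr/media/catalog/product/cache/thumbnail/84x84/beff4985b56e3afdbeabfc89641a4582/3/2/3263670138016.1-0001.jpg",
]

full_size, thumbnails = split_image_links(links)
# Treat the first full-size link as the main image (an assumption).
main_image = full_size[0] if full_size else None
```

Because this relies only on the class name and the URL shape, it avoids the long positional XPath, which is likely to differ from one product page to the next.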

10 Comments

For the image, I'm getting this error: Unable to locate element: //*[@id="maincontent"]/div[2]/div[2]/div[2]/div/div[2]/div[2]/div[1]/div[3]/div[1]/img. Also, is it a good idea to use XPath here? I need to do this for many more pages on this site.
About the text, I tried the same thing but with find_by_id. Why isn't it working? It also doesn't work with XPath...
@DavidRotenberg Please try my second solution, using class.
@DavidRotenberg "I tried the same thing but with find_by_id" - You'd have to post the entire code for me to understand why it's not working, otherwise I won't be able to help.
I tried it, and it's not working... I'm using Firefox, if that matters.