
I am trying to retrieve the last img src of each listing on a web page by scraping it with BeautifulSoup. So far I have tried to use a selector, but it is impossible for me to find anything after the ::before pseudo-element.

The basic code is:

import requests
from bs4 import BeautifulSoup

s = requests.session()
r = s.get("https://www.immobiliare.it/vendita-case/milano/forlanini/?criterio=dataModifica&ordine=desc")

soup = BeautifulSoup(r.content, "lxml")

for property in soup.find_all("li", {"class": "nd-list__item in-realEstateResults__item"}):
    pass  # the img src of each property should be extracted here

The HTML code of the page has the following structure:

Each li class="nd-list__item in-realEstateResults__item" is a property I want to extract the img src from.

Bear in mind that the first image has simpler HTML and I can read its src; I cannot get the src from the rest of them.

  • The HTML code is not valid. The class attribute is not closed on line 6, nor in any of the subsequent div elements. Even the div elements are not closed correctly. Commented Jan 5, 2023 at 12:23
  • If you know it's the last image, why not use soup.find_all('img')[-1]['src']? Commented Jan 5, 2023 at 12:26
  • It is example code; the real HTML is much larger and more complex, but this part has exactly this structure. soup.find_all('img') does not find that image. Commented Jan 5, 2023 at 12:34
  • You lost the closing quotes in class nd-slideshow__item. Commented Jan 5, 2023 at 12:42
  • The <img> does not appear in the soup, which is why I think I need to select something more. Commented Jan 5, 2023 at 14:39

3 Answers

EDIT

I have to correct my initial statement:

Using the rather sluggish selenium is not strictly necessary; this can also be done with requests and BeautifulSoup.

On closer inspection, it turned out that all the information can be found in a <script>. Its content can be extracted and used as JSON, and the urls of the maps have to be assembled based on the location information.

Example

import requests, json, time
from bs4 import BeautifulSoup

data = []

url = 'https://www.immobiliare.it/vendita-case/milano/forlanini/?criterio=dataModifica&ordine=desc'

while True:

    # the listing data is embedded as JSON in the <script id="__NEXT_DATA__"> element
    jsonData = json.loads(
        BeautifulSoup(
            requests.get(url).text, 'html.parser'
        ).select_one('#__NEXT_DATA__').text
    )['props']['pageProps']['dehydratedState']['queries'][0]['state']['data']['pages'][0]

    for e in jsonData['results']:
        l = e['realEstate']['properties'][0]['location']
        data.append({
            'id':e['realEstate']['id'],
            'map':f"https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C{l['latitude']}%2C{l['longitude']}&center={l['latitude']}%2C{l['longitude']}"
        })

    print(f"scraping page: {jsonData['currentPage']}")

    if jsonData['maxPages'] != jsonData['currentPage']:
        url = f"https://www.immobiliare.it/vendita-case/milano/forlanini/?criterio=dataModifica&ordine=desc&pag={jsonData['currentPage']+1}"
    else:
        break

    time.sleep(1)

data
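
The long f-string that assembles the map URL can also be built from its decoded parts. The following is only a sketch under that assumption: `build_map_url`, `MAP_BASE` and `MARKER_ICON` are hypothetical names introduced here, and the parameter values are simply read off the URL used above.

```python
from urllib.parse import urlencode

MAP_BASE = "https://maps.im-cdn.it/static"
MARKER_ICON = "https://s1.immobiliare.it/_next/static/media/map-marker.27fc2b6f.png"


def build_map_url(latitude, longitude):
    """Assemble the static map URL for one listing (hypothetical helper)."""
    position = f"{latitude},{longitude}"
    params = {
        "zoom": 15,
        "size": "360x270",
        "language": "it",
        "style": "feature:road|element:labels|visibility:off",
        "sensor": "false",
        "markers": f"icon:{MARKER_ICON}|{position}",
        "center": position,
    }
    # urlencode percent-encodes ':', '|', '/' and ',' inside the values
    return f"{MAP_BASE}?{urlencode(params)}"


print(build_map_url(45.4565, 9.2427))
```

For the coordinates 45.4565 and 9.2427 this produces the same map URL that appears in the outputs on this page.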

Older Answer

As mentioned in the comments, the content is rendered dynamically, so the combination you are using will not give the expected result: requests does not render JavaScript the way a browser does, and BeautifulSoup won't find the expected elements because they are not in the downloaded HTML.

Just to clarify ::before is a pseudo-element:

In CSS, ::before creates a pseudo-element that is the first child of the selected element. It is often used to add cosmetic content to an element with the content property. It is inline by default.
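
To illustrate the point offline, here is a small sketch with made-up markup (the class names and file names are placeholders, not taken from the real page). One image is part of the HTML, the other would only be created by a script in a browser, so BeautifulSoup sees just the first one.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup, similar in spirit to what requests downloads:
# the first <img> is in the HTML, the second one exists only after the
# script has run in a browser.
static_html = """
<ul>
  <li class="item"><div class="figure"><img src="first.jpg"></div></li>
  <li class="item"><div class="figure" id="slot"></div></li>
</ul>
<script>
  document.getElementById("slot").innerHTML = '<img src="second.jpg">';
</script>
"""

soup = BeautifulSoup(static_html, "html.parser")

# The parser never executes the script, so only the static image is found.
sources = [img["src"] for img in soup.find_all("img")]
print(sources)
```

The same holds for ::before: it is generated from the stylesheet at render time, so there is no node for it in the parsed tree, and nothing "after" it to select.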


You could stay with requests if you use an API; some of the information comes from:

s.get('https://www.immobiliare.it/api-next/agencies/local-expert/?city-id=8042&province-id=MI&macrozone-id[0]=10294&limit=25&output=json').json()

->  {'agencies': [{'id': 9681, 'displayName': 'Fonte Immobiliare Città Studi 2', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/934856533.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/934856531.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/9681/fonte-citta-studi--milano/', 'address': 'Via Giovanni Briosi 10 20133 - Milano', 'bannerImage': 'https://pic.im-cdn.it/image/934857363/xs-c.jpg', 'externalId': None, 'timeContract': 11, 'paid': True}, {'id': 83565, 'displayName': 'Cfc Immobiliare ', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/244109821.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/244109817.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/83565/cfc-milano/', 'address': 'Via Carnia 7 20132 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.492900,9.236340&zoom=15&size=400x230&markers=45.492900,9.236340', 'externalId': None, 'timeContract': 10, 'paid': True}, {'id': 208668, 'displayName': 'YOUR HOME - Real Estate', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1127494478.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1127494476.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/208668/your-home-milano/', 'address': 'Bastioni Porta Nuova 21 20121 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.480100,9.188150&zoom=15&size=400x230&markers=45.480100,9.188150', 'externalId': None, 'timeContract': 7, 'paid': True}, {'id': 231505, 'displayName': 'Homepanda', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/693409659.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/693409657.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/231505/homepanda/', 'address': 'Via Gian Giacomo Mora 20 20123 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.458900,9.179330&zoom=15&size=400x230&markers=45.458900,9.179330', 'externalId': None, 'timeContract': 4, 'paid': True}, {'id': 118081, 
'displayName': 'CONSULOVEST  CORBETTA Via Meroni 2 - MILANO V.le San Gimignano 8', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1162882814.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1162882812.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/118081/consulovest-corbetta/', 'address': 'Via Meroni 2 20011 - Corbetta', 'bannerImage': 'https://pic.im-cdn.it/image/1162882818/xs-c.jpg', 'externalId': None, 'timeContract': None, 'paid': False}, {'id': 5272, 'displayName': 'Arena Immobiliare S.R.L.', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/936162495.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/936162493.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/5272/arena-milano/', 'address': 'Via Marco Bruto 9 20138 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.459800,9.238870&zoom=15&size=400x230&markers=45.459800,9.238870', 'externalId': None, 'timeContract': 21, 'paid': False}, {'id': 32741, 'displayName': 'Studio emme3 ', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/196647202.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/196647201.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/32741/studio-emme-milano/', 'address': 'Via Pompeo Neri 2 20146 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.456800,9.143770&zoom=15&size=400x230&markers=45.456800,9.143770', 'externalId': None, 'timeContract': 4, 'paid': False}, {'id': 242120, 'displayName': 'Levia SRL', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/843934046.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/843934044.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/242120/levia-milano/', 'address': 'Viale Ungheria 20 20138 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.445700,9.246040&zoom=15&size=400x230&markers=45.445700,9.246040', 'externalId': None, 'timeContract': 3, 'paid': False}, {'id': 396994, 
'displayName': 'Affiliato Tecnorete: STUDIO IMMOBILIARE CORSICA SRL', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1247888668.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1247888664.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/396994/tecnorete-milano-viale-ungheria/', 'address': 'Viale Ungheria 24 20135 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.445500,9.246760&zoom=15&size=400x230&markers=45.445500,9.246760', 'externalId': None, 'timeContract': 0, 'paid': False}, {'id': 140950, 'displayName': 'Abitare Agency Srl', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1135165888.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1135165886.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/140950/abitare-agency/', 'address': 'Via Voghera 7 20144 - Milano', 'bannerImage': 'https://pic.im-cdn.it/image/1135165932/xs-c.jpg', 'externalId': None, 'timeContract': 10, 'paid': False}, {'id': 94305, 'displayName': 'Affiliato Tecnocasa: IMMOBILIARE MARGOT SRLU', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1135591154.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1135591152.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/94305/tecnocasa-milano-via-mecenate/', 'address': 'Via Mecenate 4 20138 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.457400,9.242440&zoom=15&size=400x230&markers=45.457400,9.242440', 'externalId': None, 'timeContract': 8, 'paid': False}, {'id': 241224, 'displayName': 'INVIMIT SGR SpA', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/829818360.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/829818358.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/241224/invimit-roma/', 'address': 'Via di Santa Maria in Via 12 00187 - Roma', 'bannerImage': 'https://pic.im-cdn.it/image/825106468/xs-c.jpg', 'externalId': None, 'timeContract': 3, 'paid': False}, {'id': 209778, 'displayName': 
'STUDIO6ERRE - Sede Milano', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/937464013.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/937464011.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/209778/studioerre-milano/', 'address': 'Viale Abruzzi 80 20131 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.483700,9.217150&zoom=15&size=400x230&markers=45.483700,9.217150', 'externalId': None, 'timeContract': 7, 'paid': False}, {'id': 166328, 'displayName': 'HB ADVISORY', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/311765482.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/311765478.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/166328/hb-advisory/', 'address': 'Corso Buenos Aires 60 20124 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.482200,9.212750&zoom=15&size=400x230&markers=45.482200,9.212750', 'externalId': None, 'timeContract': 9, 'paid': False}, {'id': 41477, 'displayName': 'StudioZimer', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1143272706.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1143272704.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/41477/studiozimer/', 'address': 'CORSO LODI 111 20135 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.441500,9.221210&zoom=15&size=400x230&markers=45.441500,9.221210', 'externalId': None, 'timeContract': 4, 'paid': False}, {'id': 386016, 'displayName': 'STUDIO ASTE MC', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1250133574.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1250133572.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/386016/studio-aste-mc-sesto-san-giovanni/', 'address': 'Via Carlo Cattaneo 49 20099 - Sesto San Giovanni', 'bannerImage': 'https://pic.im-cdn.it/image/1147591370/xs-c.jpg', 'externalId': None, 'timeContract': None, 'paid': False}, {'id': 42941, 'displayName': 'OBIETTIVOCASA', 
'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/155958486.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/155958482.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/42941/obiettivocasa-milano-via-pordenone/', 'address': 'via pordenone 13 20132 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.490300,9.234840&zoom=15&size=400x230&markers=45.490300,9.234840', 'externalId': None, 'timeContract': 10, 'paid': False}, {'id': 392582, 'displayName': 'AsteGlobal', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1227120896.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1227120894.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/392582/asteblobal/', 'address': 'via Reali 13 20037 - Paderno Dugnano', 'bannerImage': 'https://pic.im-cdn.it/image/1227121058/xs-c.jpg', 'externalId': None, 'timeContract': None, 'paid': False}, {'id': 203747, 'displayName': 'Le case di Patty', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/811914140.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/811914138.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/203747/le-case-di-patty-milano/', 'address': 'Via Montebello 14 20121 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.475200,9.189070&zoom=15&size=400x230&markers=45.475200,9.189070', 'externalId': None, 'timeContract': 7, 'paid': False}, {'id': 35498, 'displayName': "Expo'  Servizi  Immobiliari", 'imageUrls': [], 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/35498/expo/', 'address': 'Viale Premuda 21 20129 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.466000,9.207020&zoom=15&size=400x230&markers=45.466000,9.207020', 'externalId': None, 'timeContract': 5, 'paid': False}, {'id': 228450, 'displayName': 'Aste Milano Immobiliare', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1061930169.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1061930167.jpg'}, 
'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/228450/aste-rozzano/', 'address': 'Via Innocenzo Isimbardi 29 20141 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.435200,9.180890&zoom=15&size=400x230&markers=45.435200,9.180890', 'externalId': None, 'timeContract': 5, 'paid': False}, {'id': 5350, 'displayName': 'IMI immobiliare Milano - Partner Navigli', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/424092179.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/424092177.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/5350/imi-milano-navigli/', 'address': 'Via Conchetta 2 20136 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.446800,9.179270&zoom=15&size=400x230&markers=45.446800,9.179270', 'externalId': None, 'timeContract': 12, 'paid': False}, {'id': 237934, 'displayName': 'ASTA4YOU', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1112140922.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1112140920.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/237934/astayou/', 'address': 'Via Domenico Cimarosa 26 20144 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.464200,9.157880&zoom=15&size=400x230&markers=45.464200,9.157880', 'externalId': None, 'timeContract': 3, 'paid': False}, {'id': 28201, 'displayName': 'Meta Immobiliare - Massimo Valore Certificato', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/962490064.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/962490062.jpg'}, 'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/28201/meta-san-donato/', 'address': 'Via Alfonsine 34 20097 - San Donato Milanese', 'bannerImage': 'https://pic.im-cdn.it/image/854995656/xs-c.jpg', 'externalId': None, 'timeContract': 9, 'paid': False}, {'id': 3986, 'displayName': 'TREC s.a.s', 'imageUrls': {'large': 'https://pic.im-cdn.it/imagenoresize/1040856624.jpg', 'small': 'https://pic.im-cdn.it/imagenoresize/1040856622.jpg'}, 
'agencyUrl': 'https://www.immobiliare.it/agenzie-immobiliari/3986/tre-c/', 'address': 'Via Negroli 49 20133 - Milano', 'bannerImage': 'https://maps.im-cdn.it/static?center=45.467200,9.232320&zoom=15&size=400x230&markers=45.467200,9.232320', 'externalId': None, 'timeContract': 1, 'paid': False}], 'searchAgencyUrl': 'http://www.immobiliare.it/agenzie-immobiliari/milano/?idMZona[]=10294'}

Or use selenium to mimic a browser and work on the rendered driver.page_source.

Example

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

url = 'https://www.immobiliare.it/vendita-case/milano/forlanini/?criterio=dataModifica&ordine=desc'
driver.get(url)

soup = BeautifulSoup(driver.page_source, 'html.parser')

data = []
for e in soup.select('li.in-realEstateResults__item'):
    data.append({
        'title':e.a.get('title'),
        'imgUrls':[i.get('src') for i in e.select('.nd-list__item img')],
        'imgMapInfo': e.select_one('[alt="mappa"]').get('src') if e.select_one('[alt="mappa"]') else None
    })

data

Output

[{'title': 'Bilocale buono stato, primo piano, Viale Ungheria - Mecenate, Milano', 'imgUrls': ['https://pwm.im-cdn.it/image/1261450576/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261450580/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261702222/xxs-c.jpg', 'https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C45.4565%2C9.2427&center=45.4565%2C9.2427', 'https://pic.im-cdn.it/imagenoresize/875151762.jpg'], 'imgMapInfo': 'https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C45.4565%2C9.2427&center=45.4565%2C9.2427'}, {'title': 'Bilocale via Romualdo Bonfadini 82, Viale Ungheria - Mecenate, Milano', 'imgUrls': ['https://pwm.im-cdn.it/image/1261689706/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689762/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689770/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689736/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689806/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689780/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689794/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689744/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689718/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689728/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689628/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689636/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689752/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689674/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689694/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689680/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689690/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689670/xxs-c.jpg', 'https://pwm.im-cdn.it/image/1261689652/xxs-c.jpg', 
'https://pwm.im-cdn.it/image/1261689816/xxs-c.jpg', 'https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C45.4442%2C9.2417&center=45.4442%2C9.2417', 'https://pic.im-cdn.it/imagenoresize/949757836.jpg'], 'imgMapInfo': 'https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C45.4442%2C9.2417&center=45.4442%2C9.2417'}, {'title': 'Appartamento via Oreste Salomone, Viale Ungheria - Mecenate, Milano', 'imgUrls': ['https://pwm.im-cdn.it/image/1256189648/xxs-c.jpg', 'https://pic.im-cdn.it/imagenoresize/994952108.jpg'], 'imgMapInfo': None},...]
1 Comment

Answered your question with this specific focus here: stackoverflow.com/a/75059884/14460824
0
from bs4 import BeautifulSoup

html = """
<div class="in-mediaContent">
   <div class="nd-figure in-photo in-Card__photo--big">
       <div class="nd-figure__image nd-ratio">
           ::before
          <div class="nd-slideshow nd-slideshow--small">
              <div class="nd-slideshow__content>
              </div>
                  <div class="nd-slideshow__item
                  </div>
                  <div class="nd-slideshow__item
                  </div>
                  <div class="nd-slideshow__desired_item
                      <img src =”desired link”>
                 </div>
               </div>
            </div>
       </div>
   </div>"""

soup = BeautifulSoup(html, 'html.parser')

r = soup.select('div[class*="nd-slideshow"]')
print(r)

The result contains the HTML that follows ::before:

[<div class="nd-slideshow nd-slideshow--small">
<div <="" class="nd-slideshow__content&gt; &lt;/div&gt; &lt;div class=" div="" nd-slideshow__item="">
<div <img="" class="nd-slideshow__item &lt;/div&gt; &lt;div class=" link”="" nd-slideshow__desired_item="" src="”desired">
</div>
</div>
</div>, <div <="" class="nd-slideshow__content&gt; &lt;/div&gt; &lt;div class=" div="" nd-slideshow__item="">
<div <img="" class="nd-slideshow__item &lt;/div&gt; &lt;div class=" link”="" nd-slideshow__desired_item="" src="”desired">
</div>
</div>, <div <img="" class="nd-slideshow__item &lt;/div&gt; &lt;div class=" link”="" nd-slideshow__desired_item="" src="”desired">
</div>]
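
For comparison, a sketch with the same structure but with every class attribute closed and straight quotes around src. This is an assumption about the intended markup, and the extra image in the first item is a placeholder added for illustration. With well-formed HTML the last image of the slideshow can be read directly.

```python
from bs4 import BeautifulSoup

# Well-formed variant of the snippet (assumed markup, not the real page).
html = """
<div class="nd-figure__image nd-ratio">
  <div class="nd-slideshow nd-slideshow--small">
    <div class="nd-slideshow__content"></div>
    <div class="nd-slideshow__item"><img src="other link"></div>
    <div class="nd-slideshow__item"></div>
    <div class="nd-slideshow__desired_item">
      <img src="desired link">
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# All images inside the slideshow, in document order; take the last one.
images = soup.select("div.nd-slideshow img")
last_src = images[-1]["src"] if images else None
print(last_src)
```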

5 Comments

It does not work for me; I added the real URL and HTML code to the thread.
What URL? We need more code to reproduce the problem. Is it a dynamic page? Maybe your code doesn't wait until the element appears?
As you can see, the problem is not in the HTML code, but the lost quotes bother me.
Specifically, I can retrieve the img src from the first property, but the rest of them have a different HTML structure.

Based on your screenshot, I searched the div element that has the "nd-slideshow__item in-realEstateListCard__mapInfo" class and then I could get the image inside the "div" element.

With this idea, I've modified your code as follows:

import requests
from bs4 import BeautifulSoup
  
url = "https://www.immobiliare.it/vendita-case/milano/forlanini/?criterio=dataModifica&ordine=desc"
page = requests.get(url)
soup = BeautifulSoup(page.content, "lxml")

# The image you want is inside an img element which is contained inside a "div" element:
div_element = soup.find_all("div", class_="nd-slideshow__item in-realEstateListCard__mapInfo")

# Print the "src" value of the img HTML element found on the div
print(div_element[0].find("img")["src"])

And this is the result I got:

https://maps.im-cdn.it/static?zoom=15&size=360x270&language=it&style=feature%3Aroad%7Celement%3Alabels%7Cvisibility%3Aoff&sensor=false&markers=icon%3Ahttps%3A%2F%2Fs1.immobiliare.it%2F_next%2Fstatic%2Fmedia%2Fmap-marker.27fc2b6f.png%7C45.4565%2C9.2427&center=45.4565%2C9.2427
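
To collect one value per listing instead of only `div_element[0]`, the lookup can be done inside each li. The sketch below runs on simplified, hypothetical markup (class names are taken from this page, file names are placeholders). On the live site it can only find what is actually present in the HTML that requests downloads.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup with two listings.
html = """
<ul>
  <li class="nd-list__item in-realEstateResults__item">
    <div class="nd-slideshow__item in-realEstateListCard__mapInfo"><img src="map-1.png"></div>
  </li>
  <li class="nd-list__item in-realEstateResults__item">
    <div class="nd-slideshow__item"><img src="photo-2.jpg"></div>
  </li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

map_sources = []
for item in soup.select("li.in-realEstateResults__item"):
    img = item.select_one("div.in-realEstateListCard__mapInfo img")
    # Listings without a map image in the markup yield None instead of an error.
    map_sources.append(img["src"] if img else None)

print(map_sources)
```

Matching on a single class with a CSS selector is less brittle than passing the full class string to `class_`, which only matches when the attribute value is exactly that string.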

1 Comment

Thank you very much, but as I said this would only pick the src for the first property and not for the rest.
