Remove an element in a container using selenium

Question

I only want to scrape the required information contained in the black box, and delete/remove/exclude the information contained in the red box

I am doing this because the class names "entry" and "partial_entry" exist in both boxes. Only the first "partial_entry" contains the information that I need, so I plan to delete/remove/exclude the element with class name "mgrRspnInLine".

My code is:

while True:
    container = driver.find_elements_by_xpath('.//*[contains(@class,"review-container")]')
    for item in container:
        try:
            element = item.find_element_by_class_name('mgrRspnInline')
            driver.execute_script("""var element = document.getElementsByClassName("mgrRspnInline")[0];element.parentNode.removeChild(element);""", element)
            WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"taLnk ulBlueLinks")]')))
            element = WebDriverWait(driver, 50).until(EC.element_to_be_clickable((By.XPATH,'.//*[contains(@class,"taLnk ulBlueLinks")]')))
            element.click()
            time.sleep(2)
            rating = item.find_elements_by_xpath('.//*[contains(@class,"ui_bubble_rating bubble_")]')
            for rate in rating:
                rate = rate.get_attribute("class")
                rate = str(rate)
                rate = rate[-2:]
                score_list.append(rate)
            time.sleep(2)
            stay = item.find_elements_by_xpath('.//*[contains(@class,"recommend-titleInline noRatings")]')
            for stayed in stay:
                stayed = stayed.text
                stayed = stayed.split(', ')
                stayed.append(stayed[0])
                travel_type.append(stayed[1])
            WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"noQuotes")]')))
            summary = item.find_elements_by_xpath('.//*[contains(@class,"noQuotes")]')
            for comment in summary:
                comment = comment.text
                comments.append(comment)
            WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"ratingDate")]')))
            rating_date = item.find_elements_by_xpath('.//*[contains(@class,"ratingDate")]')
            for date in rating_date:
                date = date.get_attribute("title")
                date = str(date)
                review_date.append(date)
            WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"partial_entry")]')))
            review = item.find_elements_by_xpath('.//*[contains(@class,"partial_entry")]')
            for comment in review:
                comment = comment.text
                print(comment)
                reviews.append(comment)
        except (NoSuchElementException) as e:
            continue
    try:
        element = WebDriverWait(driver, 100).until(EC.element_to_be_clickable((By.XPATH,'.//*[contains(@class,"nav next taLnk ui_button primary")]')))
        element.click()
        time.sleep(2)
    except (ElementClickInterceptedException,NoSuchElementException) as e:
        print(e)
        break

Basically, within each "review-container" I first searched for the class name "mgrRspnInLine", then tried to delete it using execute_script.

Unfortunately, the output still shows the contents of the "mgrRspnInLine" element.

Your code for removing the element should work. There might be several elements with class name mgrRspnInLine (hidden?), so you are probably removing the wrong element. You can simplify your code to driver.execute_script("""arguments[0].parentNode.removeChild(arguments[0]);""", element)
– Andersson, Commented Nov 19, 2018 at 11:45

Andersson · Accepted Answer · 2018-11-19 11:32:40Z

2

If you want to avoid matching the second element with your XPath, you can modify the XPath as below:

.//*[contains(@class,"partial_entry") and not(ancestor::*[@class="mgrRspnInLine"])]

This matches an element with class name "partial_entry" only if it has no ancestor with class name "mgrRspnInLine".
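
As a rough sketch, the same expression can be built from its two class names so it is easy to reuse for other lookups in the loop. The helper name xpath_excluding_ancestor is hypothetical and not part of Selenium; it only produces the string.

```python
def xpath_excluding_ancestor(target_class, excluded_class):
    """Build a relative XPath for descendants whose class contains
    `target_class` and that have no ancestor whose class attribute is
    exactly `excluded_class` (hypothetical helper, string building only)."""
    return (
        './/*[contains(@class,"{0}") and '
        'not(ancestor::*[@class="{1}"])]'
    ).format(target_class, excluded_class)


# Reproduces the expression from the answer above.
REVIEW_TEXT_XPATH = xpath_excluding_ancestor("partial_entry", "mgrRspnInLine")
print(REVIEW_TEXT_XPATH)
```

In the question's loop this string would presumably replace the plain "partial_entry" XPath, for example item.find_elements_by_xpath(REVIEW_TEXT_XPATH). The @class="..." test is an exact, case-sensitive comparison, so the spelling has to match the page's markup.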


1 Comment

SIM Over a year ago

Awesome expression @sir Andersson. Always something new to learn.

QHarr · Accepted Answer · 2018-11-19 12:22:10Z

0

If you want only the first occurrence, you could use a CSS class selector instead:

.partial_entry

and retrieve it with find_element_by_css_selector, which returns the first matching element:

find_element_by_css_selector(".partial_entry")
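
The "first match only" behaviour can be illustrated without a browser. The sketch below is an analogy using Python's standard xml.etree.ElementTree, not Selenium itself, and the markup is a hypothetical, simplified stand-in for one review container in which the guest review comes before the manager response.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for one review container.
SAMPLE = """
<div class="review-container">
  <p class="partial_entry">Guest review text</p>
  <div class="mgrRspnInLine">
    <p class="partial_entry">Manager response text</p>
  </div>
</div>
"""

root = ET.fromstring(SAMPLE)

# find() returns only the first match in document order ...
first = root.find(".//*[@class='partial_entry']")
# ... while findall() returns every match, including the nested one.
every = root.findall(".//*[@class='partial_entry']")

print(first.text)
print([el.text for el in every])
```

This approach only helps if the wanted text really is the first "partial_entry" inside the container being searched.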


pguardiario · Accepted Answer · 2018-11-19 22:49:28Z

0

You can delete all the .mgrRspnInLine elements with:

driver.execute_script("[...document.querySelectorAll('.mgrRspnInLine')].map(el => el.parentNode.removeChild(el))")
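
A variation, sketched under the assumption that you want to limit the removal to one container rather than the whole document: pass the container element into the script as arguments[0], as suggested in the comment on the question. The names REMOVE_WITHIN_JS and remove_within are hypothetical; execute_script forwards extra positional arguments to the script.

```python
# JavaScript run in the browser: arguments[0] is the container element,
# arguments[1] is a CSS selector. Returns how many nodes were removed.
REMOVE_WITHIN_JS = """
var nodes = arguments[0].querySelectorAll(arguments[1]);
nodes.forEach(function (el) { el.parentNode.removeChild(el); });
return nodes.length;
"""


def remove_within(driver, container, css_selector):
    """Remove every descendant of `container` matching `css_selector`.

    `driver` is expected to be a Selenium WebDriver and `container` a
    WebElement found through it (assumptions of this sketch).
    """
    return driver.execute_script(REMOVE_WITHIN_JS, container, css_selector)
```

Inside the loop this might be called as remove_within(driver, item, ".mgrRspnInLine") before reading the review text from item.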


Gerard · Accepted Answer · 2018-11-20 05:10:54Z

By combining Andersson's comment with the answers from QHarr and pguardiario, I finally solved the problem.

The key is to target a container within the container: all of the information sits in the element with class name "ui_column is-9", which is itself inside the element with class name "review-container". This addresses Andersson's point about multiple mgrRspnInLine elements.

Within the nested loop, I used pguardiario's suggestion to delete all existing mgrRspnInLine elements, then applied QHarr's answer on .partial_entry:

while True:
    container = driver.find_elements_by_xpath('.//*[contains(@class,"review-container")]')
    for items in container:
        element = WebDriverWait(driver, 1000).until(EC.element_to_be_clickable((By.XPATH,'.//*[contains(@class,"taLnk ulBlueLinks")]')))
        element.click()
        time.sleep(10)
        contained = items.find_elements_by_xpath('.//*[contains(@class,"ui_column is-9")]')
        for item in contained:
            try:
                driver.execute_script("[...document.querySelectorAll('.mgrRspnInLine')].map(el => el.parentNode.removeChild(el))")
                rating = item.find_element_by_xpath('.//*[contains(@class,"ui_bubble_rating bubble_")]')
                rate = rating.get_attribute("class")
                rate = str(rate)
                rate = rate[-2:]
                score_list.append(rate)
                time.sleep(2)
                stay = item.find_element_by_xpath('.//*[contains(@class,"recommend-titleInline")]')
                stayed = stay.text
                stayed = stayed.split(', ')
                stayed.append(stayed[0])
                travel_type.append(stayed[1])
                WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"noQuotes")]')))
                summary = item.find_element_by_xpath('.//*[contains(@class,"noQuotes")]')
                comment = summary.text
                comments.append(comment)
                WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"ratingDate")]')))
                rating_date = item.find_element_by_xpath('.//*[contains(@class,"ratingDate")]')
                date = rating_date.get_attribute("title")
                date = str(date)
                review_date.append(date)
                WebDriverWait(driver, 50).until(EC.presence_of_element_located((By.XPATH,'.//*[contains(@class,"partial_entry")]')))
                review = item.find_element_by_css_selector(".partial_entry")
                comment = review.text
                print(comment)
            except (NoSuchElementException) as e:
                continue
    try:
        element = WebDriverWait(driver, 100).until(EC.element_to_be_clickable((By.XPATH,'.//*[contains(@class,"nav next taLnk ui_button primary")]')))
        element.click()
        time.sleep(2)
    except (ElementClickInterceptedException,NoSuchElementException) as e:
        print(e)
        break
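
The rate[-2:] slice above assumes the class attribute always ends with a two-digit score. A slightly more defensive sketch is shown below; the helper name parse_bubble_rating and the sample class string "ui_bubble_rating bubble_50" are assumptions for illustration, not taken from the page.

```python
import re


def parse_bubble_rating(class_attr):
    """Return the digits after 'bubble_' in a class string, or None."""
    match = re.search(r"bubble_(\d+)", class_attr or "")
    return match.group(1) if match else None


print(parse_bubble_rating("ui_bubble_rating bubble_50"))
print(parse_bubble_rating("ui_bubble_rating"))
```

Returning None when no score is present lets the caller decide whether to skip the review instead of appending an unrelated two-character suffix.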
