
This code, given a list of cities, searches Google for each one, extracts the data, and converts it into a DataFrame.

In some cases I have to use different XPaths to extract the data; there are three XPaths in total.

I am trying to do this:

if XPath 1 doesn't work, go to XPath 2;
if XPath 2 doesn't work, go to XPath 3;
if XPath 3 doesn't work, call driver.quit().

I tried the code below, using NoSuchElementException:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from selenium.common.exceptions import NoSuchElementException
from bs4 import BeautifulSoup
import pandas as pd

df_output = pd.DataFrame(columns=["City", "pincode"])
url = "https://www.google.com/"
chromedriver = ('/home/me/chromedriver/chromedriver.exe')
driver = webdriver.Chrome(chromedriver)
driver.implicitly_wait(30)
driver.get(url)
search = driver.find_element_by_name('q')

mlist1=['polasa']
for i in mlist1:

    try:
        search.send_keys(i,' pincode')
        search.send_keys(Keys.RETURN)
        WebDriverWait(driver, 10).until(expected_conditions.visibility_of_element_located((By.XPATH, '//div[@class="IAznY"]//div[@class="title"]')))
        elmts = driver.find_elements_by_xpath('//div[@class="IAznY"]//div[@class="title"]')
        df_output = df_output.append(pd.DataFrame(columns=["City", "pincode"], data=[[i,elmts[0].text]])) 
        driver.quit()

    except NoSuchElementException:

        try:
            elements=driver.find_element_by_xpath("//div[@class='Z0LcW']")
            df_output = df_output.append(pd.DataFrame(columns=["City", "pincode"], data=[[i,elements.text]]))
            driver.quit()

        except NoSuchElementException:

            try:
                elements=driver.find_element_by_xpath("//div[@class='Z0LcW AZCkJd']")
                df_output = df_output.append(pd.DataFrame(columns=["City", "pincode"], data=[[i,elements.text]]))
                driver.quit()
            except:
                driver.quit()





The code below works; it uses just one of the three tags.

I need to combine the three tags in a single piece of code.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions
from bs4 import BeautifulSoup
import re
import os
import html5lib
import json
import time
import pandas as pd


url = "https://www.google.com/"
chromedriver = ('/home/me/chromedriver/chromedriver.exe')
driver = webdriver.Chrome(chromedriver)
driver.implicitly_wait(30)
driver.get(url)
search = driver.find_element_by_name('q')
search.send_keys('polasa',' pincode')
search.send_keys(Keys.RETURN)
elements=driver.find_element_by_xpath("//div[@class='Z0LcW']")
elements.text






1 Answer

You don't really need three try/except blocks. You can do this without relying on exceptions at all by locating elements (plural) for a given locator and then checking the length of the collection returned: if the length is 0, no elements were found.
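
A minimal sketch of that pattern, assuming the driver and imports from your snippet above and still using your three original XPaths (variable names here are just illustrative):

# Illustrative only: try each locator in turn; find_elements returns an empty
# list instead of raising, so "no match" is just a length check.
locators = [
    (By.XPATH, '//div[@class="IAznY"]//div[@class="title"]'),
    (By.XPATH, "//div[@class='Z0LcW']"),
    (By.XPATH, "//div[@class='Z0LcW AZCkJd']"),
]

pincode = None
for locator in locators:
    found = driver.find_elements(*locator)   # never throws NoSuchElementException
    if len(found) > 0:                        # this locator matched something
        pincode = found[0].text
        break

if pincode is None:                           # none of the three matched
    driver.quit()

One thing to keep in mind: find_elements honours implicitly_wait, so with implicitly_wait(30) each locator that matches nothing will block for the full 30 seconds before returning an empty list.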

The locators you are using don't require XPath, so you can use a CSS selector instead and combine all three with an OR, avoiding the three separate checks. (Note: you can do the same thing with XPath, but the result is messier and harder to read.)

Here are your three locators combined into one using OR (the comma) in CSS selector syntax:

div.IAznY div.title, div.Z0LcW, div.Z0LcW.AZCkJd
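
For comparison, the equivalent XPath union (using the | operator) would look something like the line below, which is why the CSS version is easier to read:

//div[@class="IAznY"]//div[@class="title"] | //div[@class="Z0LcW"] | //div[@class="Z0LcW AZCkJd"]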

...and here is the updated code using the combined locator, without the nested try/except.

...
locator = (By.CSS_SELECTOR, 'div.IAznY div.title, div.Z0LcW, div.Z0LcW.AZCkJd')

for i in mlist1:
    search.send_keys(i,' pincode')
    search.send_keys(Keys.RETURN)
    WebDriverWait(driver, 10).until(expected_conditions.visibility_of_element_located(locator))
    elements = driver.find_elements(*locator)
    df_output = df_output.append(pd.DataFrame(columns=["City", "pincode"], data=[[i,elements[0].text]]))

driver.quit()

NOTE: I tried your original locators and none of the three returned any results for me. Are you sure they are correct?

Also note that I pulled the driver.quit() out of the loop. I'm not sure whether you intended it to be inside, but in the code you provided, if the try succeeds on the first iteration the browser quits. You only have one item in the list, so you probably haven't noticed this yet, but it would have been confusing once you added more items to iterate over.


7 Comments

TypeError Traceback (most recent call last)
<ipython-input-13-d443e7126369> in <module>()
     14 search.send_keys(i,' pincode')
     15 search.send_keys(Keys.RETURN)
---> 16 WebDriverWait(driver, 10).until(expected_conditions.visibility_of_element_located(*locator))
     17 elements = driver.find_elements_by_css_selector(*locator)
     18 df_output = df_output.append(pd.DataFrame(columns=["City", "pincode"], data=[[i,elements.text]]))
TypeError: __init__() takes exactly 2 arguments (3 given)
With your code the locators are fetching no data. I used driver.quit() for the loop only; I need the code to run in a loop for different use cases.
This code works fine. Check it; it runs and fetches the data:
url = "https://www.google.com/" chromedriver = ('/home/me/chromedriver/chromedriver.exe') driver = webdriver.Chrome(chromedriver) driver.implicitly_wait(30) driver.get(url) search = driver.find_element_by_name('q') search.send_keys('polasa',' pincode') search.send_keys(Keys.RETURN) elements=driver.find_element_by_xpath("//div[@class='Z0LcW']")
There was one thing I meant to ask and forgot... does data=[[i,elements.text]] work? elements is a collection. In your first try it uses elmts[0].text... should the other two be elements[0].text?
