
I am trying to scrape data from SimilarWeb. The data is shown as a chart, and I want to scrape each month and the value associated with it. Here is the code:

import time

import numpy as np
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

websites = ['https://www.similarweb.com/website/zalando.de/#overview', 'https://www.similarweb.com/website/asos.com/#overview',
            'https://www.similarweb.com/website/aboutyou.de/#overview', 'https://www.similarweb.com/website/boohoo.com/#overview',
            'https://www.similarweb.com/website/deliveryhero.com/#overview', 'https://www.similarweb.com/website/justeattakeaway.com/#overview',
            'https://www.similarweb.com/website/hellofresh.com/#overview', 'https://www.similarweb.com/website/blueapron.com/#overview',
            'https://www.similarweb.com/website/shop.adidas.co.in/#overview', 'https://www.similarweb.com/website/nike.com/#overview',
            'https://www.similarweb.com/website/in.puma.com/#overview', 'https://www.similarweb.com/website/hugoboss.com/#overview']

options = webdriver.ChromeOptions()
options.add_argument('start-maximized')
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)

browser = webdriver.Chrome(ChromeDriverManager().install(), options=options)
delays = [7, 4, 6, 2, 10, 19]
delay = np.random.choice(delays)  # picked once and reused for every page
for crawler in websites:
    browser.get(crawler)
    time.sleep(2)

    time.sleep(delay)
    # summary figures at the top of the overview page
    website_names = browser.find_element(By.XPATH, '/html/body/div[1]/main/div/div/section[1]/div[1]/div/div[1]/a').get_attribute("href")
    total_visits = browser.find_element(By.XPATH, '/html/body/div[1]/main/div/div/div[2]/div[2]/div/div[3]/div/div/div/div[2]/div/span[2]/span[1]').text
    avg_visit_duration = browser.find_element(By.XPATH, '/html/body/div[1]/main/div/div/div[2]/div[2]/div/div[3]/div/div/div/div[3]/div/span[2]/span').text
    pages_per_visit = browser.find_element(By.XPATH, '/html/body/div[1]/main/div/div/div[2]/div[2]/div/div[3]/div/div/div/div[4]/div/span[2]/span').text
    bounce_rate = browser.find_element(By.XPATH, '/html/body/div[1]/main/div/div/div[2]/div[2]/div/div[3]/div/div/div/div[5]/div/span[2]/span').text

    # month labels inside the SVG chart
    months = browser.find_elements(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g'][6]/*/*")
    for date in months:
        print(date.text)

    # hover over the chart, then read the tooltip that appears
    tooltip = browser.find_element(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g'][8]/*[local-name()='text']")
    ActionChains(browser).move_to_element(tooltip).perform()
    month_value = browser.find_element(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g' and @class='highcharts-tooltip']/*[local-name()='text']")
    print(month_value.text)

    # printing all scraped data
    print('Website Names:', website_names)
    print('Total visits:', total_visits)
    print('Average visit duration:', avg_visit_duration)
    print('Pages per visit:', pages_per_visit)
    print('Bounce rate:', bounce_rate)

Despite using what I believe are the correct XPaths, I get an error like this: selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div[1]/main/div/div/div[2]/div[2]/div/div[4]/div[1]/div[2]/div[2]/div[1]/div/svg/g[8]/text/tspan[1]"} (Session info: chrome=90.0.4430.93)

When I open the website normally, the graph shows the months as November 2020, December 2020, January 2021, ..., March 2021. But when I inspect the page, it shows them as 7th Nov, 23rd Nov, 15th Dec, 30th Dec, ...

Is this why I am getting the NoSuchElementException? Please help!

EDIT: I tried the following approach and got empty lists ([]):

element = WebDriverWait(browser, 10).until(EC.presence_of_element_located((By.CLASS_NAME, "highcharts-series")))
test = browser.find_elements_by_xpath("//*[name()='svg']//*[name()='g' and @class='highcharts-series']/*[name()='path']")

res = []
for el in test:
    hover = ActionChains(browser).move_to_element(el)
    hover.perform()
    date = browser.find_elements_by_xpath('//*[@id="highcharts-0"]/svg/g[8]/text/tspan[1]').text
    price = browser.find_elements_by_xpath('//*[@id="highcharts-0"]/svg/g[8]/text/tspan[3]').text
    print('dd', date)
    print('pr', price)
  • Please mention the navigation steps. Commented May 10, 2021 at 5:55
  • @cruisepandey I have updated the code. What exactly do you mean by navigation steps? Commented May 10, 2021 at 6:03
  • Can you describe more clearly which month elements and which monthly visit values you are trying to access? Are you trying to hover over the chart and capture the tooltip values? Commented May 10, 2021 at 6:05

1 Answer

The elements are inside an svg tag, so you have to change your locator and match them by local-name():

//*[local-name() = 'svg']/*[local-name()='g'][6]/*/*

This matches the months together with their values. You can store everything in a list and print it, something like this:

months = driver.find_elements(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g'][6]/*/*")
for date in months:
    print(date.text)
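
The reason a plain path such as svg/g[8]/text finds nothing can be reproduced offline. The sketch below is only an analogy using Python's standard xml.etree.ElementTree rather than Selenium, and the tiny document and its labels are made-up sample data: an unqualified name test does not match elements in the SVG namespace, while a namespace-aware match (the counterpart of local-name() in the XPath above) does.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# A tiny stand-in for a chart; the labels are made-up sample data.
doc = ET.fromstring(
    '<div><svg xmlns="http://www.w3.org/2000/svg">'
    '<g class="axis-labels"><text>Nov 2020</text><text>Dec 2020</text></g>'
    '</svg></div>'
)

# A plain name test looks for <text> in no namespace and finds nothing.
plain = doc.findall(".//text")

# Qualifying the name with the SVG namespace finds the labels.
qualified = doc.findall(f".//{{{SVG_NS}}}text")

# "{*}" (Python 3.8+) ignores the namespace, much like local-name() in XPath.
any_ns = doc.findall(".//{*}text")

print(len(plain), [t.text for t in qualified], [t.text for t in any_ns])
```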

Yes. Once you hover over a particular month, the tooltip can be located with the XPath below:

//*[local-name() = 'svg']/*[local-name()='g' and @class='highcharts-tooltip']/*[local-name()='text']

You can then print the respective value:

month_value = driver.find_element(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g' and @class='highcharts-tooltip']/*[local-name()='text']")
print(month_value.text)

To hover over a particular month, you can use the code below:

tooltip = driver.find_element(By.XPATH, "//*[local-name() = 'svg']/*[local-name()='g'][8]/*[local-name()='text']")
ActionChains(driver).move_to_element(tooltip).perform()
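
To collect a value for every month rather than a single one, the hover and the tooltip read can be repeated in a loop. The following is a rough sketch, not something verified against the live page: collect_tooltips and scrape_chart are hypothetical helper names, and it assumes that hovering over each element matched by the month XPath is what makes the tooltip appear, which may not hold for this chart (the series paths could be the right hover targets instead).

```python
MONTHS_XPATH = "//*[local-name()='svg']/*[local-name()='g'][6]/*/*"
TOOLTIP_XPATH = ("//*[local-name()='svg']/*[local-name()='g' and "
                 "@class='highcharts-tooltip']/*[local-name()='text']")


def collect_tooltips(points, hover, read_tooltip):
    """Hover over each point in turn and record the tooltip text read after it."""
    results = []
    for point in points:
        hover(point)
        results.append(read_tooltip())
    return results


def scrape_chart(driver):
    """Wire collect_tooltips to a live Selenium driver (hypothetical helper)."""
    # Imported lazily so collect_tooltips can be used without Selenium installed.
    from selenium.webdriver import ActionChains
    from selenium.webdriver.common.by import By

    # Assumption: these elements are valid hover targets for the tooltip.
    points = driver.find_elements(By.XPATH, MONTHS_XPATH)
    return collect_tooltips(
        points,
        hover=lambda el: ActionChains(driver).move_to_element(el).perform(),
        read_tooltip=lambda: driver.find_element(By.XPATH, TOOLTIP_XPATH).text,
    )
```

Calling scrape_chart(driver) after the page has loaded would return one tooltip string per hovered element. Keeping the loop separate from the Selenium calls makes it easy to swap in a different hover target if the month labels turn out not to trigger the tooltip.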