I would like to scrape all company info under "Symbol", "Name", and "Earnings Call Time" from the following page: https://finance.yahoo.com/calendar/earnings

This is what I have so far for just company name, but I'm getting the error:

"NoSuchElementException: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id='cal-res-table']/div[1]/table/tbody/tr[1]/td[2]"} (Session info: chrome=86.0.4240.198)"

from selenium import webdriver
import datetime

tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed
url = "https://finance.yahoo.com/calendar/earnings?day="+tomorrow
print ("url: " + url)

driver = webdriver.Chrome("C:/Users/jrod94/Downloads/chromedriver_win32/chromedriver.exe")
driver.get(url)
element = driver.find_element_by_xpath("//*[@id='cal-res-table']")
Companies = [a.get_attribute("Company") for a in element]

driver.close()

3 Answers

How about using pandas?

import datetime
import pandas as pd

pd.set_option('display.max_columns', None)  # show every column when printing
tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat()  # tomorrow's date in ISO format, as the URL expects
tables = pd.read_html("https://finance.yahoo.com/calendar/earnings?day=" + tomorrow, header=0)  # one DataFrame per <table> on the page
table = tables[0]  # the earnings calendar is the first table
print(table)

Output:

  Symbol                         Company  Earnings Call Time EPS Estimate  \
0    WBAI                     500.Com Ltd  After Market Close            -   
1    BRBR             Bellring Brands Inc                 TAS         0.19   
2     BKE                      Buckle Inc  Before Market Open         0.54   
3     BNR        Burning Rock Biotech Ltd                 TAS        -0.12   
4     IEC            IEC Electronics Corp                 TAS            -   
5    GEOS      Geospace Technologies Corp                 TAS            -   
6    DREM  Dream Homes & Development Corp   Time Not Supplied            -   
7    DXLG        Destination XL Group Inc  Before Market Open            -   
8      FL                 Foot Locker Inc  Before Market Open         0.61   
9     HHR            HeadHunter Group PLC                 TAS         0.14   
10    HHR            HeadHunter Group PLC  Before Market Open         0.14   
11    RMR                   RMR Group Inc  Before Market Open         0.39   
12    GSX                 GSX Techedu Inc  Before Market Open        -0.31   
13    GSX                 GSX Techedu Inc                 TAS        -0.31   
14   HIBB              Hibbett Sports Inc  Before Market Open         0.45   
15   HAYN        Haynes International Inc                 TAS         -0.7   
16   IIIV                i3 Verticals Inc                 TAS         0.18   
17   AIHS          Senmiao Technology Ltd  Before Market Open           
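
If only the three columns from the question are needed, they can be selected by name. This is a minimal sketch, assuming the column labels match the output above; the sample rows are copied from that output so the snippet runs without network access, and `wanted`, `subset`, and `records` are illustrative names:

```python
import pandas as pd

# A few rows copied from the output above, standing in for the scraped table.
table = pd.DataFrame(
    {
        "Symbol": ["BKE", "FL", "HHR", "HHR"],
        "Company": [
            "Buckle Inc",
            "Foot Locker Inc",
            "HeadHunter Group PLC",
            "HeadHunter Group PLC",
        ],
        "Earnings Call Time": [
            "Before Market Open",
            "Before Market Open",
            "TAS",
            "Before Market Open",
        ],
        "EPS Estimate": ["0.54", "0.61", "0.14", "0.14"],
    }
)

# Keep only the columns asked for in the question.
wanted = ["Symbol", "Company", "Earnings Call Time"]
subset = table[wanted]

# One dict per row, convenient for further processing.
records = subset.to_dict("records")
print(subset)
```

With the live page, `table` would instead be the first DataFrame returned by `pd.read_html`, as in the answer.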
         

5 Comments

This is exactly what I was looking for - thanks! I'd like to append info from 2 dates to the same dataframe. I've tried this code but can't get it to work. Can you please help?
import datetime
import pandas as pd

date = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed'''
for i in range(2):
    try:
        date = (datetime.date.today() + datetime.timedelta(days = i )).isoformat() #get tomorrow in iso format as needed'''
        pd.set_option('display.max_column',None)
        url = pd.read_html("finance.yahoo.com/calendar/earnings?day="+date, header=0)
        table = url[0]
        table.append(table)
        print(table)
    except ValueError:
        continue
@JuneSmith It could easily be done. However, I would suggest you ask a new question with a screenshot of the table. I don't think answering that in the comments would be appropriate. Thanks!
I've asked the question here: "Scraping and appending data while looping through html tables"
@JuneSmith Check the URL again and make sure you use the tag pandas in the question.
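
For the follow-up in the comments, one likely reason the loop does not accumulate anything is that `table.append(table)` returns a new DataFrame that is then discarded (and `DataFrame.append` was removed in pandas 2.0). A minimal sketch of one alternative is to collect the per-day tables in a list and combine them once with `pd.concat`. The names `collect_days`, `fetch`, and `fake_fetch` are hypothetical, and the sample rows are made up so the snippet runs offline:

```python
import datetime
import pandas as pd


def collect_days(fetch, start, days):
    """Fetch one table per day and stack them into a single DataFrame.

    `fetch` is a hypothetical callable that takes an ISO date string and
    returns a DataFrame, raising ValueError when no table is found.
    """
    frames = []
    for i in range(days):
        day = (start + datetime.timedelta(days=i)).isoformat()
        try:
            table = fetch(day)
        except ValueError:
            continue  # skip days that have no table
        # Record which day each row came from before stacking.
        frames.append(table.assign(Date=day))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


def fake_fetch(day):
    # Stand-in for the real download; pretends one day has no table.
    if day == "2020-11-21":
        raise ValueError("No tables found")
    return pd.DataFrame({"Symbol": ["AAA", "BBB"], "Company": ["Alpha Inc", "Beta Inc"]})


combined = collect_days(fake_fetch, datetime.date(2020, 11, 20), 3)
print(combined)
```

Against the live page, `fetch` could wrap the `pd.read_html(...)[0]` call from the answer, which raises `ValueError` when it finds no tables.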

Actually, your code raises an error for me too, but on a later line than the one you report. The page may not have finished loading when you try to locate the element. A short delay before the line where the error occurs may solve the problem.

from selenium import webdriver
import datetime
import time

tomorrow = (datetime.date.today() + datetime.timedelta(days=1)).isoformat() #get tomorrow in iso format as needed
url = "https://finance.yahoo.com/calendar/earnings?day="+tomorrow
print ("url: " + url)

driver = webdriver.Chrome("C:/Users/jrod94/Downloads/chromedriver_win32/chromedriver.exe")
driver.get(url)
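time.sleep(1)  # increase the delay if the table still has not loaded
element = driver.find_element_by_xpath("//*[@id='cal-res-table']")
# A single WebElement is not iterable, so collect the company cells inside the table instead
Companies = [a.text for a in element.find_elements_by_css_selector("td[aria-label='Company']")]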

driver.close()

Since your question is regarding selenium:

You should take a look at Selenium waits.

An explicit wait blocks until all matching elements are present in the page's HTML. The following code shows the idea:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def main(url):
    driver = webdriver.Firefox()
    driver.get(url)
    cnames = []  # stays empty if the wait below times out
    try:
        # wait up to 10 seconds for the company cells to appear in the DOM
        cnames = [x.text for x in WebDriverWait(driver, 10).until(
            EC.presence_of_all_elements_located(
                (By.CSS_SELECTOR, "td[aria-label='Company']"))
        )]
    finally:
        print(cnames)
        driver.quit()


main("https://finance.yahoo.com/calendar/earnings")

Output:

['111 Inc', '360 DigiTech Inc', 'American Software Inc', 'American Software Inc', 'Corporacion America Airports SA', 'Atkore International Group Inc', 'Atkore International Group Inc', 'Helmerich and Payne Inc', 'Amtech Systems Inc', 'Amtech Systems Inc', 'Delta Apparel Inc', 'Delta Apparel Inc', 'Bellring Brands Inc', 'Berry Global Group Inc', 'Beacon Roofing Supply Inc', 'Natural Grocers By Vitamin Cottage Inc', "BJ's Wholesale Club Holdings Inc", 'Entera Bio Ltd', 'SG Blocks Inc', 'SG Blocks Inc', 'BEST Inc', 'Brady Corp', 'BioHiTech Global Inc', 'BioHiTech Global Inc', 'Oaktree Strategic Income Corporation', 'Caleres Inc', 'Pennantpark Investment Corp', 'Geospace Technologies Corp', 'Canadian Solar Inc', 'Oaktree Specialty Lending Corp', 'Matthews International Corp', 'Clearsign Technologies Corp', "Children's Place Inc", 'Elys Game Technology Corp', 'Dada Nexus Ltd', 'ESCO Technologies Inc', 'Euroseas Ltd', 'Fangdd Network Group Ltd', 'Fangdd Network Group Ltd', 'Golden Ocean Group Ltd', 'Hoegh LNG Partners LP', 'Post Holdings Inc', 'Huize Holding Ltd', 'Haynes International Inc', "Macy's Inc", 'OneWater Marine Inc', 'OneWater Marine Inc', 'Woodward Inc', 'StealthGas Inc', 'Maximus Inc', 'Ross Stores Inc', 'Intuit Inc', 'Ooma Inc', 'Williams-Sonoma Inc', 'Precipio Inc', 'NetEase Inc', 'Workday Inc', 'i3 Verticals Inc', 'Knot Offshore Partners LP', 'Maxeon Solar Technologies Ltd', 'Opera Ltd', 'Puxin Ltd', 'Puxin Ltd']
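
The output above lists several companies twice. If only distinct names are wanted, an order-preserving de-duplication is one option. This is a small sketch on an excerpt of that output; `unique_names` is an illustrative name:

```python
# First eight names from the output above.
cnames = [
    "111 Inc",
    "360 DigiTech Inc",
    "American Software Inc",
    "American Software Inc",
    "Corporacion America Airports SA",
    "Atkore International Group Inc",
    "Atkore International Group Inc",
    "Helmerich and Payne Inc",
]

# dict keys are unique and keep insertion order, so this drops
# repeats while preserving the order in which names first appear.
unique_names = list(dict.fromkeys(cnames))
print(unique_names)
```

Whether dropping repeats is appropriate depends on the use case, since a repeated name may correspond to a separate calendar entry.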

Note: You don't need Selenium for this; it will only slow the task down.

I also see no reason to import a library as large as pandas just to read one HTML table.

You can pick out the target data with the following code, which also gives you the exact call date:

import requests
import re
import json
import csv

keys = ['ticker', 'companyshortname', 'startdatetime']


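def main(url):
    r = requests.get(url)
    # the page embeds its data as a JSON object assigned to App.main; capture that object
    goal = json.loads(re.search(r"App\.main.*?({.+})", r.text).group(1))
    # keep only the wanted fields from each row of the calendar
    target = [[item[k] for k in keys] for item in goal['context']
              ['dispatcher']['stores']['ScreenerResultsStore']['results']['rows']]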
    with open("result.csv", 'w', newline="") as f:
        writer = csv.writer(f)
        writer.writerow(keys)
        writer.writerows(target)


main("https://finance.yahoo.com/calendar/earnings")

Output: view-online
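
To see what the regular expression and the nested lookup in the code above do without hitting the network, the same extraction can be run on a small inline sample. This is only a sketch: the `root.App.main = ...;` wrapper, the sample row, and its timestamp format are assumptions made for illustration, and `extract_rows` is a hypothetical helper that returns an empty list when the pattern is missing instead of raising:

```python
import json
import re

KEYS = ["ticker", "companyshortname", "startdatetime"]


def extract_rows(html):
    """Pull the embedded JSON object out of the page and return the wanted fields."""
    match = re.search(r"App\.main.*?({.+})", html)
    if match is None:
        return []  # pattern not found, e.g. the page layout changed
    data = json.loads(match.group(1))
    rows = data["context"]["dispatcher"]["stores"]["ScreenerResultsStore"]["results"]["rows"]
    return [[row[k] for k in KEYS] for row in rows]


# Made-up payload mirroring the key path used in the answer.
payload = {
    "context": {"dispatcher": {"stores": {"ScreenerResultsStore": {"results": {"rows": [
        {
            "ticker": "XYZ",
            "companyshortname": "Example Corp",
            "startdatetime": "2020-11-20T13:30:00.000Z",
        }
    ]}}}}}
}
sample_html = "<script>root.App.main = " + json.dumps(payload) + ";</script>"

rows = extract_rows(sample_html)
print(rows)
```

The greedy `{.+}` runs to the last closing brace on the line, which is why the whole object is captured in one group.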

6 Comments

In what way is pandas a "Huge" library? Isn't it specifically useful in reading tables from sites?
@AbrarAhmed pandas pulls in several libraries in the background, such as numpy. Even pd.read_html uses the requests lib in the background. So it's not logical to import pandas just to read HTML. Also, I think the logical way is to first answer the question that was asked, and then offer the other way to the OP.
Sorry, that makes no sense to me, but good to know. Cheers.
@ahmedamerican I also used to think the same. So, I asked on meta. There is no such rule here. meta.stackoverflow.com/questions/402902/…
@AbrarAhmed I just noticed that your post was 2 days ago. By the way, the response you received there is based on a different example that you shared for the viewers. But let me confirm for you: if you ask me about an issue with x, I should come back to you with an answer regarding x before offering you y. That way the OP and the viewers get the exact point.