
I previously wrote a program to analyze stock data, and I used NASDAQ to get historical prices. For example, to pull a year's worth of price quotes for CMG, all I needed to do was request https://www.nasdaq.com/api/v1/historical/CMG/stocks/2020-06-30/2019-06-30 to download a CSV of the historical quotes. Now, however, when I make the request, my connection times out and I get no response. If I enter the URL into a web browser, it still downloads the file just fine. Some example code is below:


import os
import requests as rq
from bs4 import BeautifulSoup as bs

h_url = 'https://www.nasdaq.com/api/v1/historical/CMG/stocks/2020-06-30/2019-06-30'
page_response = rq.get(h_url, timeout=30)
page = bs(page_response.content, 'html.parser')

dwnld_fl = os.path.join(os.path.dirname(__file__), 'Temp_data', 'hist_CMG_data.txt')
with open(dwnld_fl, 'w') as fl:  # close the file even if the write fails
    fl.write(page.text)

Can someone please let me know whether this works for them, or whether there is something I should do differently to get it working again? This is only an example, not the actual code, so if I accidentally made a simple syntax error, you can assume the actual file is correct, since it has worked without issue in the past.

1 Answer

You are missing the request headers, and the URL is invalid: the start and end dates are in the wrong order, so even the file downloaded in a browser is empty.

import requests


# The earlier date goes first, then the later date.
h_url = 'https://www.nasdaq.com/api/v1/historical/CMG/stocks/2019-06-30/2020-06-30'
headers = {
    'authority': 'www.nasdaq.com',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'sec-fetch-site': 'none',
    'sec-fetch-mode': 'navigate',
    'sec-fetch-user': '?1',
    'sec-fetch-dest': 'document',
    'accept-language': 'en-US,en;q=0.9',
}

# Send browser-like headers along with the request.
page_response = requests.get(h_url, timeout=30, allow_redirects=True, headers=headers)

# Write the raw response bytes; str() on bytes would write a b'...' literal instead.
with open("dump.txt", "wb") as out:
    out.write(page_response.content)

This writes the received data to the file "dump.txt" as raw bytes. You do not need BeautifulSoup to parse anything, because the response is a CSV text file, not HTML.
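
Since the only difference between the URL in the question and the one in this answer is the order of the two dates, it may help to build the URL in code so the order cannot be swapped by accident. The following is a minimal sketch, not an official client: `build_history_url` is a hypothetical helper name, and the start-then-end order is inferred from the working URL in this answer rather than from any NASDAQ documentation.

```python
from datetime import date

# URL pattern copied from the question. The start/end order is an assumption
# based on the URL that worked in this answer.
BASE_URL = "https://www.nasdaq.com/api/v1/historical/{symbol}/stocks/{start}/{end}"


def build_history_url(symbol: str, first: date, second: date) -> str:
    """Return the history URL with the earlier date first (hypothetical helper)."""
    # sorted() puts the earlier date first, whichever order the caller used.
    start, end = sorted((first, second))
    return BASE_URL.format(
        symbol=symbol,
        start=start.isoformat(),  # date.isoformat() gives YYYY-MM-DD
        end=end.isoformat(),
    )


if __name__ == "__main__":
    # The dates are passed in the question's (reversed) order on purpose.
    print(build_history_url("CMG", date(2020, 6, 30), date(2019, 6, 30)))
```

The resulting string can be passed to `requests.get` together with the `headers` dictionary in place of the hard-coded `h_url`.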
