
I am trying to scrape data from https://www.doordash.com/food-delivery/chicago-il-restaurants/

The idea is to scrape all the data regarding the different restaurant listings on the website. The site is divided into different cities, but I only require restaurant data for Chicago.

All restaurant listings for the city have to be scraped, along with any other relevant data about the respective restaurants (e.g., reviews, rating, cuisine, address, state, etc.). I need to capture all the respective details (currently 4,326 listings) for the city in Excel.

I have tried to extract the restaurant name, cuisine, ratings, and reviews inside the class named "StoreCard_root___1p3uN", but no data is displayed. The output is blank.


from selenium import webdriver

chrome_path = r"D:\python project\chromedriver.exe"

driver = webdriver.Chrome(chrome_path)

driver.get("https://www.doordash.com/food-delivery/chicago-il-restaurants/")

driver.find_element_by_xpath("""//*[@id="SeoApp"]/div/div[1]/div/div[2]/div/div[2]/div/div[2]/div[1]/div[3]""").click()

posts = driver.find_elements_by_class_name("StoreCard_root___1p3uN")

for post in posts:
    print(post.text)


  • What is your question? Have you received an error whilst trying to scrape data from this website? If so, please tell us what error you are trying to solve; I'm not sure what you require. Commented Dec 5, 2019 at 10:18
  • Make your life easy, man! Use the API: api.doordash.com/v2/seo_city_stores/… Commented Dec 5, 2019 at 10:26

3 Answers


You can use the API URL, since the data is actually rendered from it via an XHR request.

Iterate over the API link below and scrape whatever you want:

https://api.doordash.com/v2/seo_city_stores/?delivery_city_slug=chicago-il-restaurants&store_only=true&limit=50&offset=0

You just loop over the offset parameter, increasing it by 50 each time (each page shows 50 items) until you reach 4,300, which is the last page: simply range(0, 4350, 50).

import requests
import pandas as pd

data = []
# Page through the API 50 stores at a time until all ~4,326 listings are covered.
for offset in range(0, 4350, 50):
    print(f"Extracting offset# {offset}")
    r = requests.get(
        f"https://api.doordash.com/v2/seo_city_stores/?delivery_city_slug=chicago-il-restaurants&store_only=true&limit=50&offset={offset}").json()
    for store in r['store_data']:
        # Pull only the fields we need from each store record.
        data.append((store['name'], store['city'], store['category'],
                     store['num_ratings'], store['average_rating'], store['average_cost']))

df = pd.DataFrame(
    data, columns=['Name', 'City', 'Category', 'Num Ratings', 'Average Rating', 'Average Cost'])
df.to_csv('output.csv', index=False)
print("done")



Comments

How do we get the address?

I faced this issue too, but I solved it using Selenium and BeautifulSoup by doing the following:

  1. Make sure the script clicks the button to reveal the menu and prices, if necessary.
  2. Process the menu and prices after extraction: they might come back as nested lists after parsing, so the get_text() function won't work on them right away. The code and explanation can be found in this Medium article; a minimal sketch follows below.

Tackling empty list web scraping with selenium
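
A minimal sketch of that approach, assuming hypothetical selectors (DoorDash's generated class names such as "StoreCard_root___1p3uN" change between builds, so treat them as placeholders):

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()  # assumes a chromedriver Selenium can locate
driver.get("https://www.doordash.com/food-delivery/chicago-il-restaurants/")

# Wait for the dynamically rendered cards instead of reading the page immediately;
# reading too early is the usual cause of the blank output in the question.
WebDriverWait(driver, 15).until(
    EC.presence_of_all_elements_located((By.CLASS_NAME, "StoreCard_root___1p3uN")))

# Step 1: click whatever reveals the menu and prices, if the page hides them.
# The selector below is a placeholder, not DoorDash's real markup.
# driver.find_element(By.CSS_SELECTOR, "button.menu-toggle").click()

# Step 2: hand the rendered HTML to BeautifulSoup; separator/strip flattens
# nested tags into one string, so get_text() works even on nested lists.
soup = BeautifulSoup(driver.page_source, "html.parser")
for card in soup.find_all(class_="StoreCard_root___1p3uN"):
    print(card.get_text(separator=" | ", strip=True))

driver.quit()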



I have checked out the API that αԋɱҽԃ αмєяιcαη mentioned. They also have an endpoint for restaurant info.

URL https://api.doordash.com/v2/restaurant/[restaurantId]/

It was working until recently, when it started returning {"detail":"Request was throttled."}

Has anyone had the same issue / found a workaround?
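
No confirmed workaround here, but for plain rate limiting the usual first thing to try is slowing down and retrying with exponential backoff. This is a generic sketch, not a documented fix for this endpoint, and the restaurant ID below is hypothetical:

import time
import requests

def get_with_backoff(url, max_retries=5):
    """Retry a GET, backing off exponentially while the server throttles us."""
    delay = 1
    for _ in range(max_retries):
        r = requests.get(url)
        # The endpoint answered with a throttling message; 429 is the usual status.
        if r.status_code != 429 and "throttled" not in r.text.lower():
            return r
        time.sleep(delay)
        delay *= 2  # 1s, 2s, 4s, ...
    raise RuntimeError(f"Still throttled after {max_retries} attempts: {url}")

get_with_backoff("https://api.doordash.com/v2/restaurant/123/")  # hypothetical ID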

