
I want to scrape product data from an e-commerce website, but I can never get more than 24 products. The site paginates with a page-index query parameter. I send that parameter with my request, but the script still returns only 24 products.

The code, result and the website --> Screenshots

Code:

import requests
import sqlite3
from bs4 import BeautifulSoup

db = sqlite3.connect('veritabani.sqlite')
cursor = db.cursor()
cursor.execute("CREATE TABLE IF NOT EXISTS products (id, product, price)")  # do not fail when the script is run again
url = 'https://www.trendyol.com/cep-telefonu-x-c103498'
html_text = requests.get(url,params={'q': 'pi:5'}).text
soup = BeautifulSoup(html_text, 'lxml')
print(soup.contents)
products = soup.find_all("div", {"class": "p-card-wrppr"})
for product in products:
    product_id = product['data-id']
    product_name = product.find("div", {"class": "prdct-desc-cntnr-ttl-w two-line-text"}).find("span",{"class": "prdct-desc-cntnr-name"})["title"]
    price = product.find_all("div", {"class": "prc-box-sllng"})[0].text
    cursor.execute("INSERT INTO products VALUES (?,?,?)", (product_id,product_name,price))
    print(product_id,product_name,price)
db.commit()
db.close()
  • Not clear: do you mean you want the page size to be bigger, or that even when you set pi (page index?) you still get the first 25 results? Commented Jul 14, 2021 at 6:03

1 Answer


The website you mentioned uses scroll pagination, but you can still get the data you want.

First of all, you are passing the query parameter incorrectly. Replace this line

html_text = requests.get(url,params={'q': 'pi:5'}).text

with this:

html_text = requests.get(url,params={'pi':'5'}).text
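To see what each version actually sends, it can help to look at the query string that gets built. The sketch below uses the standard library's `urllib.parse.urlencode` rather than `requests` itself, on the assumption that `requests` encodes a `params` dict in essentially the same way. `build_url` is a hypothetical helper name, not part of either library.

```python
from urllib.parse import urlencode

URL = "https://www.trendyol.com/cep-telefonu-x-c103498"


def build_url(base, params):
    """Append a dict of query parameters to a base URL (hypothetical helper)."""
    return base + "?" + urlencode(params)


# Original call: one parameter named "q" whose value is the text "pi:5".
wrong = build_url(URL, {"q": "pi:5"})
# Corrected call: a parameter named "pi" whose value is "5".
right = build_url(URL, {"pi": "5"})

print(wrong)
print(right)
```

The first URL ends in `?q=pi%3A5` and contains no `pi` parameter at all, which is presumably why the site keeps serving its default page. The second ends in `?pi=5`.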

With that change you get the 24 products on page 5. To collect several pages, loop over the page index like this:

for i in range(1, 11):  # pages 1 to 10, assuming the page index starts at 1
    html_text = requests.get(url, params={'pi': str(i)}).text
    soup = BeautifulSoup(html_text, 'lxml')
    products = soup.find_all("div", {"class": "p-card-wrppr"})
    for product in products:
        product_id = product['data-id']
        product_name = product.find("div", {"class": "prdct-desc-cntnr-ttl-w two-line-text"}).find("span", {"class": "prdct-desc-cntnr-name"})["title"]
        price = product.find_all("div", {"class": "prc-box-sllng"})[0].text
        cursor.execute("INSERT INTO products VALUES (?,?,?)", (product_id, product_name, price))
        print(product_id, product_name, price)
    db.commit()  # save this page's rows before requesting the next page
db.close()  # close only after the last page; closing inside the loop breaks the next insert

This should get you the products from 10 pages.
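A fixed page count requests that many pages whether or not the category has them. One possible refinement, sketched below under assumptions, stops at the first page that yields no products and lets SQLite skip products it has already stored. `scrape_pages`, `fetch_page` and `FAKE_SITE` are hypothetical names, and the `PRIMARY KEY` schema is my own addition. The fake fetcher stands in for the requests and BeautifulSoup code above so the example runs offline. I have not checked that the site really returns an empty page past the end, nor that its page index starts at 1.

```python
import sqlite3


def scrape_pages(fetch_page, db, max_pages=10):
    """Store products page by page; stop early at the first empty page.

    fetch_page is a hypothetical callable: it takes a page index and
    returns a list of (product_id, name, price) tuples for that page.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(id TEXT PRIMARY KEY, product TEXT, price TEXT)"
    )
    pages_done = 0
    for page in range(1, max_pages + 1):  # assumes page indexes start at 1
        rows = fetch_page(page)
        if not rows:  # assumes an out-of-range page yields no product cards
            break
        # INSERT OR IGNORE skips ids that are already in the table.
        db.executemany("INSERT OR IGNORE INTO products VALUES (?, ?, ?)", rows)
        db.commit()  # save each page as soon as it is parsed
        pages_done += 1
    return pages_done


# Stand-in for the real requests + BeautifulSoup code: two pages, then nothing.
FAKE_SITE = {
    1: [("101", "Phone A", "1.000 TL"), ("102", "Phone B", "2.000 TL")],
    2: [("102", "Phone B", "2.000 TL"), ("103", "Phone C", "3.000 TL")],
}

db = sqlite3.connect(":memory:")
pages_done = scrape_pages(lambda page: FAKE_SITE.get(page, []), db)
total = db.execute("SELECT COUNT(*) FROM products").fetchone()[0]
print(pages_done, total)
db.close()
```

In real use, `fetch_page` would wrap the `requests.get(url, params={'pi': str(page)})` call and the `find_all` parsing from the loop above, returning one tuple per product card.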


3 Comments

Thank you, it works. I understand the parameter part, but why should I use a for loop?
If you want to iterate over many pages of the website in one run of the script, you have to use a loop.
Now I understand. It works like pages: every 24 products have a different page index, so whatever number I give to pi, I always get 24 products. Thanks a lot.
