0

I am parcing site with comics. First I am parcing catalog and creating parent table with names, urls of covers and status of comics. After that I want to create child tables for each comic with list of chapters, release date of chapter and url of chapter.

Below is code that I am using. Unfortunately child tables are not created at all and I dont understand why. Please help what I missed.

import sqlite3
from time import sleep
import requests
from bs4 import BeautifulSoup

connection = sqlite3.connect('parser_results.db')
cursor = connection.cursor()

cursor.execute('''
CREATE TABLE IF NOT EXISTS Comics (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  cover TEXT NOT NULL,
  status TEXT NOT NULL
  )
''')

def get_url():
  for count in range(1,2):
    url = f"https://asuratoon.com/manga/?page={count}&order=update"
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")
    data_url = soup.find_all('div', class_='bsx')

    for comic in data_url:
      comic_url = comic.find('a').get('href')
      yield comic_url

for comic_url in get_url():
  response = requests.get(comic_url, headers=headers)
  sleep(3)
  soup = BeautifulSoup(response.text, "html.parser")

  data_general = soup.find('div', class_='bixbox animefull')
  name = data_general.find('h1', class_='entry-title').text
  cover = data_general.find('img').get('src')
  status = data_general.find('div', class_='imptdt').text
  status = status.replace('Status ', '')
  cursor.execute('INSERT INTO Comics (name, cover, status) VALUES (?, ?, ?)', (name, cover, status))
  connection.commit()
  
cursor.execute('SELECT id FROM Comics')
ids = cursor.fetchall()
for id in ids:
  cursor.execute('''CREATE TABLE IF NOT EXISTS ChildTable_%s (
                          id INTEGER PRIMARY KEY,
                          chapter_name TEXT,
                          release_date TEXT,
                          chapter_url TEXT
                      )''' % (ids[0]))
  for comic_url in get_url():
    response = requests.get(comic_url, headers=headers)
    soup = BeautifulSoup(response.text, "html.parser")  #lxml
    data = soup.find_all('div', class_='eph-num')
    for i in data:
      chapter_name = i.find('span', class_='chapternum').text
      release_date = i.find('span', class_='chapterdate').text
      chapter_url = i.find('a').get('href')
      cursor.execute('INSERT INTO ChildTable_%s (chapter_name, release_date, chapter_url) VALUES (?, ?, ?)', (chapter_name, release_date, chapter_url))
      connection.commit()
connection.close()

I tried code above and creation of child tables dont happen. I dont know which other options I have for my task.

3
  • 1
    Make sure to post actual code to reproduce the issue (for example, missing all imports). Commented Dec 22, 2023 at 15:31
  • In each iteration of for id in ids you are running CREATE CHILD TABLE_{ids[0]}... where ids[0] is going to be the same value every iteration. Did you mean id instead of ids[0]? More broadly speaking, and unrelated to you error, a single child table would be most appropriate. Creating a database obejct for every child instance is bad design. Just add a column in a single child table that holds the id of the child for the record you are writing. It may help with the issue you are facing in a round-about way too. Commented Dec 22, 2023 at 16:04
  • Thx, I created tables for each and it really eats a lot of resources, performance wise to have one child table is better I think. Will try to do. Commented Dec 22, 2023 at 19:35

1 Answer 1

0

I reduced your code and used a dummy ids list, but if you change the child create sql command like my example it should work:

import sqlite3

connection = sqlite3.connect('parser_results.db')
cursor = connection.cursor()

cursor.execute('''
CREATE TABLE IF NOT EXISTS Comics (
  id INTEGER PRIMARY KEY,
  name TEXT NOT NULL,
  cover TEXT NOT NULL,
  status TEXT NOT NULL
  )
''')

ids = [(1,"urlOne"),(2,"urlTwo")]
for id in ids:
    child = "ChildTable_"+str(id[0])
    cursor.execute(f"""CREATE TABLE IF NOT EXISTS [%s] (
        id INTEGER PRIMARY KEY,
        chapter_name TEXT,
        release_date TEXT,
        chapter_url TEXT
        );""" % child)
connection.commit()
connection.close()
Sign up to request clarification or add additional context in comments.

1 Comment

Yeah thx, I changed accoring to your suggestions and it worked

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.