I'm trying to build a SQLite database from scraped text. Each row in the database corresponds to a string taken from a list, and on every pass through the loop a new column is created and populated with new string data.

import sqlite3
import urllib.request

import pandas as pd
from bs4 import BeautifulSoup

conn = sqlite3.connect('data.sqlite')
cur = conn.cursor()

cur.executescript('''
DROP TABLE IF EXISTS Data;
CREATE TABLE Data(
id  INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
words    text)''')

while True:
    url = 'www.xyz.com'
    if url == "break": break

    # parse - find tag of interest
    html = urllib.request.urlopen(url)
    p_s = BeautifulSoup(html,'html.parser')
    words = str(p_s.findAll('p',{'id':'p-5'}))
    words = strip_tags(words)
    words = pd.DataFrame(words)

    col_number = col_number + 1
    col_name = ('Group', col_number)

    cur.execute('''ALTER TABLE Data ADD ? TEXT''', (col_name,))
    for i,j in words.iterrows():
        cur.execute('''INSERT OR IGNORE INTO Data (col_name)
        VALUES (?)''',(j))
    conn.commit()

When I run this code I get:

 sqlite3.OperationalError: near "?": syntax error

Where am I going wrong? Thanks, and I apologize for my sloppy code, I'm new to Python!

  • Adding new data as columns is terrible design. Add the "new string data" as new rows, not as new columns. Use one additional column Group to hold the col_name values. It'll be much easier to use the result. Commented Jul 17, 2017 at 23:23
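The row-based layout suggested in the comment above could look roughly like this. It is only a sketch: the in-memory database, the `grp` and `pos` column names, and the `scrapes` sample data are all assumptions made for illustration (`grp` is used instead of `Group` because GROUP is an SQL keyword).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = conn.cursor()

# One row per scraped string; grp records which scrape it came from
# and pos records its position within that scrape.
cur.execute("""
    CREATE TABLE Data (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,
        grp   INTEGER NOT NULL,
        pos   INTEGER NOT NULL,
        words TEXT
    )
""")

# Hypothetical scraped results, one list per loop iteration.
scrapes = [["alpha", "beta"], ["gamma", "delta", "epsilon"]]

for grp, strings in enumerate(scrapes, start=1):
    rows = [(grp, pos, s) for pos, s in enumerate(strings, start=1)]
    cur.executemany(
        "INSERT INTO Data (grp, pos, words) VALUES (?, ?, ?)", rows)
conn.commit()

# Reading one "column" back is now an ordinary WHERE clause.
group_two = [r[0] for r in cur.execute(
    "SELECT words FROM Data WHERE grp = ? ORDER BY pos", (2,))]
```

With this layout the schema never changes, every value goes through a `?` placeholder, and groups of different lengths need no special handling.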

1 Answer

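Placeholders (?) can only stand in for values, not for identifiers such as column names, which is why ALTER TABLE Data ADD ? TEXT fails with a syntax error. Format the column name into the SQL string instead, and make col_name a plain string rather than a tuple. Change to:

    col_name = 'Group{}'.format(col_number)  # e.g. 'Group1'; a string, not a tuple
    cur.execute('''ALTER TABLE Data ADD COLUMN "{}" TEXT'''.format(col_name))
    for i, j in words.iterrows():
        # j is a pandas Series; bind its first value, not the Series itself
        cur.execute('''INSERT OR IGNORE INTO Data ("{}")
        VALUES (?)'''.format(col_name), (j.iloc[0],))
    conn.commit()

The parentheses around the column name in the INSERT are required; INSERT OR IGNORE INTO Data {} VALUES (?) without them is not valid SQL. Avoid a space in the name (e.g. 'Group 1') unless it is quoted, and only format in names you build yourself, never untrusted input.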
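To see the difference between binding a value and formatting an identifier, here is a small self-contained sketch. The helper `add_text_column`, its name-checking regex, and the in-memory database are assumptions made for this example, not part of the original code.

```python
import re
import sqlite3


def add_text_column(cur, table, column):
    """Add a TEXT column after checking both names are plain identifiers."""
    for name in (table, column):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError("unsafe identifier: {!r}".format(name))
    # Identifiers cannot be bound with ?, so they are formatted in (quoted).
    cur.execute('ALTER TABLE "{}" ADD COLUMN "{}" TEXT'.format(table, column))


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = conn.cursor()
cur.execute("CREATE TABLE Data (id INTEGER PRIMARY KEY, words TEXT)")

# Binding a column name through a placeholder reproduces the original error.
try:
    cur.execute("ALTER TABLE Data ADD COLUMN ? TEXT", ("Group1",))
    placeholder_failed = False
except sqlite3.OperationalError:
    placeholder_failed = True

# Formatting the (validated) name works; the value still uses a placeholder.
add_text_column(cur, "Data", "Group1")
cur.execute('INSERT INTO Data ("Group1") VALUES (?)', ("hello",))
conn.commit()

columns = [row[1] for row in cur.execute("PRAGMA table_info(Data)")]
```

Checking the name against a strict pattern before formatting is one way to keep the string formatting from opening an SQL injection hole; the data itself still goes through `?`.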


4 Comments

Unfortunately neither is working for me. The error is raised on the line: cur.execute('''ALTER TABLE Data ADD {} TEXT'''.format(col_name)) and the message is: sqlite3.OperationalError: near "(": syntax error
try cur.execute('''ALTER TABLE Data ADD {} TEXT'''.format('Group ' + col_number)) and see if it passes the first one
@cody Change that to 'Group ' + str(col_number) or equivalent, else you're adding a str to an int.
Ok, thanks @alexis and cody, I've come up with a workaround with your help. Instead of writing from a dataframe to the database, I wrote from a list, and that seems to work using the above methods. Now I'm left with a problem: each time I add a new list to the database, the new data continues from the primary key where the previous column left off. So my first column has 100 elements and ends at row 99, and my second column has 150 words but starts at row 100. Any idea how to work around this? I'd like each column to start at row 1.
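The behaviour described in the last comment follows from INSERT always creating a new row, so a second column filled by INSERT lands below the first. One possible approach, sketched here under assumed table contents and column names, is to UPDATE the existing rows by id and only INSERT once the new list runs past them:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
cur = conn.cursor()
cur.execute(
    "CREATE TABLE Data (id INTEGER PRIMARY KEY, Group1 TEXT, Group2 TEXT)")

# Hypothetical lists standing in for two scrapes of different lengths.
first = ["a", "b"]
second = ["x", "y", "z"]

# The first list creates rows 1..len(first).
cur.executemany("INSERT INTO Data (Group1) VALUES (?)", [(w,) for w in first])

# The second list fills the same rows starting at id 1; rows that do not
# exist yet (rowcount == 0 after the UPDATE) are inserted instead.
for row_id, word in enumerate(second, start=1):
    cur.execute("UPDATE Data SET Group2 = ? WHERE id = ?", (word, row_id))
    if cur.rowcount == 0:
        cur.execute(
            "INSERT INTO Data (id, Group2) VALUES (?, ?)", (row_id, word))
conn.commit()

rows = cur.execute(
    "SELECT id, Group1, Group2 FROM Data ORDER BY id").fetchall()
```

Storing each list as rows with a group column, as the first comment suggests, avoids this bookkeeping altogether.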
