I am trying to write code that lets a user select specific columns from a SQLite database and load them into a pandas DataFrame. I am using a test database named test_database.db with a table named test. The table has three columns: id, value_one, and value_two. The function shown below belongs to a class that establishes a connection to the database; the user only needs to pass the table name and a list of the columns to extract. For instance, in the sqlite command line I would type select value_one, value_two from test to read only the columns value_one and value_two from the table test. If I type this command into the command line, it works. However, when I use Python to build the same text string and feed it into pandas.read_sql_query(), the method does not work. My code is shown below.

class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()

    def query_columns_to_dataframe(table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[I] + ', '
        query = query[:-2] + ' from ' + table
        # print(query)
        df = pd.read_sql_query(query, self.conn)
        return

    def close_database()
        self.conn.close
        return

test = ReadSQL(test_database.db)
df = query_columns_to_dataframe('test', ['value_one', 'value_two'])

I assume my problem has something to do with the way query_columns_to_dataframe() pre-processes the information, because if I uncomment the print command in query_columns_to_dataframe() I get a text string that looks identical to what works when I type it directly into the command line. Any help is appreciated.

2 Answers

1

I mopped up a few mistakes in your code to produce this, which works. Note that I inadvertently changed the names of the fields in your test db.

import sqlite3
import pandas as pd

class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        self.cur = self.conn.cursor()

    def query_columns_to_dataframe(self, table, columns):
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[i] + ', '
        query = query[:-2] + ' from ' + table
        #~ print(query)
        df = pd.read_sql_query(query, self.conn)
        return df

    def close_database(self):
        # close is a method, so it must be called with parentheses
        self.conn.close()

test = ReadSQL('test_database.db')
df = test.query_columns_to_dataframe('test', ['value_1', 'value_2'])
print(df)

Output:

   value_1  value_2
0        2        3
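
As a side note, the loop that appends each column name and then trims the trailing ', ' can be written more compactly with str.join. This is only a minimal sketch: build_select_query is a hypothetical helper name, not something from the original class, and it assumes the table and column names come from a trusted source, since they are pasted directly into the SQL text.

```python
def build_select_query(table, columns):
    """Build 'select col_a, col_b from table'.

    Hypothetical helper for illustration; names are inserted into the
    query text as-is, so they are assumed to be trusted.
    """
    if not columns:
        raise ValueError("columns must contain at least one column name")
    # ', '.join puts the separator only between items, so no trimming is needed
    return 'select ' + ', '.join(columns) + ' from ' + table


query = build_select_query('test', ['value_1', 'value_2'])
print(query)  # select value_1, value_2 from test
```

The resulting string can be passed to pd.read_sql_query(query, self.conn) exactly as in the method above.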
0

Your code is full of syntax errors and other issues:

  1. The return in query_columns_to_dataframe should be return df. This is the primary reason why your code does not return anything.
  2. self.cur is not used
  3. Missing self parameter when declaring query_columns_to_dataframe
  4. Missing colon at the end of the line def close_database()
  5. Missing self parameter when declaring close_database
  6. Missing parentheses here: self.conn.close
  7. This df = query_columns_to_dataframe should be df = test.query_columns_to_dataframe

Fix these errors and your code should work.
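
For reference, here is a sketch of the class with the points above applied. It is not your actual code: to keep it self-contained it assumes an in-memory database (':memory:') populated with a single made-up row instead of your test_database.db, and it also uses a lowercase i as the loop index (your listing has columns[I]).

```python
import sqlite3
import pandas as pd


class ReadSQL:
    def __init__(self, database):
        self.database = database
        self.conn = sqlite3.connect(self.database)
        # self.cur was never used, so it is omitted here

    def query_columns_to_dataframe(self, table, columns):  # added self
        query = 'select '
        for i in range(len(columns)):
            query = query + columns[i] + ', '
        query = query[:-2] + ' from ' + table
        df = pd.read_sql_query(query, self.conn)
        return df  # return the DataFrame instead of None

    def close_database(self):  # added self and the colon
        self.conn.close()  # added parentheses so close is actually called


# Assumption for illustration: an in-memory database with one made-up row
test = ReadSQL(':memory:')
test.conn.execute(
    'create table test (id integer, value_one integer, value_two integer)'
)
test.conn.execute('insert into test values (1, 2, 3)')

# Call the method on the instance, not as a bare function
df = test.query_columns_to_dataframe('test', ['value_one', 'value_two'])
print(df)
test.close_database()
```

With your real file, replace ':memory:' with 'test_database.db' (as a quoted string) and drop the two execute calls.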

1 Comment

My apologies, the code I presented on Stack Overflow was a quickly written example from the larger code. Only one of the syntax errors that you correctly found is in the actual code.
