I want to extract a number of tables from an SQLite database. The tables have different numbers of rows, so it seems natural to store them in a Python list to facilitate further data analysis. The following code works.
import sqlite3
import pandas as pd
conn = sqlite3.connect("Database")
data = []
data.append(pd.read_sql("""SELECT ID,Time,A,B FROM Main WHERE BatchID=='BATCH1'""", conn))
data.append(pd.read_sql("""SELECT ID,Time,A,B FROM Main WHERE BatchID=='BATCH2'""", conn))
conn.close()
print(data[0]['Time'])
Instead of repeating the code for each BatchID it would be convenient to have a for-loop, something like
conn = sqlite3.connect("Database")
data = []
batch = ['BATCH1', 'BATCH2']
for k in list(range(2)):
    data.append(pd.read_sql("""SELECT ID,Time,A,B FROM Main WHERE BatchID='eval(batch[k])'""", conn))
conn.close()
print(data[0]['Time'])
But this does not work. If I read only one table with this technique, writing eval(batch[0]) explicitly, I get a table with only the keys, but no data.
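For reference, inside the triple-quoted string the text eval(batch[k]) is never evaluated by Python; SQLite simply compares BatchID with the literal text 'eval(batch[k])', which matches no rows, hence the empty table. A minimal sketch of a loop that passes the batch name as a query parameter instead is shown here. It assumes the `?` placeholder of sqlite3 together with the `params` argument of `pd.read_sql`; the in-memory database and its rows are hypothetical stand-ins for the real "Database" file.

```python
import sqlite3
import pandas as pd

# Hypothetical in-memory stand-in for the real "Database" file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Main (ID INTEGER, BatchID TEXT, Time REAL, A REAL, B REAL)"
)
conn.executemany(
    "INSERT INTO Main VALUES (?, ?, ?, ?, ?)",
    [
        (1, "BATCH1", 0.0, 1.0, 2.0),
        (2, "BATCH1", 1.0, 1.5, 2.5),
        (3, "BATCH2", 0.0, 3.0, 4.0),
    ],
)

batch = ["BATCH1", "BATCH2"]

# The ? placeholder is filled in by sqlite3 from params, so the batch
# name is passed as a value rather than pasted into the SQL text.
query = "SELECT ID, Time, A, B FROM Main WHERE BatchID = ?"
data = [pd.read_sql(query, conn, params=(name,)) for name in batch]
conn.close()

print(data[0]["Time"])
```

Iterating over the batch names directly also removes the need for `list(range(2))` and the index k.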
On request, I am adding some context on why I have a list of DataFrames. What I typically want to do is to easily plot how A varies with Time for different batches. The set of batches of interest can be a specific batch, a set of batches, or all of them. The code for the plot should be simple and transparent.
for k in batches: ax1.plot(data[k]['Time'], data[k]['A'])
But this command can perhaps be made just as simple by selecting from a single DataFrame that holds all batches with the selected variables. I also thought there is a conceptual simplicity here: we have a list of plots that the command above overlays in the same diagram.
I would also like to make computations on subsets of the data in a similar way.
An alternative approach, suggested below by JPI93, is to simplify the first step and make one large DataFrame containing data from all batches with the selected variables. I think this leads to a somewhat longer command to make the desired diagram. The code is below:
...
data = pd.read_sql("""SELECT BatchID,ID,Time,A,B FROM Main""",conn)
index = []
index.append(data['BatchID'] == 'BATCH1')
index.append(data['BatchID'] == 'BATCH2')
batches = list(range(2))
Then we can plot with the following command
for k in batches: ax1.plot(data.loc[index[k], 'Time'], data.loc[index[k], 'A'])
I tend to favour the original plot command above, but then I need to solve the original problem of making a list of DataFrames. Or is there some other approach that makes the plot command simple and readable?
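One possible middle ground, sketched here under assumptions, is to read everything in one query and then split the large DataFrame per batch with `groupby`, so the plot command keeps its short form. The sample rows and the helper name `plot_batches` are hypothetical; `ax` is assumed to be a matplotlib Axes such as the ax1 used above.

```python
import pandas as pd

# Hypothetical stand-in for
# pd.read_sql("SELECT BatchID, ID, Time, A, B FROM Main", conn).
df = pd.DataFrame(
    {
        "BatchID": ["BATCH1", "BATCH1", "BATCH2"],
        "ID": [1, 2, 3],
        "Time": [0.0, 1.0, 0.0],
        "A": [1.0, 1.5, 3.0],
        "B": [2.0, 2.5, 4.0],
    }
)

# One sub-DataFrame per batch, keyed by the batch name.
data = {
    name: frame.reset_index(drop=True)
    for name, frame in df.groupby("BatchID")
}


def plot_batches(ax, data, batches):
    """Overlay A against Time for each requested batch on the given Axes."""
    for name in batches:
        ax.plot(data[name]["Time"], data[name]["A"], label=name)
```

With a dict keyed by batch name, the set of interest can be written as `['BATCH1']`, any subset, or `list(data)` for all batches, and the loop body stays the same as in the original plot command.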