Read SQL query output to a Python dataframe

Question

I need to read a SQL output file in the following format into a Python (pandas) dataframe. What would be the best approach?

-[ RECORD 1 ]--------------------------------
a    |    test
b    |    test
c    |    test
-[ RECORD 2 ]--------------------------------
a    |    test
b    |    test
c    |    test

topsail · Accepted Answer · 2022-07-05 02:56:49Z

This code transforms the input file into a "normal" CSV. It isn't general purpose: your example is probably artificial (you may not really have columns called a, b, c and values that are all test), so some tweaks may be needed, but it is a start. It is loosely inspired by sed, so take it with a grain of salt!

1) transform the file into a regular csv file

import csv

def transform_to_csv(in_file_path, out_file_path):
    column_names = []
    values = []
    first_record = True
    with open(in_file_path) as infile, open(out_file_path, "w", newline="") as outfile:
        writer = csv.writer(outfile)  # quotes values that contain commas
        infile.readline()  # skip the first "-[ RECORD 1 ]" separator line
        for line in infile:
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines instead of stopping early
            if line.startswith("-["):
                # finished with a record
                if first_record:
                    writer.writerow(column_names)
                    first_record = False
                writer.writerow(values)
                values = []
            else:
                # accumulating fields for the next record;
                # split on the first "|" only, so values may contain "|"
                name, value = line.split("|", 1)
                values.append(value.strip())
                if first_record:
                    column_names.append(name.strip())
        # write the last record (and the header, if the file held only one record)
        if values:
            if first_record:
                writer.writerow(column_names)
            writer.writerow(values)

We get a new file in csv format:

a,b,c
test,test,test
test,test,test

2) now do normal pandas stuff

import pandas as pd
infile = "in.txt"
outfile = "out.csv"
transform_to_csv(infile, outfile)
df = pd.read_csv(outfile)
print(df.head())
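
If the intermediate CSV file is not needed, the records could also be collected in memory and handed to pandas directly. This is only a minimal sketch, assuming the same "-[ RECORD n ]" layout as in the question; parse_expanded_records is a hypothetical helper name, not part of pandas:

```python
def parse_expanded_records(lines):
    """Parse '-[ RECORD n ]' style output into a list of dicts, one per record."""
    records = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("-[ RECORD"):
            # A separator line starts a new record.
            current = {}
            records.append(current)
        elif "|" in line and current is not None:
            # Split on the first "|" only, so values may contain "|".
            name, value = line.split("|", 1)
            current[name.strip()] = value.strip()
    return records
```

With pandas installed, opening "in.txt" as f and calling pd.DataFrame(parse_expanded_records(f)) should give the same dataframe as the CSV route, since pd.DataFrame accepts a list of dicts. All values arrive as strings, so numeric columns would still need converting.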
