I'm running a script in Python and I'm interested in two of the outputs it calculates. Both are arrays. I want to save these arrays every time I run the script so I can keep track of the results. Later I might need to load all the saved arrays in a different script that compares them. In general, I would like to be able to load these arrays whenever I want and analyse their values. Is there a way to save the two arrays as dataframes and then import them with pandas in my script? Or is there a different way that you would recommend?
2 Answers
You can create a DataFrame from a dict of equal-length lists or NumPy arrays:
import pandas as pd
data = { 'character' : [ 'Pooh', 'Eeyore', 'Rabbit', 'Piglet'], 'age' : [5, 10, 7, 3], 'colour' : [ 'Yellow', 'Grey', 'Brown', 'Pink'] }
frame = pd.DataFrame(data)
To write it out, use the DataFrame's to_csv method (call it on the DataFrame, not on the dict):
frame.to_csv('YOUR_FILE/HERE.csv')
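Applied to the question, a minimal sketch of the full round trip might look like the following. The array names, the column names and the file name are hypothetical, and it assumes both arrays are one-dimensional and of equal length.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the two arrays computed by the script.
result_a = np.array([1.0, 2.5, 3.7])
result_b = np.array([10, 20, 30])

# One column per array (assumes both arrays have the same length).
frame = pd.DataFrame({"result_a": result_a, "result_b": result_b})

# A temporary directory keeps the sketch self-contained; use your own path.
out_path = os.path.join(tempfile.mkdtemp(), "run_01.csv")
frame.to_csv(out_path, index=False)  # index=False skips the row-number column

# Later, possibly in a different script:
loaded = pd.read_csv(out_path)
array_a = loaded["result_a"].to_numpy()
array_b = loaded["result_b"].to_numpy()
```

If the arrays have different lengths or more than one dimension, a single DataFrame is a poor fit and saving each array separately is probably simpler.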
Comments
I use the following code to export data. This will save your dataframe as a text file with the columns separated by tabs.
expData = pd.DataFrame(data, columns = ['name1','name2',...,'nameN'])
expData.to_csv("file_%02d.txt" %loopIndex, sep = '\t')
Here pd refers to pandas, which I imported as pd:
import pandas as pd
The loop index indicates which iteration of the loop the file came from (loop 1, 2, ..., n), so the output files are named file_01.txt, file_02.txt, and so on.
To read the data back you can use the csv module. It is part of the Python standard library, so there is nothing to install; just
import csv
To read this data, just use:
with open("file.txt", 'r', newline='') as f:
    reader = csv.reader(f, dialect='excel', delimiter='\t')
    for row in reader:
        print(row)  # do something with each row (a list of strings)
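Since the files were written with pandas, they can also be read back with pandas, which keeps the column names and numeric types. This is only a sketch: the column names, the values and the number of runs are made up for illustration, and the files go to a temporary directory.

```python
import os
import tempfile

import pandas as pd

out_dir = tempfile.mkdtemp()  # stand-in for your own results folder

# Write three hypothetical runs as tab-separated text files.
for loop_index in range(1, 4):
    exp_data = pd.DataFrame({"name1": [loop_index, loop_index + 1],
                             "name2": [0.5 * loop_index, 1.5 * loop_index]})
    exp_data.to_csv(os.path.join(out_dir, "file_%02d.txt" % loop_index), sep="\t")

# Read every run back; index_col=0 restores the row index that to_csv wrote.
runs = {}
for loop_index in range(1, 4):
    path = os.path.join(out_dir, "file_%02d.txt" % loop_index)
    runs[loop_index] = pd.read_csv(path, sep="\t", index_col=0)

# Stack the runs into one frame, keyed by run number, for comparison.
all_runs = pd.concat(runs, names=["run", "row"])
```

Having all runs in one frame makes comparisons between runs a matter of grouping or selecting on the "run" level of the index.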
Hope this is useful to you!