I recently changed my religion from Matlab to Python for my data analysis. I often need to import a lot of data files, which I used to store in a structure array in Matlab. That way I could keep a lot of different data in a single structure array, with the parameters that each data set belongs to stored in the same structure. I would end up with structures like:
data(i).parameterx = a
data(i).parametery = b
data(i).data1 = [3, 5, 7, 8, 423, 2, 56, 6]
data(i).data2 = [4, 2, 5; 4, 6, 2; 5, 1, 3]
The length of data is the number of data-files I imported. This way I can retrieve all the parameters of each data-set based on the index of data.
So I want to do the same in Python. I tried using a class, but it seems that I can't store it in an array.
What I tried was the following:
#!/usr/bin/env python
# Import modules
import numpy as np
import fnmatch
import os

# Set up all elements of the path
class path:
    main, dir = 'root/dir', 'data_dir'
    complete = os.path.join(main,dir)
    ext = 'odt' # Extension of the file to look for

# Initialize data array and data-class
all_data=[]
class odt_data:
    pass

# Loop through all files in path.complete and find all ODT-files
for (root, dirnames, filenames) in os.walk(path.complete):
    for filename in fnmatch.filter(filenames, '*.'+path.ext):
        # Only ODT-files
        # Extract parameters from filename
        parameters = filename.split('__')
        odt_data.parameterx = parameters[0]; odt_data.parametery = parameters[2]
        # Import data from file and assign to correct attribute of 'odt_data'
        file=os.path.join(root, filename)
        data=np.loadtxt(file)
        odt_data.data1 = data[:,0]; odt_data.data2 = data[:,1]
        # Append new data to array of all data
        all_data.append(odt_data)
The problem here is that it doesn't really save the data in odt_data but rather a reference to it. Hence when I do print(all_data) the result is:
[<class '__main__.odt_data'>, <class '__main__.odt_data'>, <class '__main__.odt_data'>]
And therefore only the last imported data is stored: all_data[0] is exactly the same as all_data[1] and so on. Hence, I'm not able to access the data of the first or second imported file, only that of the last one. So when I call all_data[1].parameterx I get the parameterx value of the last imported file, not the second!
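To make the behaviour easier to reproduce without my data files, here is a stripped-down sketch. The class name Record and the values 'a', 'b', 'c' are hypothetical stand-ins for odt_data and the parameters parsed from the filenames. The first loop mirrors what my code does (setting attributes on the class itself and appending the class); the second loop is only there for contrast and assumes that creating a new instance per iteration is the intended usage, which I am not sure about.

```python
class Record:  # hypothetical stand-in for odt_data
    pass

# Same pattern as in my code: attributes are set on the class itself
shared = []
for value in ['a', 'b', 'c']:
    Record.parameterx = value   # overwrites the single class attribute
    shared.append(Record)       # appends the same class object every time

print(shared[0] is shared[2])            # True: one object, listed three times
print([r.parameterx for r in shared])    # ['c', 'c', 'c']

# For contrast: a new instance per iteration (assumed intended usage)
separate = []
for value in ['a', 'b', 'c']:
    rec = Record()              # fresh object for each entry
    rec.parameterx = value      # attribute stored on this instance only
    separate.append(rec)

print([r.parameterx for r in separate])  # ['a', 'b', 'c']
```

In the first list every entry is the very same object, which matches what I see with all_data.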
NOTE: I don't care about how it prints; I only care about accessing all the imported data.
Does anybody know a way to store the actual data in class odt_data in an array (or possibly use something other than a class)?