1

I'm coming from MATLAB and need to reimplement some data processing in Python/NumPy. My data files share a specific format, but the set of variables listed varies from file to file.

In MATLAB I could write functions to put the data arrays in handy structs, like this:

s = data_to_struct(filename)
s.altitude
s.time
s.latitude
s.longitude
s.density
s.temperature

which I could then pass around to various plotting or filtering functions. I want to do the same in Python, using an object in place of the struct. I know the Proper™ way is to use a dict, but for this very specific and limited case the dict syntax is uglier in my eyes: s.temperature vs s['temperature'], as well as being more cumbersome to type. What is the best way to read a string of variable names from a file, and then create variables or object members with those names?

I will do a lot of interactive plotting and data handling, and I want to make the typing and tabbing as easy as possible.

1
  • Thanks, all. I'll go with setattr() for now, and make a note of NamedTuple and Pandas for future projects. Commented Feb 8, 2014 at 13:07

4 Answers

2
>>> from collections import namedtuple
>>> Something = namedtuple('Something', 'a b c')
>>> a = Something(1, 2, 4)
>>> a.c
4

namedtuple documentation

This comes in especially handy if you've already deserialized your data and it exists as a collection of tuples of the same length.
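For the file case in the question, the field names can be taken from a header line instead of being hard-coded. A minimal sketch, assuming a whitespace-separated header followed by numeric rows; the sample text and the name `Record` are made up for illustration:

```python
from collections import namedtuple

# Hypothetical file contents: a header line followed by rows of numbers.
text = """time altitude temperature
0.0 100.0 288.1
1.0 150.0 287.8
2.0 210.0 287.4"""

lines = text.splitlines()
names = lines[0].split()                  # field names taken from the header
rows = [[float(x) for x in line.split()] for line in lines[1:]]

Record = namedtuple('Record', names)      # namedtuple accepts a list of names
s = Record(*zip(*rows))                   # transpose rows: one column per field

print(s.altitude)                         # (100.0, 150.0, 210.0)
```

Note that the names must be valid Python identifiers; namedtuple raises a ValueError otherwise, unless it is created with rename=True.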

2

Consider using Pandas. It has fast and flexible functions for reading data from various file formats, such as read_table, and DataFrame columns can be referenced by attribute lookup:

In [166]: import numpy as np

In [167]: import pandas as pd

In [168]: df = pd.DataFrame(np.arange(12).reshape(4,3), columns=['foo', 'bar', 'baz'])

In [169]: df
Out[169]: 
   foo  bar  baz
0    0    1    2
1    3    4    5
2    6    7    8
3    9   10   11

[4 rows x 3 columns]

In [170]: df.bar
Out[170]: 
0     1
1     4
2     7
3    10
Name: bar, dtype: int32

2

The one limitation I see with namedtuple is that its instances are immutable: you can't reassign a field after the object has been created (although _replace returns a modified copy). I don't know whether you need that or not.

While it might be a bit convoluted, you could create an object using setattr and basically create a factory that will generate objects with your custom attributes.

An example would look like this:

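def data_to_struct(*info):
    # Container class whose attributes are created from the given names.
    class O(object):
        def __init__(self, *args):
            for a in args:
                setattr(self, a, None)  # every attribute starts out as None
    return O(*info)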

Then, to use it, you'd pass data_to_struct the attribute names as separate string arguments (or unpack a list with *names) and get back an object that has all of those attributes, each initialised to None.

>>> g = data_to_struct('a', 'b', 'c')
>>> g.a
>>> g.b = 3
>>> g.b
3

You could of course always choose to use kwargs also or something else to provide values to the various attributes upon instantiation.
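The kwargs variant could look something like the sketch below. The class name `Struct` and the sample columns are placeholders, not part of the answer above; in practice the values would be the arrays read from the file.

```python
class Struct(object):
    """Bare container whose attributes are set from keyword arguments."""
    def __init__(self, **kwargs):
        for name, value in kwargs.items():
            setattr(self, name, value)

def data_to_struct(**columns):
    # Each keyword becomes an attribute holding the corresponding value.
    return Struct(**columns)

# Hypothetical column data keyed by variable name.
columns = {'time': [0.0, 1.0, 2.0], 'altitude': [100.0, 150.0, 210.0]}
s = data_to_struct(**columns)

# Unlike a namedtuple, new attributes can still be added afterwards.
s.density = [1.20, 1.18, 1.15]
print(s.altitude)
```

Because the names arrive as dictionary keys, the same call works no matter which variables a given file happens to contain.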

0

Instead of obj.attr = value you can also write setattr(obj, 'attr', value) which takes the attribute name as a string.
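Applied to names read from a file, that might look like the following sketch. The header string, the column values and the class name `Data` are made-up stand-ins for whatever the real file provides.

```python
class Data(object):
    """Empty container; attributes are attached after creation."""
    pass

# Hypothetical inputs: names as read from a file header, and matching columns.
header = "latitude longitude density"
columns = [[59.9, 60.1], [10.7, 10.9], [1.20, 1.18]]

s = Data()
for name, values in zip(header.split(), columns):
    setattr(s, name, values)      # same effect as s.latitude = values, etc.

print(s.longitude)
print(sorted(vars(s)))            # names of the attributes that were set
```

Attributes set this way are ordinary instance attributes, so tab completion in an interactive session should pick them up like any others.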
