
I have a numpy array that contains objects.

x = np.array([obj1,obj2,obj3])

Here is the definition of the object:

class obj():
    def __init__(self,id):
        self.id = id

obj1 = obj(6)
obj2 = obj(4)
obj3 = obj(2)

Instead of accessing the numpy array by the position of each object, I want to access it by the value of id.

For example:

# x[2] = obj3
# x[4] = obj2
# x[6] = obj1

After doing some research, I learned that I could make a structured array:

x = np.array([(3,2,1)],dtype=[('2', 'i4'),('4', 'i4'), ('6', 'i4')])

# x['2'] --> 3

However, the problem with this is that I want the array to take integers as indexes, and dtype field names must be of type str. Furthermore, I don't think structured arrays can hold lists of objects.
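
For what it's worth, a structured dtype can carry Python objects through an object ('O') field, so ids and objects can sit side by side even though field names stay strings. A minimal sketch, assuming integer ids; the names Obj, rec and hit are illustrative only, and each lookup scans the whole id column:

```python
import numpy as np

class Obj:
    def __init__(self, id):
        self.id = id

objs = [Obj(6), Obj(4), Obj(2)]

# One record per object: an integer id plus a reference to the object itself.
rec = np.array([(o.id, o) for o in objs], dtype=[("id", "i4"), ("obj", "O")])

# Select by id value through the "id" column (a linear scan).
hit = rec["obj"][rec["id"] == 4]
print(hit[0].id)
```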

  • Can you tell us more about how this will actually be used? Is the real production code going to only have three elements in the array? Or how many? Do they all have unique IDs? Why not just use a dict to map from id to obj? Commented Dec 15, 2015 at 8:03
  • The array will eventually hold over 1 million objects, all with unique IDs. I originally implemented it as a dict, but my eventual goal is to index it as x[[val1,val2,val3,.....]] and get an array back, which numpy arrays do well. Commented Dec 15, 2015 at 8:07
  • How is an array any better than list? What array functionality are you hoping to use? Commented Dec 15, 2015 at 8:45
  • How about using a sorted list of the ids? Or a sqlite database. Commented Dec 15, 2015 at 15:31
  • I wanted to use a numpy array because of the advantages described here: stackoverflow.com/questions/993984/…. I am essentially trying to create a numpy array subset from an already huge array. This subset will be accessed and manipulated. Commented Dec 15, 2015 at 20:40
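
One way to get the x[[val1,val2,val3,.....]] behaviour discussed in these comments is to keep the objects in a numpy object array and translate ids to positions with a plain dict before fancy-indexing. A minimal sketch, assuming unique ids; the names Obj, pos and take_by_id are hypothetical and not from the original post:

```python
import numpy as np

class Obj:
    def __init__(self, id):
        self.id = id

objs = [Obj(6), Obj(4), Obj(2)]

# Object array that keeps the original order (6, 4, 2).
x = np.array(objs, dtype=object)

# id -> position lookup; a dict lookup per id instead of a scan.
pos = {o.id: i for i, o in enumerate(x)}

def take_by_id(ids):
    """Return the sub-array of objects whose ids are listed in `ids`."""
    return x[[pos[i] for i in ids]]

subset = take_by_id([2, 6])
print([o.id for o in subset])
```

The result is an ordinary numpy array in the order the ids were requested, so it can be indexed or sliced further. The dict has to be rebuilt if the array is reordered.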

2 Answers


You should be able to use filter() here, along with a lambda expression:

np.array(list(filter(lambda o: o.id == 1, x)))

In Python 2, filter() returns a list; in Python 3+ it returns an iterator, so wrap it in list() before building a new np.array from the result.

Note that this does not guard against duplicate keys if you want key-like access to your data: more than one object can have the same id attribute, so you may want to enforce uniqueness of keys yourself.
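
The uniqueness concern can be checked up front, and the same array of ids also gives a vectorised alternative to filter(). A sketch, assuming every object exposes an integer id attribute; Obj, ids and mask are illustrative names, and np.isin needs a reasonably recent NumPy:

```python
import numpy as np

class Obj:
    def __init__(self, id):
        self.id = id

x = np.array([Obj(6), Obj(4), Obj(2)], dtype=object)

# Pull the ids out once into a plain integer array.
ids = np.array([o.id for o in x])

# Fail early if two objects share an id.
if len(np.unique(ids)) != len(ids):
    raise ValueError("duplicate ids found")

# Boolean mask selecting every object whose id is 4 or 6.
mask = np.isin(ids, [4, 6])
selected = x[mask]
print([o.id for o in selected])
```

Like filter(), the mask scans every element, so this is a linear-time lookup rather than a constant-time one.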


3 Comments

Will this lookup be O(1)? Also there should be no duplicate keys
No, the lookup will be the code example as given in my post. If you want another lookup call, you have to wrap it into a function or embed it into a new class.
@jbndlr Note that a bare np.array(filter(...)) does not work in Python 3+, because numpy cannot build an array from an iterator: the result is a 0-d object array that holds the filter object itself.

If you only want to be able to access subarrays "by-index" (e.g. x[2, 4]), with index as id, then you could simply create your own struct:

import collections

class MyArray(collections.OrderedDict):
    """Ordered mapping from obj.id to obj; indexing always returns a MyArray."""
    def __init__(self, values):
        super(MyArray, self).__init__((v.id, v) for v in values)

    def __rawgetitem(self, key):
        # Plain dictionary lookup that bypasses the overridden __getitem__.
        return super(MyArray, self).__getitem__(key)

    def __getitem__(self, key):
        # Treat a single id like a one-element tuple of ids.
        if not hasattr(key, '__iter__'):
            key = (key,)
        return MyArray(self.__rawgetitem(k) for k in key)

    def __repr__(self):
        items = ', '.join('{}: {}'.format(k, self.__rawgetitem(k)) for k in self.keys())
        return 'MyArray({{{}}})'.format(items)

>>> class obj():
...     def __init__(self, id):
...         self.id = id
...     def __repr__(self):
...         return "obj({})".format(self.id)
...
>>> obj1 = obj(6)
>>> obj2 = obj(4)
>>> obj3 = obj(2)
>>> x = MyArray([obj1, obj2, obj3])
>>> x
MyArray({6: obj(6), 4: obj(4), 2: obj(2)})
>>> x[4]
MyArray({4: obj(4)})
>>> x[2, 4]
MyArray({2: obj(2), 4: obj(4)})

3 Comments

Nice workaround, this was my original way of tackling the problem. However order does matter for me. So the x that was printed out should be 6,4,2
@snowleopard Then simply use collections.OrderedDict instead of dict, see my updated answer.
Is there a way to make the output always be MyArray (i.e. even when the key is not iterable)? I tried and got infinite recursion.
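
On the recursion mentioned in the last comment: it typically appears when an overridden __getitem__ ends up calling itself, directly or through an inherited method. One way to sidestep that is to wrap a mapping rather than subclass one. A rough sketch, with the hypothetical class name IdArray, that always returns an IdArray and keeps insertion order:

```python
from collections import OrderedDict

class Obj:
    def __init__(self, id):
        self.id = id

    def __repr__(self):
        return "obj({})".format(self.id)

class IdArray:
    """Ordered container addressed by each object's id (illustrative only)."""

    def __init__(self, values):
        # Plain internal mapping: lookups on it never re-enter IdArray.__getitem__.
        self._items = OrderedDict((v.id, v) for v in values)

    def __getitem__(self, key):
        # Treat a single id like a one-element tuple of ids.
        keys = key if isinstance(key, (tuple, list)) else (key,)
        return IdArray(self._items[k] for k in keys)

    def __len__(self):
        return len(self._items)

    def ids(self):
        return list(self._items)

    def __repr__(self):
        body = ", ".join("{}: {!r}".format(k, v) for k, v in self._items.items())
        return "IdArray({" + body + "})"

x = IdArray([Obj(6), Obj(4), Obj(2)])
print(x)        # IdArray({6: obj(6), 4: obj(4), 2: obj(2)})
print(x[4])     # IdArray({4: obj(4)})
print(x[2, 4])  # IdArray({2: obj(2), 4: obj(4)})
```

Because the lookup goes through self._items, there is no need for a separate raw accessor, at the cost of not inheriting the rest of the dict interface.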
