
I have some irregular, multi-dimensional data that I'd like to index by its 'age' and 'Z' values.

For each combination of 'age' and 'Z' I have an array of 100 wavelengths and the associated fluxes (example data):

import numpy as np

age    = np.array([10, 20, 30, 40, 50])
Z      = np.array([7, 8])
# one row of 100 values per (age, Z) combination: 2*5 = 10 rows
waveln = np.array([np.arange(100) for b in range(2*5)])
flux   = np.array([np.arange(100)*10 for b in range(2*5)])

So in this example, waveln[0] (an array of 100 entries) and flux[0] would be associated with

myData['age' = 10, 'Z' = 7]['waveln'] # which I want to return the waveln array

and something like

myData['age' = 10, 'Z' = 7]['flux'] # which I want to return the flux array

How should I set this up? The problem is that age and Z are both floats.

Thx,

    It sounds like you might be looking for one of the data structures available in the pandas library: pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe. Hierarchically indexed DataFrames can act like multidimensional data structures. pandas is built on top of numpy, so you should be able to ease into it if you are unfamiliar. Commented Jun 21, 2016 at 1:27

1 Answer


Do you realize that waveln is a 10x100 2d array, not an array of arrays? You could construct the same thing with

np.repeat(np.arange(100)[None,:],10,axis=0)

If you really want waveln to be a 1d array containing 10 arrays, you'll have to use an alternative object dtype construction.
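A minimal sketch of the difference, assuming the goal is a 1d container whose elements are separate arrays. The names waveln_2d and waveln_obj are illustrative and do not come from the question:

```python
import numpy as np

# What the question's list comprehension actually builds: one 2d array.
waveln_2d = np.repeat(np.arange(100)[None, :], 10, axis=0)
print(waveln_2d.shape)   # (10, 100)

# A 1d object array: allocate it empty, then fill each slot with its own array.
waveln_obj = np.empty(10, dtype=object)
for i in range(10):
    waveln_obj[i] = np.arange(100)
print(waveln_obj.shape)  # (10,)
```

Filling a preallocated object array slot by slot avoids numpy stacking the equal-length rows back into a 2d array.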

As defined, flux = waveln*10, though I suspect those are just illustrative values.

But let's define waveln so it is more interesting, with each row different:

In [983]: waveln=np.arange(10)[:,None]+np.arange(100)[None,:]

I can construct an indexing tuple with np.ix_ from your Z and age arrays:

In [984]: np.ix_(Z,age)
Out[984]: 
(array([[7],
        [8]]), array([[10, 20, 30, 40, 50]]))

In [985]: waveln[np.ix_(Z,age)]
Out[985]: 
array([[17, 27, 37, 47, 57],
       [18, 28, 38, 48, 58]])

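So this has selected 2 rows (7 and 8) and, from those, 5 columns (10, 20, 30, 40 and 50). Note that it uses the Z and age values themselves as positional indices.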

To do myData['age' = 10, 'Z' = 7]['waveln'], I'd create a class with a __getitem__ method. Python packs comma-separated expressions inside [] into a tuple, which is passed to this method. But it would choke on that = syntax: you can't use keyword arguments in an indexing expression. Correct dictionary syntax is {'age': 10, 'Z': 7} or dict(age=10, Z=7).

Study the /numpy/lib/index_tricks.py file where ix_ is defined to get ideas on how to construct a custom class.
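As a rough sketch of such a class (not the only way to do it): SpectraTable and its _find helper are hypothetical names, and the row ordering is an assumption, since the question only says that row 0 belongs to age=10, Z=7. Because age and Z are floats, the lookup here matches grid values with np.isclose rather than exact equality:

```python
import numpy as np

class SpectraTable:
    """Hypothetical container indexed as table[age, Z]."""

    def __init__(self, age, Z, waveln, flux):
        self.age = np.asarray(age, dtype=float)
        self.Z = np.asarray(Z, dtype=float)
        # Assumption: rows are ordered by Z first, then by age within each Z.
        shape = (len(self.Z), len(self.age), -1)
        self.waveln = np.asarray(waveln).reshape(shape)
        self.flux = np.asarray(flux).reshape(shape)

    @staticmethod
    def _find(grid, value):
        # Position of the grid point that matches value within float tolerance.
        hits = np.flatnonzero(np.isclose(grid, value))
        if hits.size == 0:
            raise KeyError(value)
        return hits[0]

    def __getitem__(self, key):
        age, Z = key  # table[10, 7] arrives here as the tuple (10, 7)
        i = self._find(self.Z, Z)
        j = self._find(self.age, age)
        return {'waveln': self.waveln[i, j], 'flux': self.flux[i, j]}

age = np.array([10, 20, 30, 40, 50])
Z = np.array([7, 8])
waveln = np.arange(10)[:, None] + np.arange(100)[None, :]
flux = waveln * 10

myData = SpectraTable(age, Z, waveln, flux)
first = myData[10, 7]
print(first['waveln'][:3], first['flux'][:3])
```

With this, myData[10, 7]['waveln'] is the closest legal spelling of the syntax in the question.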

A plain function call, myData(age=10, Z=7, var='waveln'), lets you use keyword arguments with a straight function definition.
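The function route might look like the following sketch. The names age_grid, Z_grid, table and my_data are made up for illustration, and the row ordering (Z first, then age) is again an assumption. The dictionary lookup requires an exact match, so with real float grids you may prefer to round the keys or use a tolerance-based search:

```python
import numpy as np

age_grid = np.array([10, 20, 30, 40, 50])
Z_grid = np.array([7, 8])
waveln = np.arange(10)[:, None] + np.arange(100)[None, :]
flux = waveln * 10

# Assumption: row k belongs to Z_grid[k // 5] and age_grid[k % 5].
table = {}
n_age = len(age_grid)
for k in range(len(waveln)):
    key = (float(age_grid[k % n_age]), float(Z_grid[k // n_age]))
    table[key] = {'waveln': waveln[k], 'flux': flux[k]}

def my_data(age, Z, var):
    """Return the 'waveln' or 'flux' array stored for this (age, Z) pair."""
    return table[(float(age), float(Z))][var]

print(my_data(age=10, Z=7, var='waveln')[:3])
```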
