I have the following 3x3x3 (3 rows, 3 columns with 3 elements in each cell) numpy array...

[[[1, 1, 19],
  [2, 2, 29],
  [3, 3, 39]],
 [[4, 4, 49],
  [1, 1, 19],
  [2, 2, 29]],
 [[3, 3, 39],
  [9, 9, 99],
  [8, 8, 89]]]

and the following pandas dataframe...

col0 col1 col2 col3
1    1    19    10
2    2    29    20
3    3    39    30
4    4    49    40
8    8    89    80
9    9    99    90

I want to generate a new pandas data frame from the values in col3 by matching each 3-element array (e.g. [1, 1, 19] or [4, 4, 49]) against col0, col1, and col2.

The order within each 3-element array matters: the first element must match col0, the second col1, and the third col2.

The resulting data frame would look like the following...

colA colB colC
10   20   30
40   10   20
30   90   80
  • Your desired output contains the number 80 which appears nowhere in the input. Can you clarify how the input maps to the output? Commented Feb 14, 2018 at 14:33
  • I was trying to imply that original data frame followed a pattern, but I have changed it to be explicit. Commented Feb 14, 2018 at 14:37

1 Answer

Call the array needles and the DataFrame haystack. First, index the haystack:

haystack.set_index(['col0', 'col1', 'col2'], inplace=True)

Now you can get the values for the first set of needles:

haystack.loc[list(map(tuple, needles[0]))]

This gives you the first row of your solution (in col3):

                col3
col0 col1 col2      
1    1    19      10
2    2    29      20
3    3    39      30

Finally, do that for every 3x3 array along the first axis of needles:

pd.DataFrame(haystack.loc[list(map(tuple, pin))].col3.values for pin in needles)

This gives you the result:

    0   1   2
0  10  20  30
1  40  10  20
2  30  90  80
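
For reference, the steps above can be put together into one script. This is a minimal sketch that rebuilds the question's inputs by hand; the column names colA, colB, and colC come from the desired output in the question, and `.to_numpy()` is used in place of `.values` as a matter of taste.

```python
import numpy as np
import pandas as pd

# The 3x3x3 array from the question.
needles = np.array([
    [[1, 1, 19], [2, 2, 29], [3, 3, 39]],
    [[4, 4, 49], [1, 1, 19], [2, 2, 29]],
    [[3, 3, 39], [9, 9, 99], [8, 8, 89]],
])

# The DataFrame from the question, indexed by the three key columns.
haystack = pd.DataFrame({
    "col0": [1, 2, 3, 4, 8, 9],
    "col1": [1, 2, 3, 4, 8, 9],
    "col2": [19, 29, 39, 49, 89, 99],
    "col3": [10, 20, 30, 40, 80, 90],
}).set_index(["col0", "col1", "col2"])

# One lookup per 3x3 slice: each row of `pin` becomes a tuple key.
rows = [haystack.loc[list(map(tuple, pin))]["col3"].to_numpy() for pin in needles]
result = pd.DataFrame(rows, columns=["colA", "colB", "colC"])
print(result)
```

Building the rows as a list rather than a generator only makes the intermediate values easier to inspect.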

An alternative which may or may not be faster:

pd.DataFrame(haystack.col3[pd.MultiIndex.from_arrays(pin.T)].values for pin in needles)

The map or MultiIndex.from_arrays() is needed because unfortunately Pandas doesn't allow MultiIndex lookups by 2D arrays--only by lists (or arrays) of tuples. For more on that, see: Pandas MultiIndex lookup with Numpy arrays
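
If the per-slice loop becomes a bottleneck, one possible variation (not part of the approaches above, and not benchmarked here) is to flatten all the needles, resolve them to row positions with a single `get_indexer` call, and reshape. The sketch assumes the haystack index is unique and that every needle is present; `get_indexer` returns -1 for keys it cannot find, so that case is checked explicitly.

```python
import numpy as np
import pandas as pd

needles = np.array([
    [[1, 1, 19], [2, 2, 29], [3, 3, 39]],
    [[4, 4, 49], [1, 1, 19], [2, 2, 29]],
    [[3, 3, 39], [9, 9, 99], [8, 8, 89]],
])

haystack = pd.DataFrame({
    "col0": [1, 2, 3, 4, 8, 9],
    "col1": [1, 2, 3, 4, 8, 9],
    "col2": [19, 29, 39, 49, 89, 99],
    "col3": [10, 20, 30, 40, 80, 90],
}).set_index(["col0", "col1", "col2"])

# Flatten the 3x3 grid of keys into a (9, 3) array, one key per row.
flat = needles.reshape(-1, needles.shape[-1])

# Build a MultiIndex of keys and find each key's position in the haystack.
keys = pd.MultiIndex.from_arrays(list(flat.T))
pos = haystack.index.get_indexer(keys)  # -1 marks a key that was not found
if (pos < 0).any():
    raise KeyError("some needles are missing from the haystack")

# Take col3 at those positions and restore the original grid shape.
values = haystack["col3"].to_numpy()[pos].reshape(needles.shape[:2])
result = pd.DataFrame(values, columns=["colA", "colB", "colC"])
print(result)
```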
