I have a question very similar to this post. Essentially, I start with two 2D arrays (of possibly different widths), each with a bunch of rows where the leftmost column acts as an effective index, and I would like to combine the two arrays (unlike in the original post, we can assume the leftmost column is already in ascending order):
a = np.array([[1,2], [5,0], [6,4]])
b = np.array([[1,10], [5,20], [6,30]])
would be merged into this
[[1 2 10]
[5 0 20]
[6 4 30]]
as in the original post. However, there are two new things I would like to do. First, I'd like to match the two arrays by the leftmost value, deleting any rows that don't have a matching value in the other array. As an example,
a = np.array([[1,2],[3,2], [5,0], [6,4]])
b = np.array([[1,10],[6,30], [5,20], [7,80]])
would still be
[[1 2 10]
[5 0 20]
[6 4 30]]
since [3,2] from array a and [7,80] from array b would be ignored. Second, as a separate function, I'd like to join these two arrays similarly, but whenever a matching value cannot be found I'd like to create a new row padded with np.nan (or some other unique non-numerical filler):
[[1 2 10]
[3 2 np.nan]
[5 0 20]
[6 4 30]
[7 np.nan 80]]
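To make the second case concrete, here is a minimal sketch of one vectorized way it might be done, using np.union1d and np.searchsorted. The function name outer_join and the fill parameter are my own placeholders, and it assumes the keys are unique within each array. The result is float, because np.nan cannot be stored in an integer array.

```python
import numpy as np

def outer_join(a, b, fill=np.nan):
    # Hypothetical helper: assumes column 0 holds keys that are unique within each array.
    keys = np.union1d(a[:, 0], b[:, 0])            # sorted union of all keys
    wa, wb = a.shape[1] - 1, b.shape[1] - 1        # number of value columns in a and b
    out = np.full((keys.size, 1 + wa + wb), fill, dtype=float)
    out[:, 0] = keys
    # searchsorted gives the output row of every key, since each key is present in `keys`
    out[np.searchsorted(keys, a[:, 0]), 1:1 + wa] = a[:, 1:]
    out[np.searchsorted(keys, b[:, 0]), 1 + wa:] = b[:, 1:]
    return out

a = np.array([[1, 2], [3, 2], [5, 0], [6, 4]])
b = np.array([[1, 10], [6, 30], [5, 20], [7, 80]])
print(outer_join(a, b))
```

Because the rows are placed by key lookup rather than by position, this sketch does not need either input to be sorted.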
I have two programs that do these things, but they are not efficient: they iterate over each row of the input arrays (of possibly different widths), effectively 'zipping' the rows together case by case.
Are there good, efficient ways to do this with built-in NumPy functions?
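For the first case (dropping unmatched rows), the kind of thing I have in mind is sketched below with np.intersect1d and its return_indices option. The name inner_join is just a placeholder, and the sketch assumes the keys are unique within each array.

```python
import numpy as np

def inner_join(a, b):
    # Hypothetical helper: assumes column 0 holds keys that are unique within each array.
    # ia and ib are the row indices in a and b of each common key, in sorted key order.
    _, ia, ib = np.intersect1d(a[:, 0], b[:, 0], return_indices=True)
    # Keep the key column once (from a), then append b's value columns.
    return np.hstack([a[ia], b[ib, 1:]])

a = np.array([[1, 2], [3, 2], [5, 0], [6, 4]])
b = np.array([[1, 10], [6, 30], [5, 20], [7, 80]])
print(inner_join(a, b))
```

This keeps the integer dtype, since no filler value is needed, and the output comes back ordered by key even when an input is not.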