0

I have two dataset arrays, A and B. They are two different, independent measurements (e.g. smell and color of some object).

For each data entry in A and B, I have a time, t, and a location, p of the measurement. The majority of the smell and color measurements were taken at the same time and location. However, there are some times where data is missing (i.e. at some time there was no color measurement and only a smell measurement). Similarly, there are some locations where some data is missing (i.e. at some location there was only color measurement and no smell measurement).

I want to build arrays of A and B which have the same size where each row corresponds to a full set of all times and each column corresponds to a full set of all locations. If there is data missing, I want that entry to be NaN.

Below is an example of what I want to do:

%Inputs    
A = [0 0 1 2 4; 1 1 3 3 2; 4 4 1 0 3];
t_A = [0.03 1.6 3.9]; %Times when A was measured (rows of A)

L_A = [1.0 2.9 2.98 4.2 6.33]; %Locations where A was measured (columns of A)

B = [10 13 10 10; 15 13 13 12; 14 14 13 12; 15 19 11 13];
t_B = [0.03 1.6 1.9 3.9]; %Times when B was measured (rows of B)
L_B = [2.1 2.9 2.98 5.0]; %Locations where B was measured (columns of B)

What I want is some code to transform these datasets into the following:

t = [0.03 1.6 1.9 3.9];
L = [1.0 2.1 2.9 2.98 4.2 5.0 6.33];

A_new = [0 NaN 0 1 2 NaN 4; 1 NaN 1 3 3 NaN 2; NaN NaN NaN NaN NaN NaN NaN;  4 NaN 4 1 0 NaN 3];

B_new = [NaN 10 13 10 NaN 10 NaN; NaN 15 13 13 NaN 12 NaN; NaN 14 14 13 NaN 12 NaN; NaN 15 19 11 NaN 13 NaN];

The new arrays, A_new and B_new, are the same size and the vectors t and L (corresponding to the rows and columns) are sequential. The original A had no data at t = 1.9 and thus at the 3rd row in A_new, there is all NaN values. Similarly for the columns 2 and 6 in A_new and columns 1, 5 and 7 in B_new.

How can I do this in MATLAB quickly for a large dataset?

1
  • 1
    I am assuming the second reference of L_A (last line) should be L_B right? Commented Jun 11, 2018 at 19:23

1 Answer 1

1

Create a matrix of NaNs , use third output of the unique function to convert floating numbers to integer indexes and use matrix indexing to fill the matrices:

[t,~,it] = unique([t_A t_B]);
[L,~,iL] = unique([L_A L_B]);

A_new = NaN(numel(t),numel(L));
A_new(it(1:numel(t_A)),iL(1:numel(L_A))) = A;


B_new = NaN(numel(t),numel(L));
B_new(it(numel(t_A)+1:end),iL(numel(L_A)+1:end)) = B;
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks for the response. I'm not sure if this will work for my case though. The example I gave in my question is perhaps overly simplistic in that the t and L vectors are sequential integers whereas in my real case, the t and L vectors are not. I've modified my example. Is there a way to make your method more general?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.