
I want to concatenate arrays of different lengths to feed them to my neural network, whose first layer will be AdaptiveAvgPool1d. I have a dataset composed of several signals (1D arrays), each with a different length. For example:

array1 = np.random.randn(1200,1)
array2 = np.random.randn(950,1)
array3 = np.random.randn(1000,1)

I want to obtain a tensor in which I concatenate these three signals to obtain a 2D tensor. However, if I try to do

tensor = torch.Tensor([array1, array2, array3])

It gives me this error:

ValueError: expected sequence of length 1200 at dim 2 (got 950)

Is there a way to obtain such a thing?

EDIT More information about the dataset:

  • Each signal window represents a heartbeat from an ECG recording, taken from several patients and sampled at 1000 Hz
  • The beats can have different lengths, since the length depends on the patient's heart rate
  • For each beat I need to predict the length of the QRS interval (the target of the network), which I already have, expressed in milliseconds
  • I have already thought of interpolating the shortest samples to the length of the longest ones, but then I would also have to change the length of the QRS interval in the labels, is that right?

I have read about the AdaptiveAvgPool1d layer, which would allow me to feed the network samples of different sizes. But my problem is: how do I feed the network a dataset in which each sample has a different length? How do I group them without padding with NaNs or zeros? I hope I explained myself.
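One way to sidestep the batching problem entirely is to run the adaptive pooling per sample before batching: since AdaptiveAvgPool1d maps any input length to a fixed output length, each variable-length signal can be pooled individually (batch size 1) and the equal-length results stacked afterwards. A minimal sketch (the target length 512 is an arbitrary choice, not from the question):

```python
import numpy as np
import torch
import torch.nn as nn

# three signals with different lengths, as in the question
signals = [np.random.randn(1200, 1), np.random.randn(950, 1), np.random.randn(1000, 1)]

pool = nn.AdaptiveAvgPool1d(512)  # 512 is an arbitrary target length

pooled = []
for sig in signals:
    # reshape (length, 1) -> (batch=1, channels=1, length), as 1D layers expect
    t = torch.from_numpy(sig).float().T.unsqueeze(0)
    pooled.append(pool(t))

# now every sample has the same length, so they stack into one batch
batch = torch.cat(pooled, dim=0)
print(batch.shape)  # torch.Size([3, 1, 512])
```

Whether pooling before the first learned layer is acceptable depends on the model; the alternative is to keep batch size 1 and let the pooling layer sit inside the network.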

1 Answer


This violates the definition of a tensor and is impossible: if a tensor has shape (N x M x 1), all N matrices must be of size (M x 1).

There are still ways to get all your arrays to the same length. Look at where your data is coming from and what its structure is, and figure out which of the following solutions would work. Note that some of these may change the signal's derivative in a way you don't like.

  • Cropping arrays to the same size (i.e. cutting the start/end off), or zero-padding the shorter ones to the length of the longest one (I really dislike this one, and it would only work for very specific applications)
  • 'Stretching' the arrays to the same size by using interpolation
  • Shortening the arrays to the same size by subsampling
  • For some applications, maybe even passing the coefficients of a Fourier series of the signals
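The 'stretching' option above can be sketched with linear interpolation via np.interp (the target length 1024 is illustrative, not prescribed by the answer):

```python
import numpy as np
import torch

# three 1D signals with different lengths
signals = [np.random.randn(1200), np.random.randn(950), np.random.randn(1000)]
target_len = 1024  # arbitrary common length

def resample(sig, n):
    # map the original sample positions onto n evenly spaced points in [0, 1]
    old_x = np.linspace(0.0, 1.0, num=len(sig))
    new_x = np.linspace(0.0, 1.0, num=n)
    return np.interp(new_x, old_x, sig)

stacked = np.stack([resample(s, target_len) for s in signals])
tensor = torch.from_numpy(stacked).float()
print(tensor.shape)  # torch.Size([3, 1024])
```

As the question's edit notes, stretching a beat in time also stretches its QRS interval, so the millisecond labels would need to be rescaled by the same factor (or the original sampling grid kept track of).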

EDIT For heart-rate data, which should be a roughly periodic signal, I'd definitely crop the signals, which should work quite well. Passing FFT(equally cropped signals) or Fourier coefficients may also yield interesting results, but in my experience with neural spike data, training on the FFT of a signal like this doesn't perform any better once you have enough data to train on.
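The cropping suggestion might look like the sketch below: trim every beat to the shortest length in the set. Cutting from the end is the simplest choice; for ECG beats, cropping symmetrically around the R peak may be more appropriate, but that depends on how the windows were extracted.

```python
import numpy as np
import torch

# three beats with different lengths
signals = [np.random.randn(1200), np.random.randn(950), np.random.randn(1000)]

n = min(len(s) for s in signals)  # shortest length in the dataset (here 950)

# crop every beat to n samples by cutting the end off
cropped = np.stack([s[:n] for s in signals])
tensor = torch.from_numpy(cropped).float()
print(tensor.shape)  # torch.Size([3, 950])
```

Unlike stretching, cropping does not change the time axis, so the QRS labels in milliseconds stay valid as long as the QRS complex itself is not cut away.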

Also, if you're using a fully connected network, using 1D convolutions is a good alternative to try.
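A hedged sketch of that alternative: a small Conv1d stack followed by AdaptiveAvgPool1d, so the fully connected head always sees a fixed size regardless of the input length. All layer sizes here are illustrative assumptions, and variable-length inputs are processed one at a time (batch size 1):

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 16, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.Conv1d(16, 32, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(64),   # fixed output length regardless of input length
    nn.Flatten(),
    nn.Linear(32 * 64, 1),      # regress the QRS duration (in ms) as a scalar
)

# two beats of different lengths, fed one at a time (batch size 1)
for length in (950, 1200):
    x = torch.randn(1, 1, length)   # (batch, channels, length)
    print(model(x).shape)           # torch.Size([1, 1]) for both lengths
```

The adaptive pooling is what makes the convolutional front end length-agnostic; without it, the Linear layer's input size would depend on the signal length.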


2 Comments

Could you edit your question to indicate what the data represents? That makes it a lot easier to answer.
I edited the question with all the information you required. I hope it is useful, thank you for your help.
