0

I am implementing the method from this paper: https://dspace.mit.edu/bitstream/handle/1721.1/66243/Picard_Noncontact%20Automated.pdf?sequence=1&isAllowed=y

The main idea is cardiac pulse measurement using a set of frames (N=300) from a 10s video so the frame rate equals 30 fps.

red = [item[:,:,0] for item in imgs]
green = [item[:,:,1] for item in imgs]
blue = [item[:,:,2] for item in imgs]

red_avg = [item.mean() for item in red]
green_avg = [item.mean() for item in green]
blue_avg = [item.mean() for item in blue]

red_mean, red_std = np.array(red_avg).mean(), np.array(red_avg).std()
green_mean, green_std = np.array(green_avg).mean(), np.array(green_avg).std()
blue_mean, blue_std = np.array(blue_avg).mean(), np.array(blue_avg).std()

red_avg = [(item - red_mean)/red_std for item in red_avg]
green_avg = [(item - green_mean)/green_std for item in green_avg]
blue_avg = [(item - blue_mean)/blue_std for item in blue_avg]

data = np.vstack([signal.detrend(red_avg), signal.detrend(green_avg), signal.detrend(blue_avg)]).reshape(300,3)
from sklearn.decomposition import FastICA
transformer = FastICA(n_components=3)
X_transformed = transformer.fit_transform(data)

from scipy.fftpack import fft

first = X_transformed.T[0]
second = X_transformed.T[1]
third = X_transformed.T[2]

ff = np.fft.fft(first)
fs = np.fft.fft(second)
ft = np.fft.fft(third)

imgs - is the initial list of arrays with 300 image pixel values. As you can see, I split all frames into RGB channels and thus have traces x_i(t), where i = 1,2,3

After standardization, we detrend all the traces and stack them to further apply ICA and then FFT all the three components.

The method then claims that we need to plot power vs frequency (Hz) and select the component that is most likely to be heart pulse.

Finally, we applied the fast Fourier transform (FFT) on the selected source signal to obtain the power spectrum. The pulse frequency was designated as the frequency that corresponded to the highest power of the spectrum within an operational frequency band. For our experiments, we set the operational range to [0.75, 4] Hz (corresponding to [45, 240] bpm) to provide a wide range of heart rate measurements.

Here's how I try to visualize the frequencies:

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

data = ft
print(fs.size)
ps = np.abs(np.fft.fft(data))**2

sampling_rate = 30

freqs = np.fft.fftfreq(data.size, 1/sampling_rate)
idx = np.argsort(freqs)
#print(idx)
plt.plot(freqs[idx], ps[idx])

What I get is totally different since the range of frequencies is from $-15$ to $15$ and I have no idea whether this is in Hz or not.

Method idea

Graphs in the article

First component

Second component

Third component

The three images above are what I get when I execute the code to visualize frequencies and signal power.

I would appreciate any help or suggestions.

1 Answer 1

2

You should really learn how to work with images/videos as nD-tensors. Doing that you can replace all your data wrangling with much more concise code:

import numpy as np
import matplotlib.pyplot as plt
from scipy.fftpack import fft
from sklearn.decomposition import FastICA

images = [np.random.rand(640, 480, 3) for _ in range(30)]

# Create tensor with all images
images = np.array(images)
images.shape

# Take average of all pixels, for each image and each channel individually
avgs = np.mean(images, axis=(1, 2))
mean, std = np.mean(avgs), np.std(avgs)

# Normalize all average channels
avgs = (avgs - mean) / std

# Detrend across images
avgs = scipy.signal.detrend(avgs, axis=0)

transformer = FastICA(n_components=3)
X_transformed = transformer.fit_transform(avgs)

X_ff = np.fft.fft(X_transformed, axis=0)
plt.plot(np.abs(X_ff) ** 2)

To answer your question a little bit: I think you are mistakenly taking the fourier spectrum of the fourier spectrum of the third PCA component.

FFT(FFT(PCA[:, 2]))

while you meant to take one FFT only:

FFT(PCA[:, 2]))

Regarding the -15...15 axis: You have set your sampling frequency as 30Hz (or 30fps in video-terms). That means you can detect anything up to 15Hz in your video.

In fourier theory, there exists a thing as called "negative frequency". Now since we're mostly analyzing real signals (as opposed to complex signals), that negative frequency is always the same as the positive frequency. This means your spectrum is always symmetric and you can disregard the left half.

However since you've taken the FFT twice, you're looking at the FFT of a complex signal, which does have negative frequencies. That's the reason why your spectra are asymmetric and confusing.

Additionally, I believe you're confusing reshaping and transposing. Before PCA, you're assembling your data like

np.vstack([red, green, blue])  # shape = (3, 300)

which you want to transpose to get (300, 3). If you reshape instead, you're not swapping rows and columns but are interpreting the same data in a different shape.

Sign up to request clarification or add additional context in comments.

9 Comments

Thank you very much for you detailed comment. Can you please elaborate a bit on what you mean by "took the FFT" twice? The model from that paper suggests applying FFT after ICA to the tree trace-components. So, as far as I understand, I only apply FFT once. Another question is how I should interpret the frequencies. I don't quite understand this phrase from the article: The pulse frequency was designated as the frequency that corresponded to the highest power of the spectrum within an operational frequency band. For our experiments, we set the operational range to [0.75, 4] Hz
Look at the bottom of your first snippet and the top of your second snippet. You're calling fft() twice.
I believe np.vstack().reshape(300, 3) does not do what you think it does. You should do np.vstack().T instead.
In your case it is Hz, yes. You're sampling your signal at 30Hz and you're generating an axis accordingly. The 4Hz come from the knowledge about human bodies: a heartrate below 0.7 or above 4Hz (42-210 bpm) are not to be expected in healthy participants, so it can be disregarded. Just crop your spectrum accordingly.
No, just throw away frequencies and data outside that range.
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.