1
import datetime
import shlex

import numpy as np
from scipy.fftpack import rfft, irfft, fftfreq

strWeatherFile = "data/weather.dat"

signal = []

with open(strWeatherFile, 'rt') as f:
    for line in f:
        arrDat = shlex.split(line)
        d = float(arrDat[3])
        if d < -30:  # if it's dummy data
            d = signal[len(signal) - 1]
        signal.append(d)

size = len(signal)

time   = np.linspace(1,size,size)

f_signal = rfft(signal)


import pylab as plt
plt.subplot(221)
plt.plot(time,signal)
plt.subplot(222)
plt.plot(time,f_signal)
plt.xlim(0,size)

plt.show()

This is weather data, so it is supposed to show spike at freq:365. But the result doesn't like what expected. Is there anything wrong with the code?

data is from below link: http://academic.udayton.edu/kissock/http/Weather/gsod95-current/CALOSANG.txt

0

3 Answers 3

2

The data file contains 14 bad samples with values of -99. I replaced the bad samples with linearly interpolated values using the nearest good samples. The total number of samples in the file is 7006. The data is for an unknown daily weather parameter.

As Ben said, the frequency units are 1/Days not Days, so you will not see a peak at 365 frequency units (assuming the data is periodic with period 365 days).

The graph below shows the data input to the FFT. The bad samples have been removed using the linear interpolation procedure. The zero values at the right end are added automatically by the FFT, in order to pad the number of input samples to the next higher power-of-2.

Time Domain graph - sooeet.com FFT calculator

The graph below shows the data input to the FFT, with the DC offset (62.3127) removed.

Time Domain graph, DC offset removed - sooeet.com FFT calculator

The graph below shows the FFT output at full scale. The FFT input includes the DC offset.

FFT spectrum graph - sooeet.com FFT calculator

The graph below zooms into the low frequency end of the FFT output. A peak is visible near the left side of the graph. That peak corresponds to the frequency you were looking for. The FFT input includes the DC offset.

FFT spectrum graph zoomed - sooeet.com FFT calculator

The graph below shows the frequency peak of interest. The peak is at 0.0026855 (22/8192) frequency units, which corresponds to a time value of 372 days (1/0.0026855). The FFT input includes the DC offset.

FFT spectrum peak - sooeet.com FFT calculator

The graph below shows the FFT after the input data was windowed with the Blackman-Harris 92 dB high resolution window. The frequency peak of interest is revealed 18 dB above the surrounding background. Windowing in this case markedly reduces spectral leakage. After windowing the peak remains at 0.0026855 (22/8192) frequency units, which corresponds to a time value of 372 days (1/0.0026855). The FFT input includes the DC offset.

FFT window function applied - spectrum peak - sooeet.com FFT calculator

The graph below shows the FFT after removing the DC offset (62.3127) from the input data, and no windowing applied to the input data.

FFT spectrum, DC offset removed - sooeet.com FFT calculator

The graph below shows the FFT after removing the DC offset (62.3127) from the input data, and applying the Blackman-Harris 92 dB high resolution window.

FFT spectrum, DC offset removed, window function applied - sooeet.com FFT calculator

FFT and graphs were done with the Sooeet FFT calculator

Sign up to request clarification or add additional context in comments.

Comments

2

Your FFT should be plotted against frequency, not time.

import shlex
import pylab as plt
import numpy as np
from scipy.fftpack import rfft, rfftfreq

strWeatherFile = "data/weather.dat"

signal = []

with open(strWeatherFile, 'rt') as f:
    for line in f:
        arrDat = shlex.split(line)
        d = float(arrDat[3])
        if d < -30:  # if it's dummy data
            d = signal[len(signal) - 1]
        signal.append(d)

size = len(signal)

time = np.arange(1,size+1)

f_signal = rfft(signal)
freq = rfftfreq(size, d=1.)


plt.ion()
plt.close("all")
plt.figure()
plt.plot(time,signal)
plt.figure()
plt.plot(freq,abs(f_signal))
plt.show()

You wouldn't expect to see a spike at 365 because that's the time value. The corresponding frequency to a signal with a periodicity of 365 would be 1/365 or .00274

3 Comments

I don't think this is a full answer, but it probably helps, and it does more than mine.
@AaronHall Please let me know if there's anything I should expand on
Thanks, guys! I figured my problem. Before I didn't understand that rfftfreq function in python. I even thought that was useless.
0

I don't necessarily have a good answer here, but I wanted to make your code easy to replicate for others, so they don't have to fiddle with a manual download and files:

import datetime
import urllib2
import numpy as np    
from scipy.fftpack import rfft, irfft, fftfreq    

url = 'http://academic.udayton.edu/kissock/http/Weather/gsod95-current/CALOSANG.txt'
response = urllib2.urlopen(url)
html = response.read()
rows = [[float(c) if '.' in c else int(c)
         for c in row.split()] 
         for row in html.splitlines()]

signal = []
for line in rows:
    d = float(line[3])
    if d < -30:  # if it's dummy data
        d = signal[len(signal) - 1]
    signal.append(d)

size = len(signal)
time   = np.linspace(1,size,size)
f_signal = rfft(signal)
import pylab as plt
plt.subplot(221)
plt.plot(time,signal)
plt.subplot(222)
plt.plot(time,f_signal)
plt.xlim(0,size)
plt.show()

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.