Hi,
here is some info about the first part of my "homework", for those, who want
to break down their signal (heart beat or whatever) into a collection of pure
sin waves to analyse "main" frequency magnitudes and phases.
First some very un-mathematical "applied" theory:
If you sample a waveform signal (heart beat pressure pulses, ECG, doppler flow
signals, etc.) with a certain data acquisition frequency, an fft of your data
gives you the decomposition/breakdown of the waveform signal into a series of
pure sin waves of different frequencies. Each sin wave in that list has:
a) a certain "magnitude", i.e. a measure of how much that particular frequency
participates in the generation of your signal, and
b) a phase, i.e. the starting point of each sin wave.
Two characteristics of an fft have to be considered:
1) the highest meaningful sin wave frequency of your fft-analysis of the
original waveform signal is half the data acquisition frequency (actually,
R's fft gives you a list of frequencies up to the acquisition frequency, but
you can only use the first half of it, see below)
2) the frequency resolution of your fft-analysis depends on the sampling time.
The longer the sampling/analysis interval, the finer the resolution.
Frequency resolution is actually 1 divided by sampling time (sec).
An example:
- some complicated waveform signal
- 1000 Hz data acquisition frequency (going on for hours)
- fft-analysis of data blocks of 1 sec length
Result:
- vector of frequencies from 1 to 500 Hz with a resolution of 1 Hz,
corresponding vector of magnitudes (one for each frequency) and phases
(dito).
You can now e.g. pick the frequency with the highest magnitude within that 1
sec block and continue the fft analysis in 1 sec blocks for the complete data
set, analysing the time course of the "main" frequency of your waveform
signal.
If you need higher frequency resolution, increase the block length. Analysis
of a 5 sec block will give you a list of frequencies from 0.2 to 500 Hz with
a resolution of 0.2 Hz. However, increasing analysis-block length decreases
temporal resolution, i.e. "main" frequency are now calculated only every 5
sec and not 1 sec.
What does R's fft() deliver?
fft() is calculated with a single one-dimensional vector. Information on data
acquisition frequency and block length (in sec or whatever) can not be
included into the fft()-call.
R delivers a single one-dimensional vector of the same length as the data
vector containing a list of imaginary numbers.
To extract the "magnitudes" use Mod(fft()).
The magnitudes can also be calculated using the formula:
magnitude = square root (real * real + imaginary * imaginary)
real: Re(fft()), imaginary: Im(fft())
Confusingly, if you calculate fft() on a sample vector consisting of 2 pure
sin frequencies, you get 4 peaks, not 2.
As stated above, fft() gives only "meaningful" frequency up to half the
sampling frequency. R, however, gives you frequencies up to the sampling
frequency. The point is, that sampling a signal in discret time intervals
causes aliasing problems. E.g. when sampling a 50 Hz sin wave and 950 Hz sin
wave with 1000 Hz, the results will be identical. An fft can not distinguish
between the two frequencies. Therefore, the sampling frequency should always
be at least twice as high as the expected signal frequency.
So for each actual frequency in the signal, fft() will give 2 peaks (one at
the "actual" frequency and one at sampling frequency minus "actual"
frequency), making the second half of the magnitude vector a mirror image of
the first half.
As long as the sampling frequency was at least twice as high as the expected
signal frequency, all "meaningful" information is contained in the the first
half of the magnitude vector. A peak in the low frequency range might
nevertheless still be caused by a high "noise" frequency.
The vector of magnitudes extraced so far only has an index an no associated
frequencies.
To calculated the frequencies, simply take (or generate) the index vector (1
to length(magnitude vector) and divide by the length of the data block (in
sec).
That's it for now. The second half of my "homework" will be delivered as soon
as I understand what to make out of the phases given by R.
I again would expect a vector of the same length as the magnitude vector with
the phases (0 to 2*pi or -pi to +pi) of each frequency. However, I do not
know yet what R calculates.
I would be most obliged for any comments and help.
Wolfgang
---------------------------------------------------------------------------
# R-script
acq.freq <- 4000 # data acquisition frequency (Hz)
sig1.freq <- 50 # frequency of 1st signal component (Hz)
sig2.freq <- 130 # frequency of 2nd signal component (Hz)
time <- 5 # measuring time interval (s)
# vector of sampling time-points (s)
smpl.int <- (1:(time*acq.freq))/acq.freq
# data vector containing two frequencies (2nd frequ with phase shift)
data <- sin(sig1.freq*smpl.int*2*pi)+sin(sig2.freq*smpl.int*2*pi+pi/2)
plot(data,type="l")
# calculate fft of data
test <- fft(data)
# extract magnitudes and phases
magn <- Mod(test) # sqrt(Re(test)*Re(test)+Im(test)*Im(test))
phase <- Arg(test) # atan(Im(test)/Re(test))
# select only first half of vectors
magn.1 <- magn[1:(length(magn)/2)]
#phase.1 <- Arg(test)[1:(length(test)/2)]
# plot various vectors
# plot magnitudes as analyses by R
x11()
plot(magn,type="l")
# plot first half of magnitude vector
x11()
plot(magn.1,type="l")
# generate x-axis with frequencies
x.axis <- 1:length(magn.1)/time
# plot magnitudes against frequencies
x11()
plot(x=x.axis,y=magn.1,type="l")
Hi,
I'm in dire need of a fast fourier transformation for me
stupid biologist, i.e. I have a heartbeat signal and
would like to decompose it into pure sin waves, getting
three vectors, one containing the frequencies of the sin
waves, one the magnitudes and one the phases (that's what
I get from my data acquisition software's FFT function).
I'd be very much obliged, if someone could point out
which command would do the job in R.
fft(), but notice that it gives the complex
transform. You need to do a little homework to get at
the magnitude/phase values. (Basically, you just have to
take Mod() and Arg(), but there some conventions about
the frequencies and multipliers that one can get wrong).
Once you've finished the "homework", others might be interested
in your result... so it will be found in the future using
RSiteSearch().
Dr. Wolfgang Waser
Humbolt-Universit??t zu Berlin
Institute of Biology
Department of Animal Physiology
Philippstrasse 13, Abderhaldenhaus
10115 Berlin
Germany
Tel: +49 (0)30 2093 6173
Fax: +49 (0)30 2093 6375
"Wolfgang" == Wolfgang Waser <wolfgang.waser at rz.hu-berlin.de> writes:
> I would be most obliged for any comments and help.
Wolfgang,
I've used R's fft to filter ECG signals and will comment on your
commentary based on my experience. First, as an easily
accessible reference, I suggest "The Scientist and Engineer's
Guide to Digital Signal Processing," which is available in pdf
form at http://www.dspguide.com. It includes several chapters on
the discrete Fourier transform and the fast Fourier transform
algorithm (which is what R's fft implements) and a chapter on
applications that contains info on spectral analysis.
> What does R's fft() deliver?
> fft() is calculated with a single one-dimensional
> vector. Information on data acquisition frequency and block
> length (in sec or whatever) can not be included into the
> fft()-call.
> Confusingly, if you calculate fft() on a sample vector
> consisting of 2 pure sin frequencies, you get 4 peaks, not
> 2.
That is the nature of the fft algorithm. It returns values of
the discrete Fourier transform for both positive and negative
frequencies.
> As stated above, fft() gives only "meaningful" frequency up
> to half the sampling frequency. R, however, gives you
> frequencies up to the sampling frequency.
It is important to remember that the fft algorithm doesn't return
any frequency data at all. It returns values of the fft that
correspond to frequencies from -f_Nyquist to +f_Nyquist. It is
up to the user to calculate the frequency values.
Here's an example:
## Read some sample ecg data
ecg <- read.table('http://www.indyrad.iupui.edu/public/mmiller3/sample-ecg-1kHz.txt')
names(ecg) <- c('t','ecg')
ecg$t <- ecg$t/1000 # convert from ms to s
par(mfrow=c(2,2))
## Plot the ecg:
plot(ecg ~ t, data=ecg, type='l', main='ECG data sampled at 1 kHz', xlab='Time [s]')
## Calculate fft(ecg):
ecg$fft <- fft(ecg$ecg)
## Plot fft(ecg):
#plot(ecg$fft, type='l')
## Plot Mod(fft(ecg)):
plot(Mod(ecg$fft), type='l', log='y', main='FFT of ecg vs index')
## Find the sample period:
delta <- ecg$t[2] - ecg$t[1]
## Calculate the Nyquist frequency:
f.Nyquist <- 1 / 2 / delta
## Calculate the frequencies. (Since ecg$t is in seconds, delta
## is in seconds, f.Nyquist is in Hz and ecg$freq is in Hz)
## (Note: I may be off by 1 in indexing here ????)
ecg$freq <- f.Nyquist*c(seq(nrow(ecg)/2), -rev(seq(nrow(ecg)/2)))/(nrow(ecg)/2)
## Plot fft vs frequency
plot(Mod(fft) ~ freq, data=ecg, type='l', log='y', main='FFT of ECG vs frequency', xlab='Frequency [Hz]')
## Now let's look at some artificial data:
x <- seq(100000)/1000 # pretend we're sampling at 1 kHz
## We'll put in two frequency components, plus a dc offset
f1 <- 5 # Hz
f2 <- 2 # Hz
y <- 0.1*sin(2*pi*f1*x) + sin(2*pi*f2*x) + 50
fft.y <- fft(y)
delta <- x[2] - x[1]
f.Nyquist <- 1 / 2 / delta
f <- f.Nyquist*c(seq(length(x)/2), -rev(seq(length(x)/2)))/(length(x)/2)
par(mfrow=c(2,2))
plot(x,y, type='l', xlim=c(0,20))
plot(f, Mod(fft.y), type='l', log='y')
## Now let's zoom in and mark the points were I expect to see peaks:
plot(f, Mod(fft.y), type='l', log='y', xlim=c(-10,10))
rug(c(-f1, -f2, 0, f1, f2), col='red', side=3)
Hope this is helpful, Mike
Michael A. Miller mmiller3 at iupui.edu
Imaging Sciences, Department of Radiology, IU School of Medicine