load.wave

3 messages · Nick Wray, Jeff Newmiller, Ivan Krylov

#
I have been given a wav file of train locomotive noise - literally something you can play back and hear.  Using the audio package and the load.wave function I have got a 1.5 million element vector which visually at least has some periodicity in certain parts and does not seem to be completely random.  Most elements (99%) are within a range of about -0.14 to +0.14 with occasional outliers.  Beneath is a typical short segment.


This is the head:

sample rate: 16000Hz, mono, 16-bits
[1] -3.051851e-05  6.103516e-05 -6.103702e-05  3.051758e-05  3.051758e-05 -1.220740e-04

This is the same kind of output as is illustrated in the documentation: 
https://cran.r-project.org/web/packages/seewave/vignettes/seewave_IO.pdf

What I am not sure about, and can't find any clear explanation of, is what these elements actually stand for.
I would have thought that one needed, as a minimum, both volume and frequency, i.e. a two-dimensional vector, but as far as I can tell
there is only one single vector.  I'm aware that this question is pushing the envelope of R help but...

Thanks, Nick Wray
1 day later
#
You aren't pushing any envelope... you slit it open and fell out somewhere on the sidewalk. I tossed your question into Google and it came back with [1] and [2]. Please do that yourself instead whenever you are tempted to go off topic.

[1] https://stackoverflow.com/questions/25940376/whats-the-actual-data-in-a-wav-file
[2] https://en.m.wikipedia.org/wiki/Digital_audio
On February 1, 2019 2:20:57 AM PST, Nick Wray via R-help <r-help at r-project.org> wrote:

  
    
#
Hello Nick Wray,

Let me offer a simplified explanation of what's going on. Sorry if it's
unnecessary.

Sound is waves of pressure in the air. Devices like microphones can
measure the changing pressure by converting it into voltage. Voltage
can then be sampled by an analog-to-digital converter inside a sound
card and stored as numbers in computer memory.

On Fri, 1 Feb 2019 10:20:57 +0000 (GMT)
Nick Wray via R-help <r-help at r-project.org> wrote:

            
Digital sound works by measuring "pressure" a few tens of thousands of
times per second and then recreating the corresponding signal
elsewhere. According to the sampling theorem, sound sampled N times per
second can be reproduced losslessly as long as it contains no
frequencies at or above N/2 Hz.
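
To see why N/2 is the limit, here is a minimal base-R sketch of what
happens to a tone above it. The 16000 Hz rate is taken from the header
of the file in question; the two test frequencies and all variable
names are my own choice for illustration.

```r
rate <- 16000                 # samples per second, so rate / 2 = 8000 Hz
n    <- 0:159                 # sample indices covering 10 ms

below <- sin(2 * pi * 7000 * n / rate)   # 7000 Hz tone, under rate / 2
above <- sin(2 * pi * 9000 * n / rate)   # 9000 Hz tone, over rate / 2

# Once sampled, the 9000 Hz tone is the 7000 Hz tone with its sign
# flipped, so the samples cannot tell the two apart.
same <- isTRUE(all.equal(above, -below))
print(same)
```

This is why recorders filter out everything above half the sample
rate before sampling.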

To reiterate, these numbers are just audio samples. Feed them to the
sound card at the original sample rate, and you hear the same sound
that had been recorded.
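
As a rough sketch of how the numbers relate to time and to the
"16-bits" in the header (base R only; the variable names are made up,
and the scaling by 32768 is my assumption about how the samples were
normalised, not something I checked in the package source):

```r
rate <- 16000            # samples per second, from the file header
n    <- 1.5e6            # approximate vector length from the question

# Each element is one sample, so the clip length in seconds is:
duration <- n / rate

# Times (in seconds) at which the first six samples were taken
t_head <- (0:5) / rate

# A 16-bit sample is an integer between -32768 and 32767. Assuming the
# values were scaled to roughly -1..1, the smallest possible step is:
step <- 1 / 32768

print(duration)
print(step)
```

The step comes out as about 3.051758e-05, which is the same value that
appears in the head of the vector, so those first elements are simply
samples one or two quantisation steps away from zero.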

This part is explained well in two 30-minute video lectures here:
https://xiph.org/video/vid1.shtml https://xiph.org/video/vid2.shtml
(I wouldn't normally recommend video lectures, but these are really
good.)

You are describing a spectrogram: a surface showing the "volume" of
each individual frequency in the sound recording, over time. How to get
it? If you run a Fourier transform over the original vector, you will
get only one vector showing the magnitudes and phases of all frequencies
through the whole length of the clip.
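
A minimal sketch of that single transform, using base R's fft() on a
synthetic signal rather than the real recording (the two tones and all
variable names are assumptions made up for the example):

```r
rate <- 16000
t <- (0:(rate - 1)) / rate                 # one second of sample times

# Stand-in signal: a louder 440 Hz tone plus a quieter 1000 Hz tone
x <- 0.1 * sin(2 * pi * 440 * t) + 0.05 * sin(2 * pi * 1000 * t)

X <- fft(x)                                # complex vector, same length as x

# For real input only the first half is informative (0 Hz up to rate / 2)
half  <- 1:(length(x) / 2)
freq  <- (half - 1) * rate / length(x)     # frequency of each bin in Hz
mag   <- Mod(X[half])                      # magnitude ("volume") per frequency
phase <- Arg(X[half])                      # phase per frequency

peak <- freq[which.max(mag)]               # strongest frequency in the clip
print(peak)
```

Note that the result says which frequencies are present in the clip as
a whole, but nothing about when they occur.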

To get a two-dimensional spectrogram, you should take overlapping parts
of the original vector of samples, multiply them by a special window
function, then take a Fourier transform over that and combine
resulting vectors into a matrix. Computing a spectrogram involves
choosing a lot of parameters: size of the overlapping window, step
between overlapping windows, the window function itself and its own
parameters.
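
The steps above can be sketched in base R as follows. This is only an
illustration under my own assumptions: the function name spectrogram(),
the Hann window, the window size of 512 and the step of 256 are choices
made for the example, and the input is a synthetic two-tone signal, not
the locomotive recording.

```r
# Hypothetical helper: returns frame times, bin frequencies and a
# matrix of magnitudes (rows = frequencies, columns = time frames).
spectrogram <- function(x, rate, win = 512, step = 256) {
  hann   <- 0.5 - 0.5 * cos(2 * pi * (0:(win - 1)) / (win - 1))
  starts <- seq(1, length(x) - win + 1, by = step)  # overlapping frames
  keep   <- 1:(win / 2 + 1)                         # bins 0 Hz .. rate / 2
  mag <- sapply(starts, function(s) {
    Mod(fft(x[s:(s + win - 1)] * hann))[keep]       # window, then transform
  })
  list(time = (starts - 1 + win / 2) / rate,        # centre of each frame
       freq = (keep - 1) * rate / win,
       mag  = mag)
}

rate <- 16000
tone <- function(f) sin(2 * pi * f * (0:(rate - 1)) / rate)
x  <- c(tone(500), tone(2000))     # 1 s at 500 Hz, then 1 s at 2000 Hz

sg <- spectrogram(x, rate)
dominant <- sg$freq[apply(sg$mag, 2, which.max)]    # loudest bin per frame
print(range(dominant))

# To look at it: image(sg$time, sg$freq, t(log(sg$mag + 1e-6)))
```

A larger window gives finer frequency resolution but blurs timing, and
a smaller one does the opposite, which is why there is no single
correct set of parameters.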

Problems like these should be described in books about digital signal
processing.

Jeff Newmiller sent more useful links while I was typing this, and I
guess I should stop posting off-topic.