Look at this:
It is 1 second of pink noise. At the top you can see the waveform; at the bottom, a very confusing image: its spectrogram.
Now look at this:
It's the same waveform, zoomed in to 12 milliseconds; note how irregular both the wave and its spectrogram are.
Another picture:
This is 1 second of a sawtooth wave at 1500 Hz (or CPS, Cycles Per Second); it means the waveform has 1500 crests in 1 second. In this image you can't see them, because 1500 crests squeezed into a few centimeters are packed too tightly, so:
The same waveform, zoomed in. Now you can see the crests (crest = the section of the wave that rises above the zero line); you can count 15 of them in 10 milliseconds, because 1500 crests : 1 second = 15 crests : 10 milliseconds.
Here the spectrogram shows a series of lines, the harmonics, all equally spaced from one another.
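If you want to check the crest count yourself, here is a minimal sketch in Python. It builds a rising sawtooth sample by sample and counts how many times the wave rises above the zero line. The 48000 Hz sample rate and the function names are my own assumptions for the example, not something taken from the pictures.

```python
def sawtooth(freq_hz, duration_s, sample_rate=48000):
    """Return the samples of a rising sawtooth in the range [-1, 1)."""
    n_samples = int(round(duration_s * sample_rate))
    samples = []
    for n in range(n_samples):
        # Position inside the current cycle, from 0.0 up to (not including) 1.0
        phase = (freq_hz * n / sample_rate) % 1.0
        samples.append(2.0 * phase - 1.0)
    return samples


def count_crests(samples):
    """Count the sections of the wave that rise above the zero line."""
    crests = 0
    for prev, cur in zip(samples, samples[1:]):
        if prev <= 0.0 < cur:  # the wave has just crossed zero going upward
            crests += 1
    return crests


crests_in_10_ms = count_crests(sawtooth(1500, 0.010))
print(crests_in_10_ms)  # 15
```

Calling count_crests(sawtooth(1500, 1.0)) gives the crest count for the full second in the same way.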
From these pictures you can understand that:
1 - noise has no definite frequency: its spectrogram is irregular, the sound is inharmonic, and its crests are irregular.
2 - frequency (Hz) is the number of times the crest repeats in 1 second; frequency is closely connected to the pitch of a sound.
3 - if a wave is periodic (regular repetition of the crest), its components are harmonic.
4 - if a wave is not periodic, it has inharmonic components (for example gongs, noise, handmade bells).
so:
5 - a periodic wave produces a sensation of pitch.
6 - the less regular the periodicity of the wave, the less definite the sensation of pitch.
7 - harmonics are all whole-number multiples of the fundamental frequency (called F0, or h1, or 1st harmonic component).
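Point 7 is easy to try out. The sketch below lists the harmonics of a fundamental and checks whether a given frequency is one of them. The function names, the 0.5 Hz tolerance and the 4130 Hz "bell-like" partial are hypothetical, chosen only for illustration.

```python
def harmonic_series(f0_hz, count):
    """Return the first `count` harmonics: h1 = F0, h2 = 2*F0, h3 = 3*F0, ..."""
    return [k * f0_hz for k in range(1, count + 1)]


def is_harmonic(freq_hz, f0_hz, tolerance_hz=0.5):
    """True if freq_hz lies within tolerance of a whole multiple of f0_hz."""
    nearest = max(1, round(freq_hz / f0_hz))
    return abs(freq_hz - nearest * f0_hz) <= tolerance_hz


saw = harmonic_series(1500, 5)
print(saw)  # [1500, 3000, 4500, 6000, 7500]

# The lines in the spectrogram are equally spaced: the gap is always F0.
spacings = [b - a for a, b in zip(saw, saw[1:])]
print(spacings)  # [1500, 1500, 1500, 1500]

print(is_harmonic(4500, 1500))  # True: 3 * 1500
print(is_harmonic(4130, 1500))  # False: a made-up inharmonic partial
```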
A nice aspect is the relation among octave intervals. Consider a harmonic sound at 100 Hz; it is made up of:
h1 (1st harmonic component) or F0 (fundamental frequency) = 100 Hz
h2 = 200 Hz
h3 = 300 Hz
and so on in steps of 100 Hz. The octaves, instead, double each time: the 1st octave is 200 Hz, the 2nd octave is 200*2=400 Hz, the 3rd octave is 800 Hz, the 4th octave is 1600 Hz.
These are all octave intervals: 100 Hz and 200 Hz, but 800 Hz and 1600 Hz too (or 10,000 Hz and 20,000 Hz).
What does it mean?
Our auditory perception judges the ratio between frequencies, not the frequencies themselves.
So we perceive the same interval whenever the ratio between the frequencies is the same (in this case 2:1), whatever the difference in Hz.
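A small sketch makes the point with numbers. For the three pairs above the difference in Hz changes a lot, while the ratio stays at 2. Measuring the interval with a base-2 logarithm (my choice of unit here, so that one octave equals 1) gives the same size every time.

```python
import math


def interval_in_octaves(low_hz, high_hz):
    """Size of the interval between two frequencies, measured in octaves."""
    return math.log2(high_hz / low_hz)


pairs = [(100, 200), (800, 1600), (10000, 20000)]
for low, high in pairs:
    print(low, "->", high,
          "| difference:", high - low, "Hz",
          "| ratio:", high / low,
          "| octaves:", interval_in_octaves(low, high))

# Octaves above 100 Hz double each time, unlike harmonics, which add 100 Hz.
octaves_of_100 = [100 * 2 ** k for k in range(1, 5)]
print(octaves_of_100)  # [200, 400, 800, 1600]
```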
FFT - Fast Fourier Transform
By Fourier's theorem (the FFT is a fast way of computing the analysis), any audio signal can be split into many components, each one with its own frequency, amplitude and phase, as shown in the images above.
So we understand that every natural harmonic sound is the sum of many overtones, and the relation among their intensities determines the timbre we perceive.
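To see the idea at work, here is a minimal sketch that builds a sound from three harmonics of 100 Hz and then recovers their frequencies and amplitudes. Note that it uses a plain, slow discrete Fourier transform written out by hand, not a real FFT; an FFT gives the same numbers, only faster. The sample rate, the length of the signal and the amplitudes are values I picked so that every harmonic falls exactly on an analysis frequency.

```python
import cmath
import math

SAMPLE_RATE = 1000  # samples per second (assumed for the example)
N = 100             # 0.1 s of signal, so analysis frequencies are 10 Hz apart

# Frequency in Hz -> amplitude of each harmonic (made-up values)
partials = {100: 1.0, 200: 0.5, 300: 0.25}

signal = [
    sum(amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        for freq, amp in partials.items())
    for n in range(N)
]


def dft_amplitudes(x):
    """Naive DFT: amplitude of each component below half the sample rate."""
    total = len(x)
    amplitudes = {}
    for k in range(1, total // 2):
        coeff = sum(x[n] * cmath.exp(-2j * math.pi * k * n / total)
                    for n in range(total))
        amplitudes[k] = 2.0 * abs(coeff) / total
    return amplitudes


bin_hz = SAMPLE_RATE / N
found = {round(k * bin_hz): round(amp, 3)
         for k, amp in dft_amplitudes(signal).items() if amp > 0.01}
print(found)  # {100: 1.0, 200: 0.5, 300: 0.25}
```

With a real recording the components rarely fall exactly on the analysis frequencies, so the picture is blurrier than in this tidy case.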
There is a problem: if this is true, I should be able to add the harmonics back together and obtain the original sound, but the result is not the same, because every natural sound is alive and evolves through time. For example, it is impossible to reproduce the sound of a piano just by adding its harmonic components back together, because they can't reproduce the sound of the hammers, and second after second each harmonic can change its amplitude or duration.
The same goes for the voice.
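A rough sketch of what "evolves through time" means for the harmonics: here each one fades at its own speed, so the balance between them (and therefore the timbre) is different at every moment. The decay rates and starting amplitudes are invented for the example; they are not measurements of a piano or of a voice.

```python
import math

F0_HZ = 100.0
# (harmonic number, starting amplitude, decay rate per second): made-up values
PARTIALS = [(1, 1.0, 1.0), (2, 0.5, 3.0), (3, 0.25, 6.0)]


def amplitude(start, decay, t):
    """Amplitude of one harmonic at time t, fading exponentially."""
    return start * math.exp(-decay * t)


def sample(t):
    """Value of the summed sound at time t (in seconds)."""
    return sum(amplitude(start, decay, t)
               * math.sin(2 * math.pi * number * F0_HZ * t)
               for number, start, decay in PARTIALS)


def balance(t):
    """Amplitude of each harmonic relative to the fundamental at time t."""
    base = amplitude(PARTIALS[0][1], PARTIALS[0][2], t)
    return [round(amplitude(start, decay, t) / base, 3)
            for _, start, decay in PARTIALS]


print(balance(0.0))  # [1.0, 0.5, 0.25]
print(balance(1.0))  # the upper harmonics are much weaker by now
```

A sum with fixed amplitudes would give the same balance at every instant, which is one reason it sounds static next to the real instrument.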
With spectrograms and sonograms we can analyze all the harmonic or inharmonic components of a sound, but that's not all. There are formants too.
Very important: harmonic components, or overtones, are sine waves (pure tones).
to be continued...