Musical instrument sounds and basic waveforms (except sine) contain many sine waves.
The Fourier transform decomposes a signal from the time domain (waveform) into the frequency domain (spectrum):
→ Any complex sound can be represented as a sum of sine waves.
Fourier series of a sawtooth wave (approximation)
Fourier transform of a sawtooth wave
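The sum-of-sines idea can be tried numerically. The following is a minimal sketch, assuming the standard Fourier series of a sawtooth normalized to the range -1 to 1 (harmonic k has amplitude proportional to 1/k with alternating sign). The function name `sawtooth_partial_sum` and the chosen values are illustrative, not taken from the original material.

```python
import math

def sawtooth_partial_sum(t, f0, n_harmonics):
    """Approximate a sawtooth (range -1..1) at time t by summing harmonics."""
    total = 0.0
    for k in range(1, n_harmonics + 1):
        # Harmonic k: frequency k * f0, amplitude 1/k, alternating sign.
        sign = 1.0 if k % 2 == 1 else -1.0
        total += sign * math.sin(2 * math.pi * k * f0 * t) / k
    return (2 / math.pi) * total

f0 = 1.0   # fundamental frequency in Hz (assumed for illustration)
t = 0.25   # a quarter of the period; the ideal sawtooth is 2 * f0 * t here
for n in (1, 5, 50, 500):
    print(n, "harmonics:", round(sawtooth_partial_sum(t, f0, n), 4))
```

The more harmonics are added, the closer the sum gets to the ideal ramp value at that instant.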
A time-domain signal
→ The spectrum represents the amplitude and phase of each frequency component.
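Fourier transform (analysis formula):

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt$$

Inverse transform (synthesis formula):

$$x(t) = \int_{-\infty}^{\infty} X(f)\, e^{j 2\pi f t}\, df$$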
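A complex exponential combines cosine and sine into a single expression representing sinusoidal components:

$$e^{j\theta} = \cos\theta + j\sin\theta$$

where $j$ is the imaginary unit ($j^2 = -1$) and $\theta$ is the phase angle in radians.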
→ This representation is fundamental to the Fourier transform, allowing efficient encoding of both amplitude and phase.
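Euler's formula, $e^{j\theta} = \cos\theta + j\sin\theta$, can be checked numerically. This is a small sketch using Python's `cmath` module. The angle, amplitude, and phase values are arbitrary choices for illustration.

```python
import cmath
import math

theta = math.pi / 3
z = cmath.exp(1j * theta)

# Real part matches cos(theta), imaginary part matches sin(theta).
print(z.real, math.cos(theta))
print(z.imag, math.sin(theta))

# One complex number carries both amplitude and phase of a sinusoid.
amplitude, phase = 0.5, math.pi / 4
c = amplitude * cmath.exp(1j * phase)
print(abs(c), cmath.phase(c))
```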
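For digital audio, the discrete Fourier transform (DFT) is used, which operates on discrete, finite-duration signals by replacing the continuous integral with a finite sum over $N$ samples. The Fast Fourier Transform (FFT) is an efficient algorithm for computing the DFT.

DFT:

$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \quad k = 0, 1, \dots, N-1$$

Inverse DFT:

$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}, \quad n = 0, 1, \dots, N-1$$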
→ The DFT is a finite sum over samples, so unlike the continuous FT it can be computed directly on digital audio.
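The finite sum can be written out directly. The sketch below assumes the common DFT convention (negative exponent in the forward transform, 1/N scaling in the inverse). The function names `dft` and `idft` are illustrative, and the direct double loop is meant for understanding, not speed.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse DFT: x[n] = (1/N) * sum over k of X[k] * exp(+j*2*pi*k*n/N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

# A cosine with exactly 2 cycles in 8 samples.
x = [math.cos(2 * math.pi * 2 * n / 8) for n in range(8)]
X = dft(x)
x_rec = idft(X)

print([round(abs(v), 6) for v in X])     # energy only in bins 2 and 6
print([round(v.real, 6) for v in x_rec])  # matches the original samples
```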
The FFT is an efficient algorithm for computing the DFT, reducing computational complexity from $O(N^2)$ to $O(N \log N)$.
This allows efficient computation for:
→ The FFT is an algorithmic optimization of the DFT computation.
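To illustrate that the FFT computes the same result as the DFT, here is a minimal sketch of a recursive radix-2 (Cooley-Tukey) FFT compared against the direct sum. This is one of several FFT variants and only handles power-of-two lengths. The names `fft` and `naive_dft` are illustrative. In practice a library routine would normally be used instead.

```python
import cmath
import math

def naive_dft(x):
    """Direct O(N^2) DFT, used here only as a reference."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 FFT; the length must be a power of two."""
    N = len(x)
    if N == 0 or N & (N - 1):
        raise ValueError("length must be a power of two")
    if N == 1:
        return [complex(x[0])]
    even = fft(x[0::2])   # DFT of even-indexed samples
    odd = fft(x[1::2])    # DFT of odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * math.pi * k / N) * odd[k]
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
max_diff = max(abs(a - b) for a, b in zip(fft(x), naive_dft(x)))
print("largest difference between FFT and direct DFT:", max_diff)
```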
The FFT produces discrete frequency values called bins, each representing a specific frequency component. Each bin contains amplitude and phase information for its frequency component.
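Frequency of bin $k$:

$$f_k = \frac{k \cdot f_s}{N}$$

where $f_s$ is the sampling rate, $N$ is the FFT size, and $k$ is the bin index.

For real signals: Number of bins = $N/2 + 1$ (from 0 Hz up to the Nyquist frequency $f_s/2$).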
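The bin-to-frequency mapping is simple arithmetic. The sketch below assumes the usual relation "bin index times sampling rate divided by FFT size" and N/2 + 1 non-redundant bins for real input. The helper names are hypothetical.

```python
def bin_frequency(k, sample_rate, fft_size):
    """Center frequency in Hz of bin k."""
    return k * sample_rate / fft_size

def num_real_bins(fft_size):
    """Number of non-redundant bins for a real input signal (0 Hz to Nyquist)."""
    return fft_size // 2 + 1

sample_rate = 44100
fft_size = 1024
print(num_real_bins(fft_size))
for k in (0, 1, 10, fft_size // 2):
    print(k, bin_frequency(k, sample_rate, fft_size))
```

The last bin printed lies at the Nyquist frequency, half the sampling rate.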
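Frequency resolution (bin spacing) determines how finely the spectrum is divided:

$$\Delta f = \frac{f_s}{N}$$

where $f_s$ is the sampling rate and $N$ is the FFT size.

Example: sampling rate 44.1 kHz, FFT size 1024 samples:

$$\Delta f = \frac{44100\ \text{Hz}}{1024} \approx 43\ \text{Hz}$$

This means that frequencies lying within the same 43 Hz band fall into the same bin and cannot be distinguished.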
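The 43 Hz example can be reproduced with a few lines. In this sketch the helper `nearest_bin` and the test frequencies (1000, 1010, 1050 Hz) are assumptions chosen for illustration.

```python
sample_rate = 44100
fft_size = 1024
delta_f = sample_rate / fft_size  # bin spacing in Hz

def nearest_bin(freq_hz):
    """Index of the bin whose center is closest to freq_hz."""
    return round(freq_hz * fft_size / sample_rate)

print("bin spacing:", round(delta_f, 2), "Hz")
for f in (1000, 1010, 1050):
    print(f, "Hz -> bin", nearest_bin(f))
```

Two tones only 10 Hz apart land in the same bin, while a tone 50 Hz higher lands in the next one.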
A window function is a tapering function that smoothly reduces the signal amplitude to zero at the boundaries of the analysis window, minimizing discontinuities and spectral leakage.
→ Trade-off: Reduced leakage vs. reduced frequency resolution.
Spectral leakage occurs when the analysis window doesn't contain an exact integer number of wave cycles.
→ A window function tapers the signal smoothly to zero at the edges.
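Leakage and the effect of a window can be made visible with a short experiment. The sketch below is one possible way to measure it: a sine with an integer number of cycles, a sine with 8.5 cycles, and the same 8.5-cycle sine multiplied by a periodic Hann window. The helper `far_leakage_db` and its guard width of 4 bins are assumptions for illustration.

```python
import cmath
import math

def magnitude_spectrum(x):
    """Magnitudes of DFT bins 0..N/2 (direct sum, fine for small N)."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N)))
            for k in range(N // 2 + 1)]

def hann(N):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def far_leakage_db(x, guard=4):
    """Largest magnitude more than `guard` bins from the peak, in dB re the peak."""
    mags = magnitude_spectrum(x)
    peak = max(range(len(mags)), key=mags.__getitem__)
    far = max(m for k, m in enumerate(mags) if abs(k - peak) > guard)
    return 20 * math.log10(max(far, 1e-12) / mags[peak])

N = 64
integer_cycles = [math.sin(2 * math.pi * 8 * n / N) for n in range(N)]
fractional = [math.sin(2 * math.pi * 8.5 * n / N) for n in range(N)]
windowed = [s * w for s, w in zip(fractional, hann(N))]

print("8 cycles, no window:   ", round(far_leakage_db(integer_cycles), 1), "dB")
print("8.5 cycles, no window: ", round(far_leakage_db(fractional), 1), "dB")
print("8.5 cycles, Hann:      ", round(far_leakage_db(windowed), 1), "dB")
```

With an integer number of cycles there is essentially no energy away from the peak. With 8.5 cycles the unwindowed spectrum spreads widely, and the Hann window pulls the distant bins down considerably.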
Each window function has a characteristic frequency response with a main lobe and side lobes:
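Main Lobe: The central peak of the response. Its width determines frequency resolution.
Side Lobes: The smaller peaks on either side of the main lobe. Their level determines how much spectral leakage remains.
Trade-off: Lower side lobes come at the cost of a wider main lobe.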
Rectangular (-13 dB side lobes) vs. Hann (-31 dB side lobes)
| Window | Main lobe width | Side lobe level | Use case |
|---|---|---|---|
| Rectangular (no window) | Narrowest (2 bins) | Highest (-13 dB) | Maximum frequency resolution, integer number of periods |
| Hann | Medium (4 bins) | -31 dB | General purpose, good balance of resolution and leakage |
| Hamming | Medium (4 bins) | -42 dB | Better side lobe suppression, 8-bit systems, telephony |
| Blackman-Harris | Widest (6 bins) | -92 dB (4-term) | High dynamic range, very low leakage critical applications |
Trade-off: Better side lobe suppression = wider main lobe = reduced frequency resolution.
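The main lobe width and side lobe level of a window can be estimated numerically. The following sketch evaluates the window's frequency response on a fine grid (16 points per bin, an arbitrary choice), walks down the main lobe to the first null, and reports the highest peak beyond it. It covers only the rectangular and periodic Hann windows. The function `lobe_measurements` is a hypothetical helper.

```python
import cmath
import math

def rectangular(N):
    return [1.0] * N

def hann(N):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def lobe_measurements(window, points_per_bin=16):
    """Return (main lobe width in bins, peak side lobe level in dB)."""
    N = len(window)
    mags = []
    for i in range((N // 2) * points_per_bin + 1):
        f = i / (points_per_bin * N)  # frequency in cycles per sample
        mags.append(abs(sum(w * cmath.exp(-2j * math.pi * f * n)
                            for n, w in enumerate(window))))
    # Walk down the main lobe until the response stops decreasing (first null).
    i = 0
    while i + 1 < len(mags) and mags[i + 1] < mags[i]:
        i += 1
    main_lobe_width = 2 * i / points_per_bin  # null-to-null, both sides
    side_lobe_db = 20 * math.log10(max(mags[i:]) / mags[0])
    return main_lobe_width, side_lobe_db

rect_width, rect_db = lobe_measurements(rectangular(32))
hann_width, hann_db = lobe_measurements(hann(32))
print("Rectangular:", rect_width, "bins,", round(rect_db, 1), "dB")
print("Hann:       ", hann_width, "bins,", round(hann_db, 1), "dB")
```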
The FFT size is the number of samples per FFT computation. It determines the time-frequency resolution trade-off:
Common sizes: 256, 512, 1024, 2048, 4096 (powers of 2)
| FFT Size | Frequency Resolution | Time Resolution |
|---|---|---|
| Small (256) | Poor (coarse bins) | Good (fast response) |
| Large (4096) | Good (fine bins) | Poor (slow response) |
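The trade-off in the table can be put into numbers. This sketch assumes a sampling rate of 44.1 kHz and takes the duration of one analysis frame as a rough measure of time resolution. The function name `resolution` is illustrative.

```python
def resolution(fft_size, sample_rate=44100):
    """Return (bin spacing in Hz, frame duration in milliseconds)."""
    delta_f = sample_rate / fft_size
    duration_ms = 1000 * fft_size / sample_rate
    return delta_f, duration_ms

for size in (256, 512, 1024, 2048, 4096):
    delta_f, duration_ms = resolution(size)
    print(f"{size:5d} samples: {delta_f:7.2f} Hz per bin, {duration_ms:6.2f} ms per frame")
```

Doubling the FFT size halves the bin spacing and doubles the length of audio that each spectrum summarizes.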
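A spectrogram is a time-varying visual representation of a signal's frequency content, typically with time on the horizontal axis, frequency on the vertical axis, and magnitude shown as color or brightness.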
Spectrograms reveal temporal evolution of spectral content. Helpful for analyzing speech, music, and environmental sounds.
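The data behind a spectrogram is a sequence of short-time spectra. The sketch below is a minimal short-time Fourier transform: overlapping frames, a Hann window, and a direct DFT per frame. The test signal (1000 Hz followed by 2000 Hz at an assumed 8 kHz sampling rate), the frame size of 64, and the hop of 32 are illustrative choices. Plotting is left out.

```python
import cmath
import math

def hann(N):
    """Periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]

def stft_magnitudes(signal, frame_size, hop):
    """One magnitude spectrum (bins 0..frame_size/2) per overlapping frame."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        mags = [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size)))
                for k in range(frame_size // 2 + 1)]
        frames.append(mags)
    return frames

sample_rate = 8000
frame_size = 64
# Test signal: 256 samples of 1000 Hz followed by 256 samples of 2000 Hz.
signal = ([math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
          + [math.sin(2 * math.pi * 2000 * n / sample_rate) for n in range(256)])

spec = stft_magnitudes(signal, frame_size, hop=32)
# Frequency of the strongest bin in each frame.
peak_hz = [max(range(len(f)), key=f.__getitem__) * sample_rate / frame_size
           for f in spec]
print(peak_hz)
```

Each row of `spec` corresponds to one column of a spectrogram image. The list of peak frequencies shows the change of pitch over time.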
Original content: © 2025 Lorenz Schwarz
Licensed under CC BY 4.0. Attribution required for all reuse.
Includes: text, diagrams, illustrations, photos, videos, and audio.
Third-party materials: Copyright respective owners, educational use.
Contact: lschwarz@hfg-karlsruhe.de
The FFT of a real signal produces mirrored results above the Nyquist frequency
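The mirroring can be verified numerically: for a real input of length N, bin N-k is the complex conjugate of bin k, so the magnitudes above the Nyquist bin repeat those below it. This is a small sketch using a direct DFT and an arbitrary test signal.

```python
import cmath
import math

def dft(x):
    """Direct DFT of a sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 16
x = [math.sin(2 * math.pi * 3 * n / N) + 0.5 * math.cos(2 * math.pi * 5 * n / N)
     for n in range(N)]
X = dft(x)

# Bin N-k mirrors bin k as its complex conjugate.
mirrored = all(abs(X[N - k] - X[k].conjugate()) < 1e-9 for k in range(1, N))
print("conjugate symmetry holds:", mirrored)
print([round(abs(v), 3) for v in X])
```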