Methodology — How Our Pitch Detector Works
This page provides a technical overview of the algorithms and processes our pitch detector uses. We prioritize a privacy-first approach, meaning all audio analysis happens directly in your browser. No audio is ever recorded or uploaded to our servers. Our methodology is designed to balance real-time responsiveness with the stability and precision needed for effective musical practice. We use a combination of proven signal processing techniques to deliver accurate, reliable results you can trust.
Updated on {Month Day, Year}
Signal chain
From microphone to display, your audio is processed through the following pipeline:
- Input Capture: The browser’s Web Audio API captures microphone input as a raw Float32 audio stream (a capture sketch follows this list).
- Framing: The continuous audio stream is segmented into small, overlapping frames for analysis.
- Windowing: Each frame is multiplied by a Hann window function to reduce spectral leakage.
- Autocorrelation: The core algorithm computes the autocorrelation function (ACF) to find the most likely pitch period.
- Parabolic Peak Interpolation: The peak of the ACF is refined with a parabolic fit for sub-bin precision.
- Post-smoothing: A median filter is applied to a buffer of recent pitch estimates to reduce jitter.
- Note Mapping: The final frequency is converted to the nearest musical note based on the selected A4 reference.
- Cents Calculation: The precise deviation from the target note is calculated in cents and displayed.
This entire chain operates under a monophonic assumption, analyzing one note at a time.
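For orientation, here is a minimal sketch of the first two stages (capture and framing) using the standard Web Audio API. The function names, callback structure, and use of an AnalyserNode are illustrative assumptions rather than the tool’s exact implementation, and this sketch does not reproduce the 75% frame overlap described below.

```ts
// Minimal capture-and-framing sketch (illustrative only, not the production code).
const FRAME_SIZE = 2048; // analysis frame length in samples

async function startCapture(onFrame: (frame: Float32Array, sampleRate: number) => void) {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const ctx = new AudioContext();
  const source = ctx.createMediaStreamSource(stream);

  // An AnalyserNode exposes the most recent time-domain samples as Float32 data.
  const analyser = ctx.createAnalyser();
  analyser.fftSize = FRAME_SIZE;
  source.connect(analyser);

  const frame = new Float32Array(FRAME_SIZE);
  const tick = () => {
    analyser.getFloatTimeDomainData(frame); // copy the latest samples into the frame
    onFrame(frame, ctx.sampleRate);         // hand the frame to the pitch analysis
    requestAnimationFrame(tick);            // poll roughly once per display refresh
  };
  tick();
}
```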
Core algorithm
The heart of our tool is a time-domain algorithm using **autocorrelation** for its robustness, refined with **parabolic peak interpolation** for precision.
Autocorrelation
We compute the correlation of a signal frame with itself at different time lags. The lag that produces the highest correlation corresponds to the fundamental period of the signal. We search a lag range equivalent to common musical frequencies (approx. 55-1000 Hz) to find the strongest candidate peak, which is effective at rejecting overtones.
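To make the search step concrete, the sketch below scans the lag range corresponding to roughly 55–1000 Hz and returns the lag with the strongest correlation. The function and variable names are illustrative, not the tool’s actual identifiers.

```ts
// Illustrative autocorrelation peak search (sketch, not the production implementation).
// Returns the lag in samples with the strongest correlation, or -1 if none is found.
function findBestLag(frame: Float32Array, sampleRate: number): number {
  const minLag = Math.floor(sampleRate / 1000); // upper frequency bound ≈ 1000 Hz
  const maxLag = Math.ceil(sampleRate / 55);    // lower frequency bound ≈ 55 Hz
  let bestLag = -1;
  let bestCorr = 0;

  for (let lag = minLag; lag <= maxLag && lag < frame.length; lag++) {
    let corr = 0;
    for (let i = 0; i < frame.length - lag; i++) {
      corr += frame[i] * frame[i + lag]; // correlation of the frame with itself at this lag
    }
    if (corr > bestCorr) {
      bestCorr = corr;
      bestLag = lag;
    }
  }
  return bestLag;
}
```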
Peak Refinement
The resolution of the raw autocorrelation is limited by the sample rate. To achieve sub-sample accuracy, we fit a parabola to the three points around the detected maximum peak. The true maximum of this parabola gives us a highly precise, fractional lag value, which translates into a more accurate final frequency measurement.
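The refinement can be sketched as the standard three-point parabolic fit; the helper name below is hypothetical, but the arithmetic is the usual vertex formula.

```ts
// Three-point parabolic interpolation around an autocorrelation peak (standard technique).
// Assumes acf[peakIndex] is a local maximum with valid neighbors on both sides.
function refinePeak(acf: Float32Array, peakIndex: number): number {
  const a = acf[peakIndex - 1];
  const b = acf[peakIndex];
  const c = acf[peakIndex + 1];
  const denom = a - 2 * b + c;
  if (denom === 0) return peakIndex;    // flat top: nothing to refine
  const offset = 0.5 * (a - c) / denom; // fractional shift, bounded by ±0.5
  return peakIndex + offset;            // fractional lag in samples
}

// The refined lag converts directly to frequency: frequency = sampleRate / refinedLag
```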
We use a frame size of 2048 samples with a 75% overlap (hop size of 512). This provides a good balance between frequency resolution for low notes and the low-latency responsiveness needed for real-time feedback.
From frequency to musical note
Once a stable frequency is detected, it is converted into a musical note and cents deviation using the 12-tone equal temperament (12-TET) system, relative to the A4 reference pitch. The cents value is calculated with the following formula:
`cents = 1200 * log2(f_measured / f_reference_note)`
A cents value of 0 means the note is exactly “in tune.” The visual meter shows a range of ±50 cents, which marks the halfway point to the neighboring semitones.
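A hedged sketch of this mapping, using MIDI note numbers as an intermediate step (the helper names are assumptions):

```ts
const NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"];

// Map a measured frequency to the nearest 12-TET note and its cents deviation.
function mapFrequencyToNote(fMeasured: number, a4Ref = 440) {
  const midi = Math.round(69 + 12 * Math.log2(fMeasured / a4Ref)); // A4 is MIDI 69
  const fTarget = a4Ref * Math.pow(2, (midi - 69) / 12);           // nearest note's frequency
  const cents = 1200 * Math.log2(fMeasured / fTarget);             // deviation in cents
  const name = NOTE_NAMES[((midi % 12) + 12) % 12];
  const octave = Math.floor(midi / 12) - 1;                        // MIDI 60 maps to C4
  return { note: `${name}${octave}`, cents };
}

// Example: mapFrequencyToNote(446) → A4 at roughly +23 cents sharp.
```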
Calibration
The standard for modern tuning is A4 = 440 Hz, which is our tool’s default setting. However, different musical contexts may require alternate calibrations. We provide an option to switch the A4 reference to 442 Hz, a common standard for many European orchestras. Changing the reference pitch shifts the frequency targets for all other notes accordingly, so you can tune to the standard used by your ensemble (a short worked example follows the list below).
- A4 = 440 Hz: The international standard; use for most solo practice, rock, pop, and jazz.
- A4 = 442 Hz: Common in European orchestras and some choirs for a brighter sound.
- Always confirm the reference pitch used by your group, conductor, or accompanist.
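As a short worked example of how the reference shifts every target (the helper name is hypothetical):

```ts
// Target frequency of any 12-TET note for a given A4 reference (sketch only).
function noteTargetHz(midi: number, a4Ref: number): number {
  return a4Ref * Math.pow(2, (midi - 69) / 12); // A4 is MIDI 69
}

noteTargetHz(64, 440); // E4 ≈ 329.63 Hz with A4 = 440 Hz
noteTargetHz(64, 442); // E4 ≈ 331.13 Hz with A4 = 442 Hz; every target scales proportionally
```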
Stability controls
To prevent the display from flickering, we apply several layers of filtering and control (a sketch of the smoothing stage follows this list):
- Median Smoothing: We analyze a small window of the most recent pitch readings and select the median value, which effectively discards momentary spurious jumps.
- Noise Gate: A volume threshold is applied to the input. If the signal is too quiet (below the gate), the algorithm pauses to avoid analyzing background noise.
- Outlier Rejection: If a new pitch reading is drastically different from the previous stable reading, it is temporarily ignored, preventing large, unnatural jumps.
- Minimum Note Duration: A new note is only displayed after it has been detected consistently for a few consecutive frames, providing a “debounce” effect.
- Frame Overlap: Using a 75% overlap between analysis frames ensures a smoother transition between readings, enhancing responsiveness without sacrificing accuracy.
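The smoothing stage can be illustrated with a small median filter over the most recent readings; the class shape and buffer length below are assumptions matching the defaults in the table further down.

```ts
// Median filter over the last N pitch readings (illustrative sketch).
class MedianFilter {
  private buffer: number[] = [];
  constructor(private size = 5) {}

  // Push a new reading and return the median of the retained window.
  add(value: number): number {
    this.buffer.push(value);
    if (this.buffer.length > this.size) this.buffer.shift(); // keep only the newest N
    const sorted = [...this.buffer].sort((a, b) => a - b);
    return sorted[Math.floor(sorted.length / 2)];
  }
}
```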
Sample-rate handling & latency
Different devices and browsers can provide audio at various sample rates (e.g., 44.1 kHz or 48 kHz). Our algorithm is sample-rate aware, normalizing its lag-to-frequency calculations so that the detected pitch is consistent regardless of the hardware source. This prevents drift and keeps A4 mapped to the same frequency on every device, providing a reliable experience across desktops and mobile devices.
We aim for the lowest possible latency to provide immediate feedback. The total delay—from audio capture to UI update—is typically under 100 milliseconds. This is a trade-off between responsiveness and stability. A smaller analysis frame would be faster but less accurate for low notes. Our chosen parameters provide a professional balance suitable for most musical applications.
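As a worked example with the default parameters at a 48 kHz sample rate: a 2048-sample frame spans about 42.7 ms of audio, and a 512-sample hop produces a new estimate roughly every 10.7 ms, which leaves comfortable headroom under the 100 ms target once analysis and rendering time are included.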
Default parameters
| Parameter | Default | Notes |
|---|---|---|
| A4 Reference | 440 Hz | User can select 442 Hz. |
| Frame Size | 2048 samples | Balances low-frequency accuracy and latency. |
| Hop Size | 512 samples | 75% overlap for smooth updates. |
| Window Function | Hann | Reduces spectral leakage artifacts. |
| Frequency Range | ~55 Hz to ~1000 Hz | Roughly A1 to B5, a common melodic range. |
| Smoothing Window | 5 frames | Median filter to reject outliers. |
| Noise Gate Level | -50 dBFS (approx) | Ignores quiet background noise. |
| Display Range | ±50 cents | Standard visual range for tuners. |
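For readers following along in code, the table maps naturally onto a single configuration object; the field names below are illustrative, not the tool’s actual identifiers.

```ts
// Illustrative defaults mirroring the table above (names are assumptions).
const DEFAULT_CONFIG = {
  a4ReferenceHz: 440,    // user-selectable: 440 or 442
  frameSize: 2048,       // samples per analysis frame
  hopSize: 512,          // 75% overlap between frames
  window: "hann",
  minFrequencyHz: 55,    // roughly A1
  maxFrequencyHz: 1000,
  smoothingFrames: 5,    // median filter length
  noiseGateDbfs: -50,    // approximate gate threshold
  displayRangeCents: 50, // meter shows ±50 cents
} as const;
```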
Reference pseudocode
The core logic can be summarized by the following high-level process:
```
function processAudio(audioBuffer):
    // 1. Pre-condition and check volume
    rms = calculateRMS(audioBuffer)
    if rms < NOISE_GATE_THRESHOLD:
        return

    // 2. Apply window function
    windowedBuffer = applyHannWindow(audioBuffer)

    // 3. Find best pitch candidate with autocorrelation
    acf = calculateAutocorrelation(windowedBuffer)
    peakIndex = findPeak(acf, min_lag, max_lag)

    // 4. Refine the peak for precision
    refinedLag = parabolicInterpolation(acf, peakIndex)
    frequency = sampleRate / refinedLag

    // 5. Smooth the result
    stableFrequency = medianFilter.add(frequency)

    // 6. Map to note and render
    note, cents = mapFrequencyToNote(stableFrequency, A4_REFERENCE)
    updateUI(note, cents, stableFrequency)
```
Privacy & local processing
We are committed to user privacy. The Pitch Detector performs all audio analysis directly in your web browser on your computer or mobile device. No microphone data is ever recorded, stored, or sent to our servers. This local processing model not only guarantees your privacy but also provides the lowest possible latency for true real-time feedback. We do not use cookies for tracking and only collect anonymous interaction data for site improvement.
Read our Privacy Policy →
Limitations
Every pitch detection algorithm has limitations. For best results, be aware of the following:
- Monophonic Only: The tool can only analyze one note at a time and cannot detect notes within chords.
- Background Noise: High levels of ambient noise can interfere with the algorithm and cause unstable readings.
- Reverberation: Rooms with heavy echo or reverb can smear the audio signal, reducing accuracy.
- Clipping/Distortion: A signal that is too loud and clips the microphone input will produce incorrect results.
- Instrument Transients: The percussive attack of some instruments (like piano or guitar) can be difficult to analyze until the note sustains.
- Extreme Vibrato: Very rapid or wide vibrato may cause the reading to fluctuate between semitones.
Accessibility notes
We strive to make our tools accessible to everyone. Key accessibility features include:
- ARIA Live Regions: Status messages use `aria-live="polite"` to announce updates to screen reader users.
- Keyboard Navigation: All interactive controls, including buttons and selectors, are fully operable via keyboard.
- Sufficient Contrast: Text and UI elements are designed to meet WCAG AA contrast ratio requirements.
- Clear Labeling: All controls have descriptive labels for screen readers.
- No Color-Only Feedback: Information is conveyed through text and position, not just color alone.
Methodology FAQ
Why autocorrelation instead of FFT-only?
While FFT (Fast Fourier Transform) is excellent for viewing a signal's frequency spectrum, it can be prone to "octave errors," where it mistakenly identifies a strong overtone as the fundamental pitch. Autocorrelation is a time-domain method that is generally more robust at finding the fundamental frequency, making it more reliable for this application.
How do you handle vibrato or pitch drift?
The tool's responsiveness allows it to track slow-to-moderate vibrato and pitch drift in real time. You will see the cents meter move smoothly with the pitch. The post-processing and smoothing filters are tuned to follow these intentional pitch modulations without becoming overly sensitive and flickering, providing a stable yet accurate representation.
Will calibration to A4=442 change accuracy?
No, the underlying accuracy of the algorithm remains the same. Changing the calibration simply shifts the target frequencies for all notes. The precision of the cents deviation calculation is unaffected. Our internal accuracy tests confirm this stability regardless of the selected A4 reference, ensuring reliable performance for any standard.
Can it detect chords or multiple notes?
No. This is a monophonic pitch detector, meaning its algorithm is designed to find a single fundamental frequency at a time. When presented with a chord (a polyphonic signal), it will likely become unstable or lock onto the loudest or lowest note, but it cannot separate and identify the individual notes within it.
Changelog
We use semantic versioning (e.g., v1.2.0). The current tool is v1.1.0.
- v1.1.0 (Current) - Added A4 calibration toggle (440/442 Hz). Improved UI for sensitivity controls.
- v1.0.5 - Tweaked median smoothing window for better stability on vocal input.
- v1.0.0 - Initial public release of the pitch detector tool.
