Computer Vision to Sound in the Browser
Web-based solutions are great when we need to run projects on many different devices without installing or configuring anything. They are accessible and easy to distribute. This exercise demonstrates how computer vision data can be mapped directly to sound synthesis in the browser. The system runs entirely in JavaScript, using:
MediaPipe Hands for real-time hand tracking via webcam
WebAudio for browser-based audio synthesis
No installation is required. Everything runs locally in the browser once camera access is granted:
Browser Notes (if the example does not work on your device)
Works best in Chrome / Chromium (desktop).
Serve the page over HTTPS or from localhost, and click Start once to unlock audio.
- Chrome / Edge (desktop) – full WebAudio + camera support; best iframe stability.
- Firefox – reliable; may need an extra click in iframes.
- Safari 16+ – HTTPS only; user gesture required; limited iframe camera access.
- Mobile (Android / iOS) – slower tracking, rotation quirks, stricter permissions.
- Older (< 2022) – missing WebAudio or MediaPipe WASM.
- Privacy browsers (Brave / Vivaldi) – may silently block camera or WASM threads.
- Incognito modes – often prevent persistent camera access.
Tip: Refresh after granting permission, use stable Wi-Fi, and avoid nested iframes lacking allow="camera; microphone; autoplay".
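The requirements in these notes can also be checked from within the page. The following is a minimal sketch, not part of the original example: `missingFeatures` is a hypothetical helper name, and it only tests for the presence of the standard browser features (secure context, `getUserMedia`, WebAudio, WebAssembly). It cannot detect a browser that silently blocks the camera after the feature check has passed.

```javascript
// Hypothetical helper (not part of the original example): returns a list of
// browser features this kind of page depends on that appear to be missing.
// `env` defaults to the global object so that the function can also be
// called with a mock object outside the browser.
function missingFeatures(env = globalThis) {
  const missing = [];
  // Camera access is only exposed in secure contexts (HTTPS or localhost).
  if (!env.isSecureContext) missing.push("secure context");
  // navigator.mediaDevices.getUserMedia is the standard webcam API.
  if (!env.navigator?.mediaDevices?.getUserMedia) {
    missing.push("camera (getUserMedia)");
  }
  // Older Safari versions expose WebAudio as webkitAudioContext.
  if (!(env.AudioContext || env.webkitAudioContext)) missing.push("WebAudio");
  // The hand-tracking model runs as WebAssembly.
  if (typeof env.WebAssembly !== "object") missing.push("WebAssembly");
  return missing;
}
```

Called without arguments in the browser console, `missingFeatures()` returns an empty array when all four features are available, and otherwise names the ones to investigate.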
The Details
We can download and run the code above on our local machine. This allows us to edit and adjust it as needed:
Save this HTML file by right-clicking it and selecting "Save as".
- Open the downloaded file in a browser to make sure it runs smoothly on your machine.
An internet connection is required to run the file, since it needs to load MediaPipe.
Open the HTML file in your editor of choice to study and edit it.
Signal Flow
The program can be divided into three conceptual blocks:
Vision tracking - The webcam feed is analyzed by the MediaPipe Hands model.
Mapping - The values extracted from the tracking are scaled through the function mapXY().
Sound synthesis - A WebAudio synthesizer is controlled with the scaled values.
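To make the mapping block more concrete, here is a minimal sketch of what such a scaling function might look like. It is not the `mapXY()` from the downloadable file, whose signature and ranges may differ. The name `mapXYSketch`, the frequency range of 110 to 880 Hz, and the maximum gain of 0.5 are assumptions chosen for illustration. The sketch assumes normalized hand coordinates in the range 0 to 1, with y = 0 at the top of the image, which is how MediaPipe Hands reports landmark positions.

```javascript
// Hypothetical stand-in for the page's mapXY(); the real function may differ.

// Limit a value to the range [0, 1] so out-of-frame hands stay in range.
function clamp01(v) {
  return Math.min(1, Math.max(0, v));
}

// x, y: normalized hand coordinates in [0, 1], y = 0 at the top of the image.
// Returns a frequency in Hz and a linear gain for a synthesizer.
function mapXYSketch(x, y, { fMin = 110, fMax = 880, gainMax = 0.5 } = {}) {
  const nx = clamp01(x);
  const ny = clamp01(y);
  // Exponential interpolation: equal hand movements give equal pitch intervals.
  const frequency = fMin * Math.pow(fMax / fMin, nx);
  // Invert y so that raising the hand makes the sound louder.
  const gain = gainMax * (1 - ny);
  return { frequency, gain };
}
```

In a WebAudio setup, the returned values could then be applied to an oscillator and a gain node, for example with `osc.frequency.setTargetAtTime(frequency, ctx.currentTime, 0.05)` and `amp.gain.setTargetAtTime(gain, ctx.currentTime, 0.05)`, where `osc`, `amp`, and `ctx` are hypothetical names for an `OscillatorNode`, a `GainNode`, and the `AudioContext`. The short time constant smooths the jumps between tracking frames.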
