IEM Control with Mediapipe

Assignment: Camera-based Tracking with OSC and IEM

The goal of this assignment is to connect camera-based tracking in Python to the IEM Plug-in Suite inside REAPER, so that a spatial audio scene can be controlled through movement and gestures. At the core is a Python script that tracks the head or hands and sends their normalized coordinates as OSC messages. These messages are used to control the IEM plug-ins. The signal flow looks as follows:

/images/spatial/mediapipe_nopd.png

Signal flow without Pd.

Alternatively, the connection can be realized with Pure Data (Pd) as an additional bridge: Pd receives the tracking data via OSC from Python, parses the messages, and forwards them to the IEM plug-ins, extending the remote control patch from the previous section. This additional step can make parameter mapping easier during development:

/images/spatial/mediapipe_pd.png

Signal flow with Pd.


Provided Material

  1. IEM Remote Control Example in Pd
  2. OSC Parser Patch (Pd)

    • Receives OSC on port 9000

    • Parses the incoming messages from Python

    • Extracts /head/pos or /hand/.../pos messages

    • Forwards the coordinates (X, Y) internally in Pd

/images/spatial/osc_parse.png

OSC parser patch in Pd.
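
To check the parser patch without a camera, a small stand-in sender can help. The following is a minimal sketch, assuming Pd runs on the same machine (127.0.0.1) and the patch listens on port 9000 as listed above. The helper names circle_path and send_test_path are hypothetical and not part of the provided material.

```python
import math
import time


def circle_path(steps=50, radius=0.4):
    """Return `steps` normalized (x, y) points on a circle around (0.5, 0.5)."""
    points = []
    for i in range(steps):
        angle = 2.0 * math.pi * i / steps
        x = 0.5 + radius * math.cos(angle)
        y = 0.5 + radius * math.sin(angle)
        points.append((x, y))
    return points


def send_test_path(host="127.0.0.1", port=9000, address="/head/pos"):
    """Send the circle once to the parser patch (host is an assumption)."""
    # Imported here so circle_path works without python-osc installed.
    from pythonosc.udp_client import SimpleUDPClient

    client = SimpleUDPClient(host, port)
    for x, y in circle_path():
        client.send_message(address, [x, y])  # two float arguments
        time.sleep(0.05)  # roughly 20 messages per second
```

Calling send_test_path() while the patch is open should make the extracted X and Y values in Pd change continuously. If nothing arrives, port and address are the first things to check.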


Task

The main task is to design and implement the Python part:

  • Use a webcam as input.

  • Detect head or hand(s) in real time. Use MediaPipe and OpenCV for hand or face/head detection.

  • Normalize the coordinates to the range [0, 1].

  • Send OSC messages from the Python script using python-osc. Possible OSC messages are:

/head/pos [x, y]
/hand/left/pos [x, y]
/hand/right/pos [x, y]



  • X and Y should be relative to the camera image, with the top-left corner at (0, 0).

  • (Optional) Include extra information such as bounding box or visibility.
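
The normalization and the message layout can be sketched as follows. This is one possible approach, assuming the tracker delivers pixel coordinates in the OpenCV image convention (origin at the top-left corner). The function names normalize_point and make_message are hypothetical.

```python
def normalize_point(px, py, width, height):
    """Map pixel coordinates to [0, 1], with (0, 0) at the top-left corner."""
    # Values outside the image are clamped to the valid range.
    x = min(max(px / width, 0.0), 1.0)
    y = min(max(py / height, 0.0), 1.0)
    return x, y


def make_message(target, px, py, width, height):
    """Return (address, [x, y]) for 'head', 'hand/left' or 'hand/right'."""
    x, y = normalize_point(px, py, width, height)
    return f"/{target}/pos", [x, y]
```

With a SimpleUDPClient from python-osc, such a pair can be passed on directly, for example client.send_message(*make_message("head", 320, 120, 640, 480)). If the tracker already returns normalized values, only the clamping step is needed.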

Hints

  • Use (at least) the following Python modules:

import cv2
import numpy
import mediapipe
from pythonosc.udp_client import SimpleUDPClient

  • Start from a minimal working prototype:

    • Open camera → get positions → print them.

    • Then add OSC transmission.

    • Finally, connect to Pd and the IEM plug-ins.

  • Test your messages in Pd using the provided parser patch before going into REAPER/IEM.
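
The first prototype step (open camera, get positions, print them) could look like the sketch below. It assumes the legacy mediapipe.solutions.hands interface is available in the installed MediaPipe version and uses the wrist landmark as the hand position, which is only one of several reasonable choices. The names clamp01 and run are hypothetical.

```python
def clamp01(value):
    """Limit a value to [0, 1]; landmarks can leave this range near the border."""
    return min(max(value, 0.0), 1.0)


def run(camera_index=0, max_frames=300):
    """Print normalized wrist positions for up to `max_frames` camera frames."""
    # Third-party imports are local so clamp01 can be used on its own.
    import cv2
    import mediapipe as mp

    cap = cv2.VideoCapture(camera_index)
    # Assumption: the legacy `mp.solutions.hands` API is present.
    with mp.solutions.hands.Hands(max_num_hands=2) as hands:
        for _ in range(max_frames):
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB, OpenCV delivers BGR.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            results = hands.process(rgb)
            if not results.multi_hand_landmarks:
                continue
            for landmarks, handed in zip(results.multi_hand_landmarks,
                                         results.multi_handedness):
                wrist = landmarks.landmark[0]  # landmark 0 is the wrist
                label = handed.classification[0].label.lower()
                print(f"/hand/{label}/pos", clamp01(wrist.x), clamp01(wrist.y))
    cap.release()
```

Calling run() starts the loop. MediaPipe landmark coordinates are already normalized by image width and height, so no further scaling is needed here. The left/right label assumes a mirrored (selfie) image, so it may appear swapped unless the frame is flipped first. Replacing the print call with client.send_message is the step towards OSC transmission.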


Advanced Mode

Advanced students can use any audio software (Pd, SuperCollider, Max, ...) and arbitrary tracking solutions to control spatial audio with gestures and movement. YOLO is an alternative for detecting multiple objects and persons: https://docs.ultralytics.com/
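
As a starting point for the YOLO route, the following sketch turns person detections into normalized center positions. It assumes the ultralytics package and a COCO-pretrained model object created beforehand, for example with YOLO("yolov8n.pt"); the weight file name is only an example. The helper names box_center and person_centers are hypothetical.

```python
def box_center(x1, y1, x2, y2):
    """Center of a box given by two corners (already normalized to [0, 1])."""
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def person_centers(frame, model):
    """Return normalized centers of all class-0 detections in one frame."""
    # `model` is assumed to be an ultralytics YOLO instance.
    result = model(frame, verbose=False)[0]
    centers = []
    boxes = result.boxes.xyxyn.tolist()  # corners normalized by image size
    classes = result.boxes.cls.tolist()
    for box, cls in zip(boxes, classes):
        if int(cls) == 0:  # class 0 is "person" in COCO-trained models
            centers.append(box_center(*box))
    return centers
```

Each returned pair can be sent as an OSC message in the same way as the head and hand positions. Address names for multiple persons are left open by the assignment and would need to be defined.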