Noncontact Gesture Sensing and Responsive Environments
In recent years, more musical devices have been explored that exploit noncontact sensing, responding to the position and motion of hands, feet, and bodies without requiring any kind of controller to be grasped or worn. Although these interfaces are seldom played with as much precision as tactile controllers such as keyboards, very complicated audio events can be launched and controlled through various modes of body motion when a computer interprets the data and applies an interesting sonic mapping. These systems are often used in musical performances that have a component of dance and choreography, or in public interactive installations.
The Russian physicist, cellist, and inventor Lev Sergeivitch Termen (Leon Theremin) developed the famous instrument named after him in 1920. He arrived in New York in 1927 and spent a very productive decade there before returning to the USSR (his story is a fascinating one). The theremin was a musical instrument with a radically new free-gesture interface that foreshadowed the revolution that electronics would perpetrate in the world of musical instrument design. It used capacitive sensing to measure the proximity of each hand above a corresponding antenna. One hand controlled the pitch of a monophonic waveform while the other hand controlled amplitude. The theremin was a worldwide sensation in the 1920s and 1930s. RCA commercially manufactured these instruments, and several virtuoso performers emerged, the most famous being Clara Rockmore. During his stay in the USA, Theremin invented several variations of this instrument, including the large terpsitone, which responded to the position of dancers’ bodies. Robert Moog began his electronic music career in the 1950s by building theremins, which had by then descended into more of a cult status, well away from the musical mainstream. Theremins are once again attaining some notoriety, and Moog (through his present company, Big Briar) and others are again producing them.
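As a rough illustration of how capacitive sensing can turn hand proximity into pitch, the sketch below assumes the heterodyne arrangement commonly used in theremins: a fixed radio-frequency oscillator and a hand-detuned one, whose difference frequency is heard as the note. The component values and function names are hypothetical, chosen only to keep the arithmetic readable, and do not describe any particular instrument.

```python
import math


def lc_frequency_hz(inductance_h: float, capacitance_f: float) -> float:
    """Resonant frequency of an ideal LC oscillator: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


def beat_pitch_hz(hand_capacitance_f: float,
                  inductance_h: float = 1e-3,        # assumed tank inductance
                  tank_capacitance_f: float = 1e-9   # assumed tank capacitance
                  ) -> float:
    """Audible difference frequency between a fixed and a hand-detuned oscillator.

    The hand near the antenna is modeled as a small capacitance added in
    parallel with the tank, which lowers the variable oscillator's frequency.
    """
    fixed = lc_frequency_hz(inductance_h, tank_capacitance_f)
    variable = lc_frequency_hz(inductance_h, tank_capacitance_f + hand_capacitance_f)
    return fixed - variable


if __name__ == "__main__":
    # A closer hand means more added capacitance and a higher beat pitch.
    for picofarads in (0.5, 1.0, 2.0):
        print(f"{picofarads} pF -> {beat_pitch_hz(picofarads * 1e-12):.1f} Hz")
```

With these assumed values, a change of about one picofarad shifts the pitch by tens of hertz, which suggests why such small hand movements are audible.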
At the MIT Media Lab, we have developed many musical interfaces that generalize capacitive techniques, such as those used in the theremin, into what we call “Electric Field Sensing”. These include the Sensor Chair (originally designed for the magicians Penn and Teller; tracks hands and feet of a seated participant), the Gesture Wall (developed for the Brain Opera; tracks body motion in front of a video projection), the Sensor Frames (open frame that tracks hand position), and the Sensor Mannequin (self-explanatory; designed for a collaboration with The Artist Formerly Known as Prince).
Several research labs and commercial products have exploited many other sensing mechanisms for noncontact detection of musical gesture. Some are based on ultrasound reflection sonars, such as the EMS Soundbeam and the “Sound=Space” dance installation by longtime Stockhausen collaborator Rolf Gehlhaar. These generally use inexpensive transducers similar to the Polaroid 50 kHz electrostatic heads developed for auto-focus cameras, and are able to range out to distances approaching 35 feet. While these sonars can satisfy many interactive applications, they can exhibit problems with extraneous noise, clothing-dependent reflections, and speed of response (especially in a multi-sensor system); thus, their operating environment and artistic goals must be carefully constrained, or more complicated devices must be designed.
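The speed-of-response limit follows from simple time-of-flight arithmetic, sketched below. The speed of sound (343 m/s, roughly room-temperature air), the assumption that sensors in a multi-sensor system ping one at a time to avoid crosstalk, and the function names are all illustrative assumptions rather than details of any of the products named above.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees Celsius
FEET_TO_METERS = 0.3048


def echo_delay_s(range_m: float) -> float:
    """Round-trip time for an ultrasonic ping to reach a target and return."""
    return 2.0 * range_m / SPEED_OF_SOUND_M_S


def max_update_rate_hz(range_m: float, sensors: int = 1) -> float:
    """Upper bound on each sensor's update rate if sensors take turns pinging."""
    return 1.0 / (echo_delay_s(range_m) * sensors)


if __name__ == "__main__":
    max_range_m = 35 * FEET_TO_METERS  # the 35-foot range quoted in the text
    print(f"echo delay: {echo_delay_s(max_range_m) * 1000:.1f} ms")
    for count in (1, 4):
        rate = max_update_rate_hz(max_range_m, count)
        print(f"{count} sensor(s): at most {rate:.1f} updates per second each")
```

Under these assumptions, a single sonar waiting for an echo from its full range can update only about 16 times per second, and four sensors sharing the air space drop to about 4 updates per second each.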
Infrared proximity sensors, most merely responding to the amplitude of the reflected illumination, are being used in many modern musical applications. Examples of this are found in the many musical installations designed by interactive artist Chris Janney, such as his classic SoundStair, which triggers musical notes as people walk up and down a stairway, obscuring or reflecting IR beams directed above the stair surfaces. Commercial musical interface products have appeared along these lines, such as the Dimension Beam from Interactive Light (providing a MIDI output indicating the distance from the IR sensor to the reflecting hand), and the simpler Synth-A-Beams, OptiMusic, and LaserHarp family of MIDI controllers, which produce a corresponding MIDI event whenever any of several visible light beams is interrupted. One of the most expressive devices in this class is the “Twin Towers”, developed by Leonello Tarabella and Graziano Bertini at the CNUCE in Pisa, which responds to hand inclination as well as proximity.
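To make the two styles of output concrete, the sketch below builds the standard MIDI messages such devices could emit: a control change carrying a continuous distance value, and a note-on fired when a beam is interrupted. The distance scaling, the 60 cm working range, and the function names are hypothetical; only the three-byte message layouts come from the MIDI standard.

```python
def distance_to_cc(distance_cm: float, max_cm: float = 60.0) -> int:
    """Map a hand distance to a 7-bit controller value (nearer hand = larger value).

    The linear mapping and the 60 cm range are assumptions for illustration.
    """
    clamped = min(max(distance_cm, 0.0), max_cm)
    return round(127 * (1.0 - clamped / max_cm))


def control_change(channel: int, controller: int, value: int) -> bytes:
    """MIDI control change: status 0xB0 plus channel, then controller and value."""
    return bytes([0xB0 | (channel & 0x0F), controller & 0x7F, value & 0x7F])


def note_on(channel: int, note: int, velocity: int = 100) -> bytes:
    """MIDI note-on: status 0x90 plus channel, then note number and velocity."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


if __name__ == "__main__":
    # Continuous control from a proximity reading (controller 1 = mod wheel).
    print(control_change(0, 1, distance_to_cc(15.0)).hex(" "))
    # Discrete trigger from an interrupted beam (note 60 = middle C).
    print(note_on(0, 60).hex(" "))
```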
Other noncontact optical tracking devices have been built, such as the Videoharp, introduced in 1990 by Dean Rubine and Paul McAvinney at Carnegie-Mellon. This is a flat, hollow, rectangular frame, which senses the presence and position of fingers inside its boundary. At the Media Lab, we have built a much larger sensitive plane, using an inexpensive scanning laser rangefinder that is able to resolve and track bare hands crossing the scanned plane within a several-meter sensitive radius. We have used this device for multimedia installations, where performers fire and control musical events by moving their hands across the scanned areas above a projection screen.
Although they involve considerably more processor overhead and are generally still affected by lighting changes and clutter, computer vision techniques are becoming increasingly common in noncontact musical interfaces and installations. For over a decade now, many researchers have been designing vision systems for musical performance, and steady increases in available processing capability have continued to improve their reliability and speed of response, while enabling recognition of more specific and detailed features. As the cost of the required computing equipment drops, vision systems become price-competitive, as their only “sensor” is a commercial video camera.
Interactive digital video installations have been introduced with increasing complexity ever since the American artist and computer scientist Myron Krueger’s groundbreaking Videoplace installations began in 1976. Others that followed in the subsequent decade were promptly used for interactive music and dance. These included the Mandala Silhouette Tracker by the Vivid Group in Toronto, the Oculus Ranae by Collinge and Parkinson at the University of Victoria, the 3DIS system by Simon Veitch in Melbourne, Australia, and the Very Nervous System (VNS) by David Rokeby in Toronto. Rokeby continues improving his device, which is still extensively used by many artists, including Todd Winkler at Brown University, who employs the VNS for interactive dance. Modern vision systems tend to be all software, doing real-time vision operations in a single desktop computer. These include “BigEye”, a Macintosh package written for interactive musicians by Tom DeMeyer and his colleagues at STEIM Amsterdam, and Pfinder, developed by Chris Wren and colleagues at the MIT Media Lab’s Perceptual Computing Group. Pfinder goes beyond most other systems, which track only motion or activity in specific zones; it segments the human body into discrete pieces, and tracks the hands, feet, head, torso, etc. separately, giving computer applications access to gestural details. Flavia Sparacino has used Pfinder as a music controller for interactive dancers, attaching specific musical events to the motion of their various limbs and body positions. Using her Dancespace, one essentially plays a piece of music and generates accompanying graphics by freely moving through the camera’s field of view. Such techniques allow a dancer to be freed from the constraints of precomposed music and control the sound with their improvisational whims, as elucidated in the dance community by visionary choreographer Merce Cunningham.
As motion of the feet is important for interactive dance applications and is difficult to sense remotely, several researchers have explored building sensitive floors, which are able to sense position and pressure of footfalls. These have included the pixellated dance floor by Russell Pinkston at UT Austin, the LiteFoot project by Mikael Fernstrom at the University of Limerick, the Taptiles from Infusion Systems in Canada, the Einway Pianos by Washington artist Einar Ask, and the carpet tiles from Interactive Entertainment Systems. A hybrid system has been built by the author and colleagues at the MIT Media Lab that combines both free and contact sensing in an unusual fashion. Termed “The Magic Carpet”, it consists of a 4″ grid of piezoelectric wires running underneath a carpet and a pair of inexpensive, low-power microwave motion sensors mounted above. The sensitive carpet measures the dynamic pressure and position of the performer’s feet, while the Doppler signals from the motion sensors indicate the signed velocity of the upper body. The Magic Carpet is quite “immersive”, in that essentially any motion of the performer’s body is detected and promptly translated into expressive sound. We have designed several ambient soundscapes for it, and have often installed this system in our elevator lobbies, where passers-by stop for extended periods to explore the sonic mappings.
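The relation between a Doppler signal and body velocity can be sketched with the standard radar formula, in which a reflector moving at radial velocity v shifts a carrier of frequency f0 by 2·v·f0/c. The 2.4 GHz carrier below is an assumed value for illustration, not a documented parameter of the Magic Carpet, and the sketch further assumes the sensor reports a signed shift (for example, from quadrature outputs).

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def radial_velocity_m_s(doppler_shift_hz: float, carrier_hz: float) -> float:
    """Radial velocity implied by a Doppler shift: v = f_d * c / (2 * f0).

    A positive shift is taken to mean motion toward the sensor and a
    negative shift motion away from it.
    """
    return doppler_shift_hz * SPEED_OF_LIGHT_M_S / (2.0 * carrier_hz)


if __name__ == "__main__":
    carrier_hz = 2.4e9  # assumed carrier frequency, for illustration only
    for shift_hz in (-16.0, 4.0, 16.0):
        velocity = radial_velocity_m_s(shift_hz, carrier_hz)
        print(f"{shift_hz:+.0f} Hz -> {velocity:+.2f} m/s")
```

At this assumed carrier, body motion of about one meter per second produces a shift of only about 16 Hz, so the Doppler signals of interest sit at very low audio frequencies.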
Other methods of tracking, monitoring, and identifying different objects are completely changing the concept of a musical instrument. In the intelligent environments of the near future, nearly everything can have musical behavior (witness how the simple mouse clicks and key presses that I’m making in Microsoft Word 98 as I type this article are launching various complicated sounds). An extended vision of this concept is illustrated in work that we’re now pursuing at the MIT Media Lab with resonant magnetic tags. These are small, battery-less modules that we can bury in objects, allowing us to track their identification, their distance from a fixed reader (e.g., placed on a tabletop), their orientation, and the state of the object (e.g., whether it is being squeezed). All parameters are updated 30 times per second, which makes it possible to employ this system for simple musical performance. In one example, called the Swept RF Tagging Project (which is inspired by John Zorn’s improvisational performances with a tabletop full of musical objects during the mid-1980s), we give an assembly of 13 objects continuous behavior as they are moved near the reader. Some launch velocity-sensitive notes, others play sequences, and others continuously modify the voices that are already sounding. The performer “plays” the system by picking up the objects and bringing them near the reader, where they are manipulated. Most objects are free-standing, and can remain near the reader so their effect continues hands-free, whereas others are “rings” that are worn on the hand.
From American Innovations in Electronic Musical Instruments
by Joseph A. Paradiso
© 1999 NewMusicBox