Audio Visual Synthesis

Audiovisual synthesis (AVS) has been a cherished artistic goal for at least a hundred years. Among the artists who have devoted their time to the concept are composer Alexander Scriabin, artist Paul Klee, and animators Oskar Fischinger and Ron Pellegrino. In addition, there have been films that have explored the concept, such as Disney's Fantasia and David Lynch's Eraserhead. The ways and means by which the concept may be given flesh are manifold. I have chosen to concentrate on one particular area that I believe holds great promise: that of AVS in the field of electronic dance music and abstract visuals, with MIDI as the means of interface.

Techno music, the cutting edge of electronic dance music, always seemed to be the place where the synthesis was most likely to happen with spectacular results, mainly because the music itself was both sufficiently popular and sufficiently audiovisual already, at least in a club or party environment. Techno pioneers Kraftwerk, in their live shows and videos, have made moves in the direction of AVS. At present, however, most visuals for techno music are still colourful pastiches of fast strobing imagery with no obvious, intuitive connection to the actual sounds and rhythms of the music. Techno visuals are associative, not synaesthetic. Simon Reynolds, in Stylus magazine, argues that the pyrotechnical visuals of modern abstract techno videos simply dazzle the visual cortex, which cannot cope with their complexity as easily as the auditory cortex, with its greater sensitivity:

"Perhaps there is a fundamental and unbridgeable gap between the ear and the eye in terms of their different capacities to cope with intensity. Take rhythm: the subdivision of time gets ever more fantastically complex in dance music--micro-syncopations, asymmetrical rhythmic patterns riddled with hesitations, multiple tiers of polyrhythm. (And it's not just the drums and the bass-- most of the musical elements in dance records, from the keyboards to the vocals, are rhythmatized and function as cogs in the groove). But to render this implosive internal intensification of rhythm visually would not only be extremely challenging on the technical level, but also much more taxing or even traumatic to the viewer. The ear seems to be able to cope better both with faster rhythms and with internal rhythmic complexity (hyper-syncopation, offbeat, cross-rhythms and counter-rhythms). The ear can not only apprehend the staggering complexity of modern dance rhythms - rhythmic simultaneism (multiple percussive patterns interlocking and overlapping), timbral treatment of the beat and the riffs to create textured rhythm and rhythmatized texture, spatial organization of rhythm in the stereo-field - it can also enjoy them."

Reynolds is both excited by the possibility of true AVS in dance music and pessimistic about overcoming the inbuilt perceptual limitations of the human sensorium. However, the seeds of a solution to the problem he identifies already exist, or are implied, by his own analysis: all that needs to be done is to make the visuals simpler - much simpler. They need to become abstract, geometric, and totally synchronised to a rhythm part. The 'dazzle' effect is produced by extraneous or irrelevant visual components not synchronised to the music. Total synchronicity of all visual objects to sonic parts should reduce the dazzle effect to zero.

The Method
MIDI, or Musical Instrument Digital Interface, allows for MIDI-equipped musical instruments and computers to be synchronised to a digital clock; and for one instrument to control another. It also allows musical instruments to synchronise to, and control, MIDI-equipped visual software packages. This fact is the key to true AVS in dance music. It means that the spatial location, colour, duration, and timing of the appearance of a visual object on a monitor can be entirely controlled by a musical event. In other words, a sound and a visual object may be exactly synchronised.
Music can therefore be composed simultaneously with the composition of a video clip. The technology to do so already exists.
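
As a rough illustration of what "controlled by a musical event" might mean at the level of the data, the sketch below decodes a raw three-byte MIDI note message and turns it into an instruction for a visual object. The status bytes (0x90 for note-on, 0x80 for note-off) come from the MIDI standard; everything else, including the VisualEvent type and the choice to map pitch to screen height and velocity to brightness, is my own assumption for the purposes of the example and not the interface of any particular visual package.

```python
from dataclasses import dataclass
from typing import Optional

NOTE_OFF = 0x80  # MIDI status nibble for note-off
NOTE_ON = 0x90   # MIDI status nibble for note-on


@dataclass
class VisualEvent:
    """Hypothetical instruction for the visual software (not a real API)."""
    visible: bool      # True: show the object, False: hide it
    channel: int       # MIDI channel, 0..15
    note: int          # MIDI note number, 0..127
    height: float      # assumed mapping: 0.0 = bottom of screen, 1.0 = top
    brightness: float  # assumed mapping: velocity scaled to 0.0..1.0


def decode(message: bytes) -> Optional[VisualEvent]:
    """Turn a 3-byte MIDI note message into a VisualEvent, or None."""
    if len(message) != 3:
        return None
    status, note, velocity = message
    kind, channel = status & 0xF0, status & 0x0F
    if kind == NOTE_ON and velocity > 0:
        return VisualEvent(True, channel, note, note / 127, velocity / 127)
    if kind == NOTE_OFF or kind == NOTE_ON:
        # A note-on with velocity 0 is conventionally treated as a note-off.
        return VisualEvent(False, channel, note, note / 127, 0.0)
    return None  # any other message type is ignored in this sketch
```

Because the same message drives both the sound module and the visual object, the two cannot drift apart: the object appears when the note starts and disappears when it ends.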

Minimal dance music - the kind of music that is currently called 'techno' - is the blueprint for the audio side of the equation. It is more suitable than acoustic and sample-based music because, at present, synthetic waveforms are simpler and easier to replicate visually (this may change in due course; see, for example, what Vislab are doing with the visual representation of acoustic sounds - reference below).
Techno is probably the only music that, in its purest form, is entirely synthesised: in pure techno there are generally no samples, no vocals, and no live instruments. Being perfectly electronic, geometric, and highly rhythmic, it lends itself rather well to being visualised as abstract visual patterns.
To illustrate this point, it may be most effective to imagine ourselves in a studio writing on musical instruments and visual design software using MIDI and a set of standard drum sounds. We would begin by loading up a set of kit sounds - kick, snare, hats, rim, toms, clap and so on - and then matching them to a set of visual objects that replicate, to the best of our ability or aesthetic understanding, the characteristics of each of those sounds. For the sake of this imaginary experiment, we can keep the elements simple so as not to create too much confusion: but the result, in spite of the simplicity, should be creatively inspiring.

Let's say the first thing we lay down is a 4/4 kick pattern. We then assign the MIDI note of the kick drum to the MIDI note of a visual object created by our MIDI-equipped visual software package. Now, a kick drum in dance music is typically low frequency, round in shape and large. Let us assume that the simplest way we can represent these sonic characteristics is with a large black dot or blob in the lower half of the screen. An open hi-hat is high frequency, thin, metallic and extended. Let us assume that the simplest way we can represent it is as a silver streak near the top of the screen. For the sake of simplicity, we will place the hat on the offbeat, as is common in many dance tracks. Each time it sounds, a silver streak appears on the screen. Visually, therefore, the visual kick drum and the visual hi-hat will succeed each other as a one-bar cycle, just as they do in the music.
Now let us assume that we build up a whole visual percussion kit based entirely on the sonic one; and that, at the precise instant each sonic percussion sound sends a MIDI note to the visual software, the object we have created to designate that sound appears on the screen. We now have visual rhythms that pulse in time to the musical rhythms.
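
The kick and hi-hat cycle described above can be sketched in a few lines. The note numbers 36 and 46 are the General MIDI percussion assignments for bass drum and open hi-hat; the VISUALS table, the sixteen-step grid and the bar_schedule function are hypothetical names of my own, and the positions and sizes are simply one guess at "large black dot, low" and "silver streak, high".

```python
KICK, OPEN_HAT = 36, 46  # General MIDI percussion note numbers

# Assumed look of each object; y runs from 0.0 (bottom) to 1.0 (top).
VISUALS = {
    KICK: {"shape": "dot", "colour": "black", "y": 0.25, "size": 0.4},
    OPEN_HAT: {"shape": "streak", "colour": "silver", "y": 0.9, "size": 0.1},
}

# One bar of 4/4 divided into sixteen steps (sixteenth notes).
PATTERN = {
    KICK: [0, 4, 8, 12],      # four-to-the-floor: one kick on every beat
    OPEN_HAT: [2, 6, 10, 14], # open hat on every offbeat
}


def bar_schedule(bpm):
    """Return (seconds, note, shape) for every event in one bar, in time order."""
    step_seconds = 60.0 / bpm / 4  # a sixteenth note is a quarter of a beat
    events = []
    for note, steps in PATTERN.items():
        for step in steps:
            events.append((step * step_seconds, note, VISUALS[note]["shape"]))
    return sorted(events)


if __name__ == "__main__":
    for seconds, note, shape in bar_schedule(120):
        print(f"{seconds:5.3f}s  note {note}  show {shape}")
```

Run at 120 beats per minute, the schedule alternates dot and streak every quarter of a second, which is exactly the one-bar visual cycle described in the text.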

There is much more that can be done to flesh out this concept in practice. We can experiment with backgrounds; we can control the shape of the objects in real time; we can pan objects in the visual field to correspond to auditory stereo pans; we can aim for more complex, 3-D shapes with shadows or surface textures; we can have objects bouncing from the top to the bottom of the screen, or from the 'horizon' to the 'front' of the screen, in time to the auditory rhythm counterpart; we can have some rhythms that are solely visual, with no auditory counterpart, or solely auditory, with no visual counterpart. We can give objects shadows and halos to reflect the application of reverberation; we can simulate a delay repeat visually by having smaller versions of the object repeating after the primary object, creating a 'comet tail'.
The basic principle, however, is that any sound can be assigned to an abstract visual object which represents the sonic qualities of that sound via MIDI control, so that both sounds and visual objects are synchronised as one.
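
Two of the refinements mentioned above, stereo panning and the 'comet tail' delay, lend themselves to a small sketch. MIDI pan is controller number 10 and runs from 0 (hard left) to 127 (hard right); the function names, the linear mapping to screen position, and the idea of shrinking each echo by the delay's feedback amount are assumptions of mine, offered only as one plausible way to do it.

```python
def pan_to_x(pan_value):
    """Map a MIDI pan value (0..127) to a horizontal position (0.0..1.0)."""
    if not 0 <= pan_value <= 127:
        raise ValueError("MIDI controller values run from 0 to 127")
    return pan_value / 127


def comet_tail(size, delay_seconds, feedback, min_size=0.02):
    """List (time offset, size) for each visual echo of an object.

    Each echo appears one delay period after the previous one and is
    scaled down by the feedback factor, until it is too small to draw.
    """
    if not 0 < feedback < 1:
        raise ValueError("feedback must be between 0 and 1 (exclusive)")
    if delay_seconds <= 0 or min_size <= 0:
        raise ValueError("delay_seconds and min_size must be positive")
    echoes = []
    repeat = 1
    size *= feedback
    while size >= min_size:
        echoes.append((repeat * delay_seconds, size))
        repeat += 1
        size *= feedback
    return echoes
```

For instance, comet_tail(0.4, 0.375, 0.5) yields four echoes, the first appearing 0.375 seconds after the primary object at half its size.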
This concept has a variety of applications:

(i) For DJing and live performance in clubs
Some DJs have lamented the static nature of a typical DJ performance as compared to, say, a rock show. In a recent interview with John Osselaer in TechnoTourist, techno producer and DJ Jeff Mills commented that:
"I could never really figure out the visual aspect. While listening to the music, what could possibly be interesting? Usually it's just standing there watching a guy turn knobs, it's just not interesting. In the past I've thought about ideas of using illusionists, magicians, to kind of create some kind of illusion where I might disappear and reappear, just all types of really interesting and odd things... What I had hoped to get from that was the indication that the people wouldn't mind dancing to an image and that would lead us into this holograph thing where at a certain point they would actually be dancing to a simulation of myself or multiple DJs."
Every committed DJ needs his or her performance to be an enveloping event, not simply an accompaniment to a good night out. But what does an audience look at, if not the DJ? Lasers, frequency responsive coloured lights, and the traditional visual arsenal of dance parties once again hint at AVS without quite delivering the requisite synchronisation, which must be perfect, not simply approximate.
Creative DJs have frequently used more than just a set of turntables for performance: drum machines, effects boxes and sequencers have all been part of the arsenal of DJs who want to really get into the nuts and bolts of the grooves they are playing. I think it is possible to envision a set-up whereby a DJ plays prewritten audio tracks which have been paired with prewritten visual tracks of the kind described in the method above. These synchronised visual tracks would then be projected on to large screens, either behind and above the DJ, or over the walls and ceiling, to become the primary focus of the dancers' visual attention.

Of course DJs like their vinyl. It is hard to see how conventional vinyl could be incorporated into an AVS set-up, as there could be no MIDI control. However, recent innovations such as Stanton's Final Scratch (see References) allow a DJ to control mp3s in real time from a pair of turntables, utilising all the traditional techniques of the DJ craft such as spinbacks and cuing. In other words, Final Scratch allows DJs into the digital domain and, potentially, into the audiovisual domain as well.
Moving away from the DJ set-up to the live P.A. (though it must be said that the two are increasingly difficult to distinguish), one can imagine a scenario where the music is sequenced, but there is some live input, in the form of post-sequencing overlay and modification through live intervention via knobs, sliders, joysticks, etc., in both the audio and visual domains. A live outfit could then be composed of two members, one controlling aspects of the audio sequencing (e.g. filter sweeps, levels, sound effects), the other controlling aspects of the visual sequencing (e.g. colours and textures, sizes of objects, visual effects), each responding spontaneously to the moment-by-moment synergy between the act, the environment and the dancers.
Finally, as a related line of thought, most clubs now have digitally controlled lighting, some of which is MIDI fitted. It should be possible to feed MIDI streams into these lighting set-ups, and thereby generate lighting displays that are fully synchronised to the rhythms, where those rhythms are digital and are sending MIDI clock/tempo information.
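
To give a sense of how little is involved, MIDI clock is simply a stream of timing messages (status byte 0xF8) sent 24 times per quarter note. The sketch below, under the assumption that the lighting controller timestamps each clock message as it arrives, estimates the tempo and picks out the ticks that fall on a beat; the function names are hypothetical and do not belong to any real lighting system.

```python
TIMING_CLOCK = 0xF8        # MIDI real-time "timing clock" status byte
PULSES_PER_QUARTER = 24    # MIDI clock sends 24 pulses per quarter note


def bpm_from_ticks(tick_times):
    """Estimate tempo from the arrival times (in seconds) of clock messages."""
    if len(tick_times) < 2:
        raise ValueError("need at least two clock messages")
    span = tick_times[-1] - tick_times[0]
    if span <= 0:
        raise ValueError("tick times must be increasing")
    pulses = len(tick_times) - 1
    seconds_per_quarter = span / pulses * PULSES_PER_QUARTER
    return 60.0 / seconds_per_quarter


def beat_ticks(tick_count):
    """Indices of the clock messages on which a light should flash (one per beat)."""
    return [i for i in range(tick_count) if i % PULSES_PER_QUARTER == 0]
```

A lighting rig that flashes only on the ticks returned by beat_ticks would stay locked to the kick drum of a four-to-the-floor track for as long as the clock keeps running.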

(ii) In Cinemas
This application is like going to see a film, but instead of a movie, the audience is exposed to a piece of music and a large screen projection of the visual rhythm pattern.

(iii) AVS Headset
Here the listener uses headphones with a visual head-up display and plays pre-recorded AVS material from a disc or hard drive.

(iv) Home Viewing
Here the listener plays AVS recordings on speakers and a screen in the home environment.

Conclusion: Some common objections
I have been knocking the above ideas around with a group of musically and visually oriented friends for about ten years now, and have raised the concepts with all kinds of people from all over the world - and received all kinds of responses. Among the most common reactions on the less enthusiastic side are the following:

(i) It's been done before
People sometimes claim that the concepts outlined here are nothing new, and have been explored and realised in the past by various artists and musicians. Most of these claims hold at least a grain of truth; however, my experience is that, although isolated attempts to realise AVS have been made, these nevertheless fall short of a fully fledged rhythmic synchronisation of sound and vision. Historically speaking, what we have are little experiments on the perimeters of art and music, which are rarely ever connected to electronic dance music and which rarely involve thoroughgoing audiovisual rhythm synthesis. Sometimes such ideas are hinted at, however, or appear briefly in the context of a more traditional approach.
Mostly, however, when people raise this objection, what they mean is really that the technology of AVS is already being used. Therefore, it should be made as clear as possible that what is being proposed here is an aesthetic and artistic concept. It is not a technological innovation that is being proposed: I am not suggesting we start using AVS equipment, in ignorance of the fact that it is already being employed. The existence and use of AVS technology has not culminated in the aesthetic approach described here, and, as a result, has yet to find its true place in electronic dance music. In the same way, the existence and use of the Roland TB-303 in 1981 was a necessary, but not sufficient, precondition for the creation of acid house some six years later. The Fairlight sampler existed for over a decade before hip-hop and other sample based music appeared. There are many such examples. Until such techniques are cheap, available, and widely applied, they never reach the critical mass required for a new artistic movement.
Moreover, it almost goes without saying that the fact that these ideas may have been realised before is hardly a justification for never realising them again. With such a philosophy, there would be no artistic continuity and no artistic growth or mutation.

(ii) It doesn't sound very interesting
This objection is rather difficult to answer, since the concepts outlined above have rarely ever been seen, let alone applied in any thoroughgoing way. Perhaps, once they are applied, the results won't be very interesting - we can't be sure. However, my experience is that whenever the idea is even approached, even in a tangential way, audiences get very excited indeed. Consider, for example, the public's continuing response to Disney's Fantasia. I believe that if AVS is applied exactly as described above, the impact on the audience will be profound - more profound than any solely audio or visual experience is intrinsically capable of being. However, only practice will settle this matter: one should not prejudge the issue, and therefore I do not consider this claim, which is really a cassandric projection rather than an objection, to be an obstacle to the project.

(iii) It'll never beat dancing in the dark
This argument is advanced by Simon Reynolds. I have a lot of sympathy with it. The best, the most memorable, the most visually evocative and stimulating dance events are underground and held in near total darkness. The mind is free to dream, to visualise the music, and one feels less self-conscious, freer, under such conditions. The music itself becomes more dimensional, tactile and colourful. One finds oneself really concentrating on the music, less distracted by the bells and whistles of the club world - lasers, DJs, drinks, sexually attractive men and women, and so on. It's a curiously introverted kind of enrichment, nourished only by the kinaesthesia of the music, one's body, and the rhythmic play between the two.
On the other hand, though, this argument doesn't really amount to an objection to AVS in clubs. Rather, it is analogous to the assertion, equally true, that it's often better to listen to the radio rather than watch TV; that it's often better to read a book than see a film; to make love in the dark rather than in the light. Sometimes a restriction of input is more desired than an increase of stimulation, especially if you're self-conscious, or a natural introvert. However, that is no argument against AVS per se, though it may be an argument for maintaining a plurality of club environments with a variety of degrees of visual stimulation.

Above all, the obvious response to objections like these, which attempt to second-guess the aesthetic and psychological impact of a total AVS environment, is to ask, why not just try it and see? It might be better, it might be worse; it might be different, but just as exciting in its own way. We won't know until we've tried.

(iv) It's a good idea except for the minimal techno aspect
This is by far the most common objection, namely that, although AVS is an excellent idea, and well worth implementing, minimal techno has far too limited an appeal to really be appropriate as the musical part of the equation: it lacks narrative, lyrics, a story, and so on; it has been around for about a decade and is perceived as dated; it is too purist; and, well, most people just don't like it. It's not popular enough to carry the audio side of the project. I have lost the interest of more than one very talented visual composer because of my insistence on this kind of music being a prerequisite for the sonic component, as opposed to, say, a full orchestra or a jazz band, or a popular house act, or hard rock, or something in the electroclash vein, or whatever. Many of the most skilled people in visual animation have never heard of minimal techno and wouldn't like it if they had.

Additionally, there is the question of the absence of lyrics and pop structures in minimal techno. One very talented programmer insisted to me that this kind of AVS without a narrative, a storyline, would be boring, as all good drama requires one. To the contrary, I believe that it is precisely the fact that the approach is utterly devoid of narrative content that will give the resultant work such a powerful immediacy and impact; and as minimal techno has no lyric or narrative structure, but is wholly sensate in nature (it simply appeals to us on a very primal, seeing/hearing/feeling level, without any obvious superstructure of explicit storytelling), it is perfect for propelling our understanding of AVS beyond the largely verbal approaches that have dominated the medium thus far, and indeed most of our waking life. As all composers of instrumental music know, nonverbal communication is extremely powerful, as it bypasses all the rules and codes we create through language and goes straight for the instincts, straight for the body. Probably no single body of individuals knows this better than techno music producers.

There is, however, at least one very important reason why it is my firm belief that the new art form will only take wing when it is realised in the context of minimal techno, and that is, as mentioned before, down to the fact that we already have the visual technology to translate simple, elemental, abstract sounds into simple, elemental, abstract visuals. More complex waveforms, such as those found in acoustic instruments and the human voice, are not yet readily accessible to visual transduction. Even if they are some day, the results could look rather messy where their sonic equivalents are not.
However, it's not necessary to make predictions about how complex acoustic sounds will look, when it can just be appreciated that AVS in electronic dance music is really a new art form, and is best explored, at first, by methods that are parsimonious, spare, and intuitive, and which yield quick results. Nothing creative is going to come out of any process that plunges the artists into logistical nightmares. We need to work out some basic processes that can yield results in the time it takes to write a piece of dance music - one or two days. I envisage sparse rhythm tracks, utilising kicks, hi-hats, a smattering of percussion and effects, and simple synthetic melodies and effects. When manifested as visuals and music together, the results, I believe, will be more than sufficient to entice and delight the body and the senses. Anything too complicated will, I suspect, risk producing dazzle and confusion, at least for the time being when the novelty factor is so high. Yes, most minimal techno is extremely boring right now; however, I believe it is only missing the vital extra ingredient that would reinvigorate it.

Finally, and at the risk of inviting some controversy, I really doubt that other genres of music are composed of sufficiently forward-looking individuals to even conceive of an AVS art form, much less put it into practice. Is such a thing going to emerge from the New School Breaks scene? From R 'n' B? I doubt it. I'm not saying that techno doesn't have its fair share of historically oriented listeners and producers, but, in my experience, they are proportionally fewer than in other areas of music - which is hardly surprising, as techno explicitly identifies itself with innovation.
The fact is that techno romances the future, and new ideas and systems in all forms. It is therefore natural that techno events and club nights tend to attract creative and technologically literate artists from both the audio and visual spheres. These individuals find themselves in club environments where the synthesis of sound and vision has a direct impact on the success of the event; and in the same room as, if not actually conversing with, their computer literate counterparts. Both DJs and VJs increasingly utilise the same computer hardware to make these events as memorable and spectacular as possible; really, all that is required is a MIDI cable and a little conversation.

Copyright © 2003. All artists' rights reserved.