EVE - Acoustics and Audio

Many virtual reality systems use only computer graphics to produce immersive virtual environments. However, sound is an important part of our everyday life and should be included in the creation of an immersive virtual environment.

3-D Sound Reproduction Techniques

For an immersive virtual reality experience, a three-dimensional soundscape needs to be created. This can be done using either binaural or multichannel techniques. The principle of binaural 3-D sound reproduction is to control the sound signals at the entrances of the listener's ear canals as accurately as possible; this requirement is easiest to fulfill with headphones. In multichannel reproduction, multiple loudspeakers are placed around the listener, which makes it possible to reproduce sound signals naturally from the correct directions. Figure 1 shows the locations of the 15 loudspeakers in our current multichannel system.
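The basic idea behind multichannel amplitude panning can be illustrated with a minimal sketch: a virtual source is placed between a pair of loudspeakers by weighting the signal with constant-power gains. This is a generic two-speaker example for illustration only, not the actual panning law used in EVE.

```cpp
#include <cassert>
#include <cmath>

const double kPi = 3.14159265358979323846;

// Constant-power pan between a pair of loudspeakers. 'position' is in [0, 1]:
// 0 = fully in the first speaker, 1 = fully in the second. The gains satisfy
// gA^2 + gB^2 = 1, so the total radiated power stays constant as the virtual
// source moves between the speakers.
struct PanGains { double a; double b; };

PanGains constantPowerPan(double position) {
    const double theta = position * kPi / 2.0;  // map [0,1] to [0, pi/2]
    return { std::cos(theta), std::sin(theta) };
}
```

At the center position both gains are 1/sqrt(2), i.e., each speaker is attenuated by about 3 dB, which keeps the perceived level constant across the pan range.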

Acoustics of the Room

The acoustics of the virtual room pose several potential problems. The back-projected screens and data projectors hinder arbitrary positioning of loudspeakers, and typical back-projection screens are not acoustically transparent. To minimize the influence of the room, it must be as anechoic as possible, which means that the walls and the ceiling must be covered with absorbent material. The mirrors and the screens also produce reflections that may be detrimental to sound reproduction.

To investigate the acoustics of the EVE, we conducted a series of impulse response measurements from fifteen loudspeaker positions to nine microphone positions inside the virtual room, using a multichannel measurement system with MLS sequences as source signals. According to our measurements, the reverberation time in the virtual room is on the order of 400 ms. The measurements also clearly show the effect of the back-projection screens on the sound: the higher frequencies of the direct sound are attenuated by more than 10 dB when a screen lies between the loudspeaker and the microphone. The reverberant sound field bypasses the screen, i.e., the level of the reverberant sound is hardly influenced at all by its presence. The attenuation of the direct sound affects the localization of sound sources and blurs directional cues. In all measurements, the level of the highest frequencies was considerably reduced, which makes the sound dull.
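An MLS source signal of the kind used in such measurements can be generated with a linear feedback shift register: a register driven by a primitive polynomial cycles through all 2^n - 1 non-zero states, and its output bit forms a pseudo-random binary sequence with a nearly flat spectrum. The sketch below is generic; the register length and polynomial are illustrative choices, not the parameters of our measurement system.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Generate one period of a maximum-length sequence (MLS) as a +/-1 excitation
// signal. 'taps' is a bit mask of the feedback taps of a right-shifting
// Fibonacci LFSR; it must correspond to a primitive polynomial for the
// sequence to reach the full period of 2^order - 1 samples.
std::vector<int> generateMLS(int order, uint32_t taps) {
    uint32_t state = 1;                       // any non-zero seed works
    const int length = (1 << order) - 1;
    std::vector<int> seq;
    seq.reserve(length);
    for (int i = 0; i < length; ++i) {
        seq.push_back((state & 1) ? 1 : -1);  // map output bit to +/-1
        // Feedback bit = parity (XOR) of the tapped register bits
        uint32_t fb = 0;
        for (uint32_t t = state & taps; t; t >>= 1) fb ^= (t & 1);
        state = (state >> 1) | (fb << (order - 1));
    }
    return seq;
}
```

For example, order 4 with taps 0x9 (the primitive polynomial x^4 + x^3 + 1) yields a period-15 sequence containing eight +1 and seven -1 samples; in practice much longer sequences (e.g., order 15 and up) are used so that one period covers the whole impulse response.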

Frequency Response Compensation

When the simulated acoustics of a virtual environment are reproduced using loudspeakers, the room response has to be equalized so that the desired acoustical conditions are faithfully reproduced. Filters that perform spectral whitening can be designed automatically using, e.g., linear prediction. With automated design methods it is in principle possible to produce a very flat frequency response from the loudspeakers to the listening area. In our case the difficulty is that the high frequencies are strongly attenuated: to achieve a flat frequency response, the level of the high frequencies would need to be very high even at moderate overall sound pressure levels. An easy way to achieve the desired non-flat frequency response is to pre-filter the measured impulse response before the automatic whitening-filter design.

After trying out different compensation filter design techniques, we chose FIR filters of order 25 for the practical implementation of spectral compensation. The filters were designed using linear prediction after pre-filtering the impulse responses with a filter that slightly boosts the high frequencies. There are a few reasons for selecting such a small filter order. The computational requirements for the filters must be very modest: the compensation has to be done for fifteen channels, and the computer must simultaneously run the auralization engine. Additionally, the listening area inside the virtual room is so large that it is not feasible to compensate the response very accurately. The chosen equalization method provides sufficiently uniform timbre across the whole listening area. It is practically impossible to implement the digital compensation so that the level of the direct sound is increased with respect to the reflections; the only way to achieve this is to decrease the energy of the reflections by adding more absorbent material to the walls of the room.
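Linear-prediction-based whitening-filter design can be sketched with the classic Levinson-Durbin recursion: given the autocorrelation of the (pre-filtered) impulse response, it produces the coefficients of an FIR prediction-error filter A(z) whose output has a flattened spectrum. This is a textbook sketch of the general technique, not our exact implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Levinson-Durbin recursion: from autocorrelation values r[0..order], compute
// the coefficients of the prediction-error (whitening) FIR filter
//   A(z) = a[0] + a[1] z^-1 + ... + a[order] z^-order,   a[0] = 1.
// Filtering the measured response with A(z) whitens (flattens) its spectrum.
std::vector<double> levinsonDurbin(const std::vector<double>& r, int order) {
    std::vector<double> a(order + 1, 0.0);
    a[0] = 1.0;
    double err = r[0];                        // prediction error power
    for (int i = 1; i <= order; ++i) {
        // Reflection coefficient for this recursion step
        double acc = r[i];
        for (int j = 1; j < i; ++j) acc += a[j] * r[i - j];
        const double k = -acc / err;
        // Levinson update of the filter coefficients
        std::vector<double> prev(a);
        for (int j = 1; j <= i; ++j) a[j] = prev[j] + k * prev[i - j];
        err *= (1.0 - k * k);
    }
    return a;
}
```

As a sanity check, for a first-order autoregressive response with autocorrelation r = {1, 0.9, 0.81} the recursion yields A(z) = 1 - 0.9 z^-1, i.e., it recovers the underlying one-pole model exactly.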

Sound Reproduction Hardware

The hardware for sound reproduction in EVE is built around a dual-processor PC running the Linux operating system. The computer runs special software for acoustic modeling, sound source panning, and equalization filtering. The problems with using PC hardware and Linux for sound processing are mostly due to the immature Linux support for advanced multichannel sound cards. Sound output from the Linux PC is taken from two eight-channel ADAT interfaces connected to two eight-channel D/A converters. The current loudspeaker system in the virtual room consists of fifteen Genelec active monitoring loudspeakers. We also plan to add one or more subwoofers to the system.

Sound Reproduction Software

All software used for sound processing has been written in C++ using an object-oriented approach. The software is split into an efficient low-level signal processing library and a higher-level dynamic signal processing and routing toolkit. The low-level library contains base classes for different types of signal processing blocks as well as optimized versions of several DSP structures. The signal processing units of the low-level library do their calculations in a sample-by-sample fashion.

To complement the low-level signal processing tools, a high-level signal processing application has been built. The application -- called Mustajuuri -- is a generic plugin-based real-time signal processing tool, available at http://www.tml.hut.fi/~tilmonen/mustajuuri/. With proper plugins, this application can be used to run the audio processing for the virtual room. Mustajuuri is a highly modular signal processing platform: arbitrary signal processing modules can be chained at run time to create a DSP network. Since most of the functionality comes from the plugins, the system can be easily extended; new plugin types can be written in C++ and are loaded by the application without recompilation. By nature, all plugins can deal with both audio signals and events (for example, MIDI events).
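The sample-by-sample block approach described above can be sketched as follows: a common base class defines the per-sample processing interface, concrete blocks implement it, and a chain runs blocks in series to form a simple DSP network. The class names (Block, Gain, OnePoleLP, Chain) are illustrative only and are not the actual Mustajuuri API.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Base class for a sample-by-sample signal processing block.
class Block {
public:
    virtual ~Block() = default;
    virtual float process(float in) = 0;   // one input sample -> one output sample
};

// Multiplies the signal by a constant gain.
class Gain : public Block {
public:
    explicit Gain(float g) : gain_(g) {}
    float process(float in) override { return in * gain_; }
private:
    float gain_;
};

// One-pole low-pass filter: y[n] = a*x[n] + (1-a)*y[n-1].
class OnePoleLP : public Block {
public:
    explicit OnePoleLP(float a) : a_(a), y_(0.0f) {}
    float process(float in) override { y_ = a_ * in + (1.0f - a_) * y_; return y_; }
private:
    float a_, y_;
};

// Runs a series chain of blocks, one sample at a time.
class Chain : public Block {
public:
    void add(std::unique_ptr<Block> b) { blocks_.push_back(std::move(b)); }
    float process(float in) override {
        for (auto& b : blocks_) in = b->process(in);
        return in;
    }
private:
    std::vector<std::unique_ptr<Block>> blocks_;
};
```

Because Chain is itself a Block, chains can be nested, which is the essence of building arbitrary DSP networks from small reusable modules at run time.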