Many virtual reality systems use only computer graphics to produce immersive
virtual environments. However, sound is an important part of our everyday life
and should be included in the creation of an immersive virtual environment.
For an immersive virtual reality experience, a three-dimensional soundscape
needs to be created. This can be done using either binaural or multichannel
techniques. The principle of binaural 3-D sound reproduction is to control the
sound signals at the entrances of the listener's ear canals as accurately as
possible. This requirement is most easily fulfilled with headphones.
In multichannel reproduction, multiple loudspeakers are placed around the listener.
With multichannel techniques, sound signals can be reproduced naturally
from the correct directions. Figure 1 shows the locations of the 15 loudspeakers
in our current multichannel system.
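The text does not specify which panning method the system uses; a common
technique for placing a virtual source between adjacent loudspeakers in such a
setup is constant-power amplitude panning. The sketch below is illustrative
only (the function name and two-speaker abstraction are assumptions):

```cpp
#include <cmath>
#include <utility>

// Constant-power amplitude panning between two adjacent loudspeakers.
// 'position' runs from 0.0 (sound fully in the first speaker) to 1.0
// (fully in the second); the gains satisfy gA^2 + gB^2 = 1, so the
// perceived loudness stays roughly constant as the source moves.
std::pair<float, float> panPair(float position)
{
    const float halfPi = 1.57079632679f;
    const float angle = position * halfPi;
    return { std::cos(angle), std::sin(angle) };
}
```

With fifteen loudspeakers, the full panner would select the speaker pair (or
triplet) nearest the desired direction and apply such gains to it.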
Acoustics of the Room
The acoustics of the virtual room pose several potential problems. The
back-projected screens and data projectors hinder arbitrary positioning
of loudspeakers, and typical back-projection screens are not acoustically transparent.
To minimize the influence of the room, it must be as anechoic as possible. This
means that the walls and the ceiling must be covered with absorbent material.
The mirrors and the screens also produce reflections that may be disadvantageous
for sound reproduction. To investigate the acoustics of the EVE, we conducted
a series of impulse response measurements from fifteen loudspeaker positions
to nine microphone positions inside the virtual room. In the measurements we
used a multichannel measurement system with MLS sequences as source signals.
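MLS excitation works because a maximum-length sequence has an essentially flat
spectrum and a precisely known period. The order of the sequences used in the
measurements is not stated; the sketch below generates a short order-10 MLS
(period 1023) with a linear feedback shift register, purely as an illustration:

```cpp
#include <cstdint>
#include <vector>

// Generate a maximum-length sequence (MLS) with a 10-bit linear
// feedback shift register. The feedback taps follow the primitive
// trinomial x^10 + x^3 + 1, so the +/-1 output repeats only after
// 2^10 - 1 = 1023 samples and has an essentially flat spectrum.
std::vector<int> generateMLS()
{
    const int order = 10;
    const int length = (1 << order) - 1;   // 1023
    uint32_t reg = 1u;                     // any non-zero seed works
    std::vector<int> sequence;
    sequence.reserve(length);
    for (int i = 0; i < length; ++i) {
        // Recurrence x[n] = x[n-7] XOR x[n-10] (bits 0 and 3 of reg).
        const uint32_t feedback = (reg ^ (reg >> 3)) & 1u;
        sequence.push_back((reg & 1u) ? 1 : -1);
        reg = (reg >> 1) | (feedback << (order - 1));
    }
    return sequence;
}
```

In a real measurement the sequence would be played from each loudspeaker in
turn and the impulse response recovered by cross-correlating the microphone
signal with the sequence.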
According to our measurements, the reverberation time in our virtual room is
on the order of 400 ms. The measurements also clearly show the effect of the
back-projection screens on the sound. The higher frequencies of the direct
sound are attenuated by more than 10 dB when there is a screen between the
loudspeaker and the microphone. The reverberant sound field bypasses the
screen, i.e., the level of the reverberant sound is hardly influenced at all
by its presence. The attenuation of the direct sound affects the localization
of the sound sources and blurs directional cues. In all measurements, the
level of the highest frequencies was considerably reduced, which makes the
sound dull.
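The paper does not detail how the reverberation time was computed from the
measured responses. A standard approach is Schroeder backward integration;
the following is a sketch of that technique (a "T20" variant that extrapolates
the -5 dB to -25 dB portion of the decay to the 60 dB convention), with
illustrative names:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Estimate the reverberation time of a measured impulse response with
// Schroeder backward integration. The energy decay curve (EDC) at
// sample i is the energy remaining in the tail; T60 is extrapolated
// from the -5 dB .. -25 dB span of the decay (a "T20" estimate).
double estimateT60(const std::vector<double>& h, double sampleRate)
{
    const std::size_t n = h.size();
    std::vector<double> edc(n);
    double energy = 0.0;
    for (std::size_t i = n; i-- > 0; ) {   // integrate backwards
        energy += h[i] * h[i];
        edc[i] = energy;
    }
    std::size_t t5 = 0, t25 = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double db = 10.0 * std::log10(edc[i] / edc[0]);
        if (t5 == 0 && db <= -5.0)
            t5 = i;
        if (db <= -25.0) { t25 = i; break; }
    }
    if (t25 == 0)
        return -1.0;                       // decay range not reached
    // 20 dB of decay observed; scale to the 60 dB convention.
    return 3.0 * static_cast<double>(t25 - t5) / sampleRate;
}
```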
Frequency Response Compensation
When the simulated acoustics of a virtual environment is reproduced using loudspeakers,
the room response has to be equalized so that the desired acoustical conditions
are faithfully reproduced. Filters that perform spectral whitening can be designed
automatically by using, e.g., linear prediction. With automated design methods,
it is in principle possible to produce a very flat frequency response from the
loudspeakers to the listening area. In our case, the difficulty comes from the
fact that high frequencies are considerably attenuated. If we attempt to achieve
a flat frequency response, the level of the high frequencies needs to be very
high even when using moderate overall sound pressure levels. An easy way to
achieve the desired non-flat frequency response is to pre-filter the original
measured impulse response before automatic flattening-filter design. After trying
out different compensation filter design techniques, we chose FIR filters of
order 25 for practical implementation of spectral compensation. The filters
were designed using linear prediction after pre-filtering the impulse responses
with a filter that slightly boosted the high frequencies. There are a few reasons
for selecting such a small filter order. The computational requirements for
the filters must be very modest; the compensation has to be done for fifteen
channels, and the computer must also be able to simultaneously run the auralization
engine. Additionally, the listening area inside the virtual room is so large
that it is not feasible to try to compensate the response very accurately. The
chosen equalization method provides us with sufficiently uniform timbre across
the whole listening area. It is practically impossible to implement the digital
compensation so that the level of the direct sound would be increased with respect
to the reflections; the only way to achieve this goal is to decrease the energy
of the reflections by adding more absorbent material to the walls of the room.
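The flattening filters described above were designed with linear prediction.
The actual design code is not shown in the text; the sketch below illustrates
the standard approach, a Levinson-Durbin recursion on the autocorrelation of
the (pre-filtered) impulse response. The function name is an assumption:

```cpp
#include <cstddef>
#include <vector>

// Design a spectral-whitening FIR filter with linear prediction: a
// Levinson-Durbin recursion on the autocorrelation of a (possibly
// pre-filtered) impulse response. The returned coefficients
// {1, -a1, ..., -aP} form the prediction-error filter, which
// flattens the spectrum of the response it was designed from.
std::vector<double> whiteningFilter(const std::vector<double>& h, int order)
{
    // Autocorrelation of the impulse response for lags 0..order.
    std::vector<double> r(order + 1, 0.0);
    for (int lag = 0; lag <= order; ++lag)
        for (std::size_t n = static_cast<std::size_t>(lag); n < h.size(); ++n)
            r[lag] += h[n] * h[n - lag];

    // Levinson-Durbin recursion for predictor coefficients a[1..order].
    std::vector<double> a(order + 1, 0.0);
    double err = r[0];
    for (int i = 1; i <= order; ++i) {
        double acc = r[i];
        for (int j = 1; j < i; ++j)
            acc -= a[j] * r[i - j];
        const double k = acc / err;
        a[i] = k;
        for (int j = 1; j <= i / 2; ++j) {
            const double tmp = a[j] - k * a[i - j];
            a[i - j] -= k * a[j];
            a[j] = tmp;
        }
        err *= 1.0 - k * k;
    }

    // Prediction-error filter 1 - a1*z^-1 - ... - aP*z^-P.
    std::vector<double> fir(order + 1);
    fir[0] = 1.0;
    for (int i = 1; i <= order; ++i)
        fir[i] = -a[i];
    return fir;
}
```

Calling this with order 25 matches the filter order chosen in the text; the
non-flat target response is obtained by pre-filtering `h` with a slight
high-frequency boost before the design, as described above.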
Sound Reproduction Hardware
The hardware for sound reproduction in EVE is built around a single
dual-processor PC running the Linux operating system. The computer runs special software
that is used for acoustic modeling, sound source panning, and equalization filtering.
The main problems with using PC hardware and Linux for sound processing stem
from the immature support for advanced multichannel sound cards on Linux.
Sound output from the Linux PC is taken from two eight-channel ADAT interfaces
that are connected to two eight-channel D/A converters. The current loudspeaker
system in the virtual room consists of fifteen Genelec active monitoring loudspeakers.
We also plan to include one or more subwoofers in the system.
Sound Reproduction Software
All software used for sound processing has been written in C++ using an
object-oriented approach. The software is split into an efficient low-level
signal processing library and a higher-level dynamic signal processing/routing
toolkit. The low-level library contains base classes for different types of
signal processing blocks, as well as optimized versions of several DSP
structures; its signal processing units do their calculations in a
sample-by-sample fashion. To complement the low-level
signal processing tools, a high-level signal processing application has been
built. The application, called Mustajuuri, is a generic plugin-based real-time
signal processing tool. With the proper plugins, this application can be used to
run the audio processing for the virtual room. Mustajuuri is available at http://www.tml.hut.fi/~tilmonen/mustajuuri/.
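The sample-by-sample style of the low-level library can be illustrated with a
hypothetical base class and one concrete filter block; the actual class names
and interfaces of the library are not given in the text:

```cpp
// Hypothetical sketch of the low-level library's style: a common base
// class for processing blocks that compute one output sample per call.
class DSPBlock
{
public:
    virtual ~DSPBlock() {}
    virtual float process(float input) = 0;  // one sample in, one out
};

// A one-pole lowpass filter as a concrete block:
// y[n] = (1 - a) * x[n] + a * y[n-1].
class OnePoleLowpass : public DSPBlock
{
public:
    explicit OnePoleLowpass(float a) : a_(a), state_(0.0f) {}
    float process(float input) override
    {
        state_ = (1.0f - a_) * input + a_ * state_;
        return state_;
    }
private:
    float a_;
    float state_;
};
```

Processing one sample at a time keeps latency minimal and makes feedback
structures straightforward, at the cost of some per-sample call overhead.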
Mustajuuri is a highly modular signal processing platform. Arbitrary signal
processing modules can be chained at run time to create a DSP network. Since
most of the functionality comes from plugins, the system can be easily
extended: new plugin types can be written in C++ and are loaded by the
application without recompilation. By nature, all plugins can deal with both
audio signals and events (for example, MIDI events).
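The run-time chaining of plugins can be sketched as follows. The class names
here are hypothetical; Mustajuuri's real plugin API is not documented in the
text above:

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch of run-time plugin chaining: modules share a
// common interface, and the signal is passed through each in turn.
class Plugin
{
public:
    virtual ~Plugin() {}
    virtual float process(float input) = 0;
};

// A trivial gain plugin.
class Gain : public Plugin
{
public:
    explicit Gain(float g) : g_(g) {}
    float process(float input) override { return g_ * input; }
private:
    float g_;
};

// Plugins appended at run time form a simple DSP chain.
class Chain
{
public:
    void append(std::unique_ptr<Plugin> p)
    {
        plugins_.push_back(std::move(p));
    }
    float process(float input)
    {
        for (auto& p : plugins_)
            input = p->process(input);
        return input;
    }
private:
    std::vector<std::unique_ptr<Plugin>> plugins_;
};
```

Because the chain only sees the `Plugin` interface, modules compiled in
separate shared libraries can be appended without recompiling the host.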