Posted September 01, 2021 by TinyPlay Games
#absylon7 #unity #indie #horror #game #hrtf #audio #article #devlog #cloud #tracing
Hi all, and welcome back from the TinyPlay team. We are currently taking part in the IndieCup S’21 contest, where we made the finals in the Best Audio category. In this article I would like to share how we work with audio; I hope you find it useful.
The project’s audio consists of three components: sounds (footsteps, atmosphere, enemies, weapons, and more), music (atmospheric tracks, action music, extra accompaniment), and voice acting (characters, audio diaries, thunderclaps). The audio architecture is divided as follows:
- Sounds
- Music
- Voiceovers
Compression settings for music and sounds are different: music is streamed, big sounds are loaded and decompressed before the scene loads, and small sounds are decompressed on the fly.
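As a rough illustration of how such import settings can be enforced from an editor script (a minimal sketch; the folder names and categories here are just examples, not our actual project layout):

```csharp
// Editor/AudioImportSettings.cs (Unity editor script)
using UnityEditor;
using UnityEngine;

// Sketch only: folder names and categories are illustrative.
public class AudioImportSettings : AssetPostprocessor
{
    void OnPreprocessAudio()
    {
        var importer = (AudioImporter)assetImporter;
        var settings = importer.defaultSampleSettings;

        if (assetPath.Contains("/Music/"))
        {
            // Music: streamed from disk, never fully loaded into memory.
            settings.loadType = AudioClipLoadType.Streaming;
        }
        else if (assetPath.Contains("/SFX/Big/"))
        {
            // Big sounds: decompressed while the scene is loading.
            settings.loadType = AudioClipLoadType.DecompressOnLoad;
        }
        else
        {
            // Small sounds: kept compressed in memory and decoded on the fly.
            settings.loadType = AudioClipLoadType.CompressedInMemory;
        }

        importer.defaultSampleSettings = settings;
    }
}
```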
To achieve smooth transitions, separate sound areas, and to work with compression, we use the standard Unity Audio Mixer; the smooth transitions themselves are driven by DOTween.
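Here is a minimal sketch of what such a DOTween-driven fade looks like (the exposed parameter name "MusicVolume" is just an example):

```csharp
using DG.Tweening;
using UnityEngine;
using UnityEngine.Audio;

public class MixerFade : MonoBehaviour
{
    [SerializeField] AudioMixer mixer;

    // Tween an exposed mixer parameter (in dB) over fadeTime seconds.
    // "MusicVolume" is an illustrative parameter name.
    public void FadeMusic(float targetDb, float fadeTime)
    {
        mixer.GetFloat("MusicVolume", out float current);
        DOTween.To(
            () => current,
            v => { current = v; mixer.SetFloat("MusicVolume", v); },
            targetDb,
            fadeTime);
    }
}
```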
Mixers are also switched through special Volume-zones that change the active mixer or its settings to suit the environment (for example, to have an echo effect in the basement).
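A simplified sketch of such a zone, assuming each zone references an AudioMixerSnapshot set up for its environment (for example, one with extra reverb for the basement):

```csharp
using UnityEngine;
using UnityEngine.Audio;

// Sketch of a Volume-zone: when the player enters the trigger,
// the mixer blends to the zone's snapshot and back on exit.
[RequireComponent(typeof(Collider))]
public class AudioVolumeZone : MonoBehaviour
{
    [SerializeField] AudioMixerSnapshot zoneSnapshot;     // e.g. "Basement" with echo
    [SerializeField] AudioMixerSnapshot defaultSnapshot;  // normal environment
    [SerializeField] float blendTime = 1.5f;

    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("Player"))
            zoneSnapshot.TransitionTo(blendTime);
    }

    void OnTriggerExit(Collider other)
    {
        if (other.CompareTag("Player"))
            defaultSnapshot.TransitionTo(blendTime);
    }
}
```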
We use a sound player driven by animation events.
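Roughly, such a player looks like the sketch below (the event/method name Footstep and the random pitch range are illustrative): an AnimationEvent is placed on the frames where the foot touches the ground and calls the method by name.

```csharp
using UnityEngine;

// Minimal sketch of an animation-event-driven sound player.
[RequireComponent(typeof(AudioSource))]
public class FootstepPlayer : MonoBehaviour
{
    [SerializeField] AudioClip[] footstepClips;
    AudioSource source;

    void Awake() => source = GetComponent<AudioSource>();

    // Called by an AnimationEvent named "Footstep" on the walk/run clips.
    public void Footstep()
    {
        var clip = footstepClips[Random.Range(0, footstepClips.Length)];
        source.pitch = Random.Range(0.95f, 1.05f); // small variation so steps don't repeat
        source.PlayOneShot(clip);
    }
}
```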
We went through several options for working with surround and, most importantly, physically based sound: we tried Steam Audio, FMOD, Microsoft Acoustics, and others.
In the end our choice was FMOD + Dolby Atmos + Microsoft Acoustics. Why not Steam Audio? We were getting unexplained crashes in Windows builds from their library, which we couldn’t debug.
What do we use it all for?
1) For automatic calculation of sound propagation in the environment, taking into account its geometry, materials and reflectivity (each object in the game is configured separately);
2) For spatial processing of audio using HRTF. In conjunction with Dolby Atmos, the sound is as natural as possible.
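In our project the HRTF processing itself is handled by the FMOD/Acoustics plugins, but as a generic illustration of the Unity-side setup, routing a source through a spatializer plugin looks roughly like this (a sketch assuming Unity's built-in audio path with a spatializer selected in the project's audio settings, not our FMOD setup):

```csharp
using UnityEngine;

// Rough sketch: route an AudioSource through the spatializer plugin
// (which does the HRTF processing) and make it fully 3D.
[RequireComponent(typeof(AudioSource))]
public class SpatializedSource : MonoBehaviour
{
    void Awake()
    {
        var source = GetComponent<AudioSource>();
        source.spatialize = true;   // hand the signal to the spatializer plugin
        source.spatialBlend = 1f;   // fully 3D positioning
    }
}
```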
How does it work?
The system takes into account the spatial position of the sound source, the position of the listener, the angle between them, and dozens of other parameters for correct perception on both headphones and 5.1 systems.
In addition, the system uses real-time ray tracing to determine how sound waves collide with and reflect off the geometry. To reduce the load, these calculations run on the GPU, and the result is baked into the scene (similar to Light Probes, but for sound).
Baking sound for a scene with thousands of 3D models and millions of polygons takes quite a long time, so we use Azure cloud computing. Fortunately, Microsoft has a great virtual-machine-based tool designed for exactly this: Azure Batch.
Using the cloud computing service, we reduced the sound bake time from 2 hours to 10 minutes, which saves a lot of time at a fairly low price (about 1.5 dollars per hour).
The combination of these technologies lets us achieve reasonably realistic sound, which in turn has a big effect on the atmosphere of the game.
You can see an example of how everything works together here (don’t pay attention to the main character’s voiceover; we have already replaced it, but haven’t published the update yet):