Regarding soundbank optimization, I asked a practitioner in a related industry, and his suggestion was: “Don’t split the soundbanks into too many parts. If it is only for categorization, I suggest categorizing inside the events. But since you already made them, you might as well just load them all in WwiseGlobal, and that also saves you from loading them again and again in scripts.”
I followed his suggestion, because after thinking it through, under my original setup wouldn’t something like a UI click have to load the UI Bank every time it was triggered? That is still okay while the project is small, but in a larger project the performance cost would blow up completely.
And I also asked around and found that different Soundbanks exist so that not everything stays loaded all the time, thereby reducing the runtime burden…
For example, when the player is near a town, you can load a SoundBank containing ambient sounds (voices, footsteps, marketplace atmosphere, etc.); when the player gradually moves away from the town, these sounds are no longer important, so the corresponding SoundBank can be unloaded, thereby reducing memory usage.
When switching between different scenes (for example forest / cave / town), different SoundBanks can be used respectively, loading the corresponding ambient sounds when the player enters an area, and unloading them when the player leaves, thereby avoiding unrelated audio occupying resources for a long time.
For UI sound effects (such as button click sounds), they are usually placed in a separate smaller SoundBank and loaded once during game initialization, rather than being loaded repeatedly every time they are triggered.
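The loading strategy above can be sketched as a small state-tracking bank manager. This is an illustrative Python sketch, not the Wwise API: in a real Unity/Wwise project these calls would map onto the integration’s bank load/unload functions, and the `BankManager` class and bank names here are hypothetical.

```python
# Illustrative sketch (not the Wwise API): persistent banks load once at
# init; area banks load and unload as the player enters and leaves areas.

class BankManager:
    def __init__(self):
        self.loaded = set()

    def load(self, bank):
        # In a real project this would call the engine's bank-loading API;
        # here we only track which banks are resident in memory.
        self.loaded.add(bank)

    def unload(self, bank):
        self.loaded.discard(bank)

banks = BankManager()

# Loaded once at game init -- UI sounds stay resident for the whole session.
banks.load("UI")

def on_enter_area(area):
    banks.load(area)     # e.g. town ambience, footsteps, marketplace

def on_exit_area(area):
    banks.unload(area)   # moving away: free the memory again

on_enter_area("Town")
on_exit_area("Town")
```

The point of the sketch is the asymmetry: “UI” is loaded once and never unloaded, while area banks churn with player movement.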
First, the most basic thing: enable “spatial (3D) sound” in the game, and make the volume decrease as the distance increases.
Adjust the curve so that the attenuation feels closer to real life.
In practice, this can also be used to control how much a player hears certain sounds. For example, if a campfire sound is not very important, we can make it attenuate faster to prevent it from distracting the player.
In reality, sound attenuation is non-linear, so we need to adjust the curve manually, making the change more noticeable at close range and more gradual at long range.
This can control whether the player “notices a sound or not.”
Important sounds (enemy/footsteps) → slower attenuation → easier to hear
Basically, attenuation is a design tool, not just a physical effect 🤔
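The idea that attenuation is a design tool can be sketched as a single tunable curve. This is a conceptual Python sketch, not how Wwise stores curves (there you draw them on an Attenuation ShareSet); the exponent here stands in for the curve’s shape, and all numbers are made up.

```python
# Illustrative sketch: distance attenuation as a tunable falloff curve.

def attenuated_gain(distance, max_distance, exponent):
    """Return a 0..1 gain. exponent > 1 falls off faster near the source
    (good for unimportant sounds like a campfire); exponent < 1 keeps
    important sounds (enemy footsteps) audible for longer."""
    if distance >= max_distance:
        return 0.0
    t = distance / max_distance  # normalized distance, 0 at the source
    return (1.0 - t) ** exponent

# Same distance, different design intent:
campfire  = attenuated_gain(20.0, 50.0, 3.0)  # fades out quickly
footsteps = attenuated_gain(20.0, 50.0, 0.5)  # still clearly audible
```

At the same 20 m distance, the footsteps curve keeps far more gain than the campfire curve, which is exactly the “important sounds attenuate slower” rule above expressed as math.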
Distance Affecting Timbre (Filter)
When the distance between the player and the sound source changes, what we hear doesn’t just change in volume.
As distance increases, the sound not only becomes quieter but also “muddier,” because high frequencies are more easily absorbed by the air, while low frequencies travel further. (Like when you’re next door to a live-music venue: you might not hear the song clearly, but you can easily hear the bass going “dum-dum.”)
In this case, we can use low-pass / high-pass filters to make the sound change dynamically with distance. For example, distant sounds are more muffled, while nearby sounds are clearer. So distance doesn’t just make a sound quieter, it also makes it blurrier.
In short, with distance, what we hear doesn’t only change in volume. Different frequency ranges penetrate differently, so we can also add appropriate low-cut and high-cut filtering.
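The distance-to-timbre mapping can be sketched as a function from distance to a low-pass cutoff frequency. This is an illustrative Python sketch, not the Wwise implementation (there it is the low-pass filter curve on the Attenuation ShareSet); the cutoff values are placeholder starting points.

```python
import math

# Illustrative sketch: distant sounds get "muddier" by lowering a
# low-pass cutoff as distance grows. Cutoff numbers are placeholders.

def lowpass_cutoff_hz(distance, max_distance,
                      near_cutoff=20000.0, far_cutoff=800.0):
    """Full bandwidth up close; only low frequencies survive far away."""
    t = min(distance / max_distance, 1.0)
    # Interpolate in log-frequency so the sweep sounds even to the ear.
    log_near, log_far = math.log(near_cutoff), math.log(far_cutoff)
    return math.exp(log_near + t * (log_far - log_near))
```

Interpolating in log-frequency rather than linearly matters: our hearing is roughly logarithmic in pitch, so a linear sweep would seem to do almost nothing for most of the distance and then collapse suddenly.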
Spread
If we don’t add spread, the sound feels very “point-like” (sharp), and the perceived direction snaps abruptly as the source moves left and right.
But in reality, within a certain range, sound doesn’t come strictly from a single point—it has some diffusion. So we should increase spread appropriately to make it more natural and avoid the sound jumping crazily from left to right.
Of course, sometimes we can choose not to add it. For example, I think collectible sounds in Resident Evil 9 don’t need spread: since the items are hidden, the player can locate them by sound direction. Similar puzzle designs could also use this as a clue.
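The usual shape of a spread curve (wide up close, collapsing to a point with distance) can be sketched like this. This is an illustrative Python sketch with made-up numbers; in Wwise, spread is another curve on the Attenuation ShareSet.

```python
# Illustrative sketch: spread narrows with distance. Near sources fill
# more of the stereo field; far sources collapse toward a point, which
# is what lets a player localize a hidden collectible by direction.

def spread_percent(distance, full_spread_radius=2.0, max_distance=30.0):
    """100% spread when the listener is on top of the source,
    tapering linearly to 0% (a pure point source) at max distance."""
    if distance <= full_spread_radius:
        return 100.0
    if distance >= max_distance:
        return 0.0
    t = (distance - full_spread_radius) / (max_distance - full_spread_radius)
    return 100.0 * (1.0 - t)
```

For the collectible case above, the design choice would simply be to keep this curve at or near 0% everywhere, so the direction cue stays sharp.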
ShareSet / Property Container
When a group of sounds (like footsteps or ambience) needs consistent rules, don’t tweak them one by one.
We can create a property container, or use a ShareSet, so that a whole category of sounds uses the same settings.
This way, we can keep a consistent style, update everything at once, and avoid chaos. If we need individual adjustments, we can still tweak them separately.
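The ShareSet idea boils down to reference semantics: many sounds point at one shared settings object, and local overrides handle exceptions. This is a conceptual Python sketch of that relationship, not the Wwise data model; the class and setting names are hypothetical.

```python
# Illustrative sketch of the ShareSet idea: many sounds reference one
# shared settings object, so editing it updates the whole category at
# once. A sound needing an exception overrides just that one field.

FOOTSTEP_SHARESET = {"max_distance": 25.0, "lowpass_at_max": 70, "spread": 40}

class Sound:
    def __init__(self, name, shareset, overrides=None):
        self.name = name
        self.shareset = shareset          # a reference, not a copy
        self.overrides = overrides or {}  # per-sound exceptions

    def setting(self, key):
        # Local override wins; otherwise fall through to the ShareSet.
        return self.overrides.get(key, self.shareset[key])

grass = Sound("footstep_grass", FOOTSTEP_SHARESET)
boss  = Sound("footstep_boss", FOOTSTEP_SHARESET, {"max_distance": 60.0})

# One edit propagates to every sound that uses the ShareSet:
FOOTSTEP_SHARESET["spread"] = 55
```

Because `grass` and `boss` hold a reference rather than a copy, the single edit at the bottom reaches both of them, while the boss keeps its individual `max_distance` override.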
Game audio is very different from film audio. It’s not just “playing sound,” but using distance / frequency / space / system management to control what the player hears, what they pay attention to, and what they ignore.
Listener
Wwise places the listener on the camera by default, but the camera is not always equal to the “player’s ears” (for example, in our current project The Verdant Trail).
For fixed camera or third-person views, using the camera directly may not be appropriate.
So we need to disable the listener on the camera rig, create an empty object, and attach the listener to it, so it follows the character (or the desired listening position).
This script is essentially creating an “ideal ear”:
The ear is on the player’s body (correct position), but the listening direction matches the camera (comfortable perception).
Because our hearing is related to orientation, the listener’s rotation affects spatial perception and weighting of sounds.
In 3D games, we usually let the “ears rotate with the camera,” so the world sounds natural to the player.
transform.rotation = mainCam.transform.rotation;
This line ensures that orientation and hearing direction are aligned.
But since our game uses a fixed camera, it’s actually… not that useful, because the camera direction is always fixed. However, in a first-person game, this would be very important.
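The “ideal ear” described above is just two assignments per frame: the listener copies the player’s position but the camera’s rotation. The real script is a Unity C# `Update()` doing exactly that (the `transform.rotation = mainCam.transform.rotation;` line quoted earlier); here is a language-agnostic Python sketch of the same logic, with a hypothetical `Transform` stand-in.

```python
# Illustrative sketch of the listener script's logic: ears on the
# player's body (correct distances), facing with the camera (correct
# panning). The Transform class is a stand-in for the engine's.

class Transform:
    def __init__(self, position=(0.0, 0.0, 0.0), rotation=(0.0, 0.0, 0.0)):
        self.position = position
        self.rotation = rotation  # Euler angles, for illustration only

def update_listener(listener, player, camera):
    listener.position = player.position   # distances measured from the body
    listener.rotation = camera.rotation   # left/right matches the screen

player   = Transform(position=(10.0, 0.0, 5.0))
camera   = Transform(position=(10.0, 12.0, -3.0), rotation=(45.0, 0.0, 0.0))
listener = Transform()

update_listener(listener, player, camera)  # run once per frame in practice
```

Note the deliberate mismatch: the listener ends up at the player’s position, not the camera’s, even though it takes the camera’s rotation. That split is the whole trick.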
Gameplay is divided between the more traditional visual-novel presentation of dialogue and the player’s free exploration. Dialogue plays out in visual-novel style: the background is a blurred image of the in-game camp, which saves us from needing too many illustrations. Conversations have a main backbone but can branch in different directions depending on the player’s dialogue choices. The characters involved are rendered behind the text bubble where their dialogue appears, as are Tyler’s dialogue options. Meanwhile, in exploration mode the player can roam the map without constraints, in a traditional top-down pixel-art style.
My initial idea is that we will have a main storyline, but players are not required to blindly follow it from start to finish. Of course, they can do so, but that would lead directly to a default ending.
If players want to explore more content, they will need to actively explore the map to gather additional information. In other words, exploration and the visual-novel experience are not two separate gameplay modes (by “separate modes” I mean something like Red Dead Redemption 2, where story mode and online mode are completely split). These two ways of playing exist simultaneously, and which one the player leans toward is entirely their own choice.
Rather than being a choice of gameplay styles, these two “modes” are more about controlling the density of information.
So far, based on Erik’s GDD, I can confirm that what we need is:
A main storyline
A collection of scattered side content, including some environment-based narrative elements (such as text triggered during interactions)
A multiple-ending system
With a clearly guided main storyline, we will likely need corresponding UI or HUD elements to track main story progress. If there are important side quests, those could be tracked as well. For example, in Stardew Valley, players can check accepted quests and see what items are required to complete them.
Sierra and Marcus both have a reputation bar scaling from 0 to 100. The gameplay depends on the progression of these bars, so it’s very important that the increments are paced appropriately to keep the gameplay logical and feasible. The player can raise reputation through dialogue and by giving appropriate items discovered on the map. The two other areas of the map are unlocked when certain reputation milestones are reached.
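The mechanic as described can be sketched in a few lines, which also makes the pacing question concrete: every increment value and milestone threshold below is a placeholder that would come out of Alex’s narrative work and tuning, not a decided number.

```python
# Illustrative sketch of the reputation mechanic: two 0-100 bars raised
# by dialogue and gifts, with map areas unlocking at milestones. All
# numbers here are placeholder values for tuning, not design decisions.

AREA_MILESTONES = {"area_2": 30, "area_3": 70}  # hypothetical thresholds

class Reputation:
    def __init__(self):
        self.bars = {"Sierra": 0, "Marcus": 0}

    def add(self, character, amount):
        # Clamp to the 0-100 range described in the GDD.
        self.bars[character] = max(0, min(100, self.bars[character] + amount))

    def unlocked_areas(self, character):
        rep = self.bars[character]
        return [area for area, need in AREA_MILESTONES.items() if rep >= need]

rep = Reputation()
rep.add("Sierra", 10)   # e.g. a good dialogue choice
rep.add("Sierra", 25)   # e.g. giving an appropriate item
```

Writing it out this way shows where the pacing lever actually sits: the ratio between typical increment sizes and the milestone thresholds determines how many interactions unlock each area.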
From this perspective, reputation seems to be a fairly important mechanic—this will heavily depend on Alex’s narrative work.
At the current stage, it’s probably not urgent to fully define how the reputation system works. Instead, we should first clarify more “system-level” elements, such as wolf packs, healing items, and similar mechanics. I think we could start by using Erik’s map design to sketch out a rough block-based map, then decide where wolf packs could be placed, and determine where healing items should appear based on the order in which players encounter wolves or other hostile creatures.
On the other hand, we also need to clarify how “combat” is supposed to work (or even whether we really have combat at all). It’s possible that what Erik wants is something like a turn-based RPG similar to FF4 or Octopath Traveler, or perhaps a system that relies purely on probability-based triggers instead.
I’ve been thinking about what game sound design should actually look like if I were to work on it.
Up to now, my personal work has been mainly focused on Foley and sound design for film and video. But game audio feels fundamentally different from film audio. If you simply drag sound files into the engine, the result often feels like the audio was added in post—detached from gameplay, not interacting with mechanics, not participating in the narrative, not affecting game feel. It just tells you “there is a sound here,” much like early beta versions of Minecraft.
In film sound design, the progression of sound is relatively linear: everything follows a timeline.
I think I’m currently stuck on the question of what kind of sound design can be considered truly integrated into the game experience.
That said, considering the use of a bitcrusher, perhaps what Erik wants isn’t a fully realistic soundscape. However, if the art direction ultimately moves away from a pixel-art style (since Yansong is considering 3D models with a 2.5D fixed camera, and it’s hard to predict the final visual outcome), a bitcrusher could end up feeling very out of place. This is something I’ll need to discuss with the TA.
I’ve also been thinking that if this is a third-person game with a fixed top-down camera, adding a large amount of close-range character detail sounds (such as cloth movement or small bodily motions) would pull the listener’s focus back toward the character’s body. This would reduce environmental depth and make the space sound “flat” and overly intimate, potentially causing confusion in spatial perception.
For this reason, I personally think that in this scenario, environmental audio should take priority over character-generated sounds.
Although this issue isn’t critical at the moment, I’ve briefly thought about some potential problems that might arise in sound design. This is more of a quick brainstorm than a finalized conclusion.
Overly loud or overly bright sound effects
Pushing every sound effect to maximum volume, with excessive high frequencies, can make the mix feel harsh and disconnected. It’s important to leave proper headroom for each sound effect and apply appropriate high-frequency attenuation.
Using only primary sound effects without supporting layers
For example, when a character lands after a jump, using only the “impact” sound without a subtle “falling air” or motion sound can make the action feel thin and overly “skin-tight” to the animation.
Lack of differentiation between environments
Using the same reverb or ambient noise for forests, deserts, and indoor spaces removes environmental identity from the sound. Without scene-specific acoustic characteristics, sound effects tend to feel pasted onto the visuals rather than integrated.
After implementation, try turning off the visuals and listening to the audio alone.
If the player can clearly tell what the character is doing, what environment they are in, and what the emotional tone is, then the sound design is successfully integrated, and that artificial, “stuck-on” feeling naturally disappears.