Support for Mumble positional audio #415
Comments
I'm not sure what OpenAL Soft would do for this to help Mumble integration. OpenAL only knows the position of active sound sources (or sound sources that the engine may play when necessary), but not game entities like players that may produce sound at some point. What would OpenAL need to do to help? |
A single Mumble client only needs to know the position of the avatar of the player or more precisely the listener's position. If you look at the source code for the Mumble Link plugin [1] it may become a bit clearer. |
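For readers without the plugin source at hand, here is a rough sketch of what publishing the listener position could look like. The field layout follows the `LinkedMem` struct as I remember it from the Mumble Link documentation, so check it against [1] before relying on it. `openal_to_mumble` and `write_listener` are hypothetical helpers, the Z flip assumes OpenAL's right-handed and Mumble's left-handed conventions, and the struct here lives in ordinary process memory rather than in the real shared mapping.

```python
import ctypes


class LinkedMem(ctypes.Structure):
    """Layout as recalled from the Mumble Link docs (assumption, verify first)."""
    _fields_ = [
        ("uiVersion", ctypes.c_uint32),
        ("uiTick", ctypes.c_uint32),
        ("fAvatarPosition", ctypes.c_float * 3),
        ("fAvatarFront", ctypes.c_float * 3),
        ("fAvatarTop", ctypes.c_float * 3),
        ("name", ctypes.c_wchar * 256),
        ("fCameraPosition", ctypes.c_float * 3),
        ("fCameraFront", ctypes.c_float * 3),
        ("fCameraTop", ctypes.c_float * 3),
        ("identity", ctypes.c_wchar * 256),
        ("context_len", ctypes.c_uint32),
        ("context", ctypes.c_ubyte * 256),
        ("description", ctypes.c_wchar * 2048),
    ]


def openal_to_mumble(vec):
    """Flip Z to go from a right-handed to a left-handed system (assumption)."""
    x, y, z = vec
    return (x, y, -z)


def write_listener(lm, position, at, up):
    """Copy a listener pose (OpenAL convention) into avatar and camera fields."""
    for field, vec in (("Position", position), ("Front", at), ("Top", up)):
        converted = openal_to_mumble(vec)
        getattr(lm, "fAvatar" + field)[:] = converted
        getattr(lm, "fCamera" + field)[:] = converted
    lm.uiTick += 1  # bump the tick so a reader can tell the data is fresh


lm = LinkedMem()
write_listener(lm, (1.0, 2.0, 3.0), (0.0, 0.0, -1.0), (0.0, 1.0, 0.0))
```

Identity and context are left empty here, which is the minimal case the thread is talking about. Writing the same pose into both the avatar and camera fields is only correct when the listener actually sits on the avatar.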
I see. But still, it seems to want information OpenAL simply doesn't have:
OpenAL doesn't have any way to generate unique IDs for everyone in a given game session, nor a context blob to indicate who should hear the given player/listener as positional. More generally, I'm also seeing a lack of synchronization with the given example code, how is whoever reads the shared memory supposed to avoid reading a partial update? |
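To make the synchronization concern concrete: one generic way to let a reader detect a partial update is a sequence counter (a seqlock). The sketch below is a single-process toy model of that idea. Nothing in this thread establishes that Mumble Link works this way; `SeqlockRecord` and the helper names are hypothetical, and a real shared-memory version would also need proper memory barriers.

```python
class SeqlockRecord:
    """Toy stand-in for a shared record guarded by a sequence counter."""

    def __init__(self):
        self.seq = 0  # even: stable, odd: a write is in progress
        self.position = [0.0, 0.0, 0.0]

    def begin_write(self):
        self.seq += 1  # counter becomes odd

    def end_write(self):
        self.seq += 1  # counter becomes even again


def write_position(rec, pos):
    """Writer side: bracket the update with two counter increments."""
    rec.begin_write()
    rec.position[:] = pos
    rec.end_write()


def try_read_position(rec):
    """Return a consistent snapshot, or None if a write overlapped the read."""
    before = rec.seq
    if before % 2:  # writer is in the middle of an update
        return None
    snapshot = tuple(rec.position)
    if rec.seq != before:  # writer ran while we were copying
        return None
    return snapshot
```

A reader that gets `None` simply retries on its next poll. Without some scheme like this, a reader could combine a new X coordinate with an old Z coordinate.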
As far as I know identity and context are optional. If you leave them empty the positional audio should still work. This is what I was talking about in
Good question, I don't know if this case is handled properly. I realized that EFX support as I was talking about in the first post wouldn't be possible without a change to the protocol. For some reason I thought the context field could be used for this but having two players in different EFX environments would disable positional audio because the context field would no longer match. So EFX support would require a change to the Mumble protocol. There is a related issue about enhancing the positional audio feature [1]. I forgot to mention that there is a helper tool [2] to test the positional audio feature of Mumble. [1] mumble-voip/mumble#3234 |
From what the wiki says, the context is needed to know who on a server should hear you positionally. If some generic context is given, it would always match everyone else, so everyone on the same Mumble server using OpenAL Soft would hear each other positionally, even when playing different games (or different instances or levels/maps of the same game). But either way, there's another issue: if an app has Mumble support itself while OpenAL Soft also tries to access Mumble, that would create a conflict. The only way it could possibly work is if the app knows about Mumble to tell OpenAL it doesn't use Mumble. That, on top of the apparent synchronization issue.
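The two context problems raised in this thread can be sketched with a toy model. It assumes, as the discussion does, that two clients are positional to each other only when their context blobs are byte-for-byte equal. The context strings are made up for illustration.

```python
def hears_positionally(ctx_a: bytes, ctx_b: bytes) -> bool:
    """Assumed rule: positional audio only between identical contexts."""
    return ctx_a == ctx_b


# A library-wide generic context matches across unrelated games.
GENERIC = b"openal-soft"  # hypothetical value
racing_game_ctx = GENERIC
shooter_game_ctx = GENERIC
cross_game_match = hears_positionally(racing_game_ctx, shooter_game_ctx)

# Packing EFX state into the context splits players on the same map.
player_in_cave = b"map1|reverb=cave"  # hypothetical encoding
player_in_hall = b"map1|reverb=hall"
same_map_match = hears_positionally(player_in_cave, player_in_hall)
```

Here `cross_game_match` is `True` (the unwanted case for a generic context) and `same_map_match` is `False` (why EFX data cannot ride along in the context field).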
For this kind of thing, it needs to be tied into the game logic. EFX is more than just applying a reverb to what the listener hears. An app generally may only use a preset reverb environment that it occasionally changes (if any at all), but it's entirely possible for it to use a few that it dynamically updates to create a more detailed audio scene. You also don't want to use a reverb per-source, but have a small set number of active reverbs (1 to 3 or 4 at most, associated with particular environments near the listener) that sources feed into, with filters to obstruct sounds given the map geometry and materials. OpenAL Soft doesn't know how sound sources apply to reverb environments except by what the app tells it to do at a given point in time, so the app would have to control the properties of the source that's playing the voice stream. |
A simple solution would be to only have players playing the same game and map in a particular chatroom. Not convenient but not too bad either.
This also came to my mind. One possible solution would be to deactivate Mumble positional audio by default and only turn it on with a setting in alsoft.ini. I think if a game has alsoft.ini next to its executable, this file takes precedence over the other alsoft configuration files. This mechanism could be used to enable Mumble positional audio for specific games.
I understand. Well, it was worth a try. |
Positional audio "identification" is a game logic aspect, not an audio API one. |
To do positional audio, Mumble needs to know the player's position in the virtual world. This information is also used by OpenAL to determine the listener's position. However, at least the following information would be missing in OpenAL:
Additionally, positional information would be incorrect if the listener's position doesn't equal the avatar's position (e.g. when the listener's position is bound to the camera in third person view or cinematics). I'll close this issue since only the minimal feature set of Mumble's positional audio could possibly be supported and there might even be a discrepancy between the listener's position and the avatar's position. |
Ironically enough, that's already a normal problem to handle even without the whole "voip" aspect (so I don't think that should be a problem more than it isn't already for normal audio output, which has in fact some knobs to try to put up with it). Anyhow, just wanted to point out that I just realized that at least some old EAX game.. did use to have some kind of "predisposition" about voice transmission (even though how it worked is very unclear). |
@mirh There is a similar video of Thief2x where EAX Voice has probably been used. I guess the Creative driver just applies currently active EAX effects to the input stream. |
Yes, that totally seems to be it and it should have been compatible with every EAX game (you could control the strength with the "Mic environment FX" slider in the X-Fi control panel). But then EAX voice is also this supposed “3D Voice Over IP” thing? And it really seems like on point to this feature request. I couldn't find any report of such thing ever being used though, assuming it even ever shipped in the drivers in the first place. |
Could be an internal API. I don't see any mention of EAX Voice, VoIP or similar features in the specification. |
The way how the 3D Voice over IP feature is described, it might actually refer to a game-engine feature decoupled from Creative's driver or EAX. They might mean the following: If a game has an internal 3D VoIP feature (like many do) and you additionally enable EAX Voice, the other players can hear your voice with EAX effects since they are simply applied to your microphone stream. Should I reopen this issue and rename it |
The feature sounds like it is stated separately, and they underline how it allows proper directionality of communications (whether "talk too loud your enemies will hear you" is true or just rhetorical is unclear, but at least it should work with squad mates). It's true though that this might just be a totally game-internal thing then (and this may even be why, for once, they aren't suffixing the feature with the n-th ®).. But if that were the case, shouldn't it already work with just EAX enabled in openal-soft/dsoal? And maybe that answer should also determine how the far simpler EAX voice gets implemented (which, as cheap, trashy and exaggerated as it may be.. does sound like a cool idea eventually). Like, of course it shouldn't be too hard to implement inside of openal-soft capture.. but then of course the question is how Mumble could even access it (can the game context be shared? can you implement an interface that can be universally hooked for every game? would you eventually still need a dedicated plugin for each one?) |
@mirh I'd be nicer about the people doing favors 👀💦 I really like the positional 3D voice chat idea, but tbh hearing myself would probably get annoying real fast so I'd rather just hear the subtle reverb/echo without the "direct sound", while teammates obviously hear both direct sound and reverberation. I can't wait for Mumble to get OpenAL. The positional surround mix is atrocious (massive leaking that just seems like stereo repeat with subtle attenuation, which results in awful positioning) and the headphones mode just sounds like crossfeed. And it's not any better in-game either. Sector's Edge is the only game I know that pulled off the whole shebang (proximity, 3D HRTF, reverb, occlusion, etc). On a side note, apparently UT2004 requires an OpenAL patch to fix voice chat, so I wonder if it could be fixed on OpenAL Soft's end. @kcat Have you looked into this by any chance? |
Depending on how the effects are being applied, it should be able to work with any EAX version. If all it's doing is using capture to record voices from users, it would simply play what it's capturing as any other sound. I don't know what "EAX Voice" is, whether it's some hardware feature or something added to some middleware, but the idea of playing back voice chatter in 3D with environmental effects is not much different than any other 3D sound. Just capture in mono, and stream it back out as a mono buffer (and stream it to other players for voice chat), which can be treated as any other mono sound. The only difference is the audio source being a capture device or a stream from another player instead of a file, all the same 3D effects can still apply. You could even filter out the direct path of the local player only, to only hear your own reverberation in game while other players get both the direct path and reverb of your voice. Though that would sound odd if you make recordings, your direct capture isn't being mixed into the recording so the video would also only have the reverb and not your direct voice.
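The "reverb only for your own voice" idea can be written down as a pair of gains per voice stream. This is a toy model with hypothetical function names. In EFX terms the direct gain would presumably map to a filter on the source's direct path and the send gain to its auxiliary send into the reverb slot, but the sketch does not call OpenAL at all.

```python
def voice_gains(speaker_id, listener_id):
    """Return (direct_gain, reverb_send_gain) for one voice at one listener."""
    if speaker_id == listener_id:
        return (0.0, 1.0)  # own voice: mute the direct path, keep the reverb
    return (1.0, 1.0)  # other players: direct path plus reverb


def mix_sample(dry, wet, gains):
    """Combine one dry (direct) and one wet (reverb) sample using the gains."""
    direct_gain, send_gain = gains
    return dry * direct_gain + wet * send_gain


own_voice = mix_sample(0.5, 0.25, voice_gains("alice", "alice"))
teammate_voice = mix_sample(0.5, 0.25, voice_gains("bob", "alice"))
```

With these numbers the local player hears only the 0.25 reverb contribution of their own voice, while a teammate's voice comes through as 0.75 (direct plus reverb).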
Those are some odd changes. Allows modifying a buffer while it's attached to a source (out of spec, and pretty dangerous). Adds a buffer query to get a source ID that it's attached to, even though a buffer can be attached to multiple sources. Not sure where that comes from since it doesn't seem to be used... maybe an old query that was being contemplated but never made official, and UT2k4 used despite never being official? And also a couple hacks to always return 9 queued buffers for non-playing sources, and return all buffers processed for sources in an |
The "Microphone Environment FX" part of eax voice is going to work with any game that has EAX reverb, that is already mentioned (on EAX4+ it should also support Multi-Environment® but I digress). What I was really bewildered about was the other feature, which is the so called "3D voice over ip".
I would hold your horses for that. While I guess that's one way things could be smoothed out, that doesn't seem required or even necessarily preferable.
mumble-voip/mumble#1933
That's interesting and might or might not be related.. A patch straight from creative certainly sounds like the place you could expect this. On the other hand I couldn't find much inside the v3369 ALaudio.dll, except of checking for EDIT: uh, btw, are these quirks still a thing? |
@kcat Then was that patch just a bunch of desperate (even some useless?) attempts to fix it or just the adequate solution to an insane OpenAL implementation? I'm not sure if this also affects other libraries, though. So could OpenAL Soft implement a proper fix or would that break other games, unless the patch only applied to UT2004.exe or a game_compat flag? I'm just worried we might need to stick to that old patch for proper OpenAL and avoid VC crashes in that game @mirh I hope this applied the other players' reverb locally instead of transmitting the voice with the reverb baked-in, because I doubt possibly lossy compression would've done reverb quality any favors |
Don't know. I imagine at least some of it is trying to work around bugs in the game that rely on non-standard behavior. But how much of it is actually necessary, I can't say (I don't imagine reducing the sample count in alcCaptureOpenDevice is).
The changes would very likely break other games, as it assumes a buffer queue size for non-playing sources, and returns incorrect processed buffer counts for |
The patch for UT2004 voice chat was highly specific to UT2004 patch 3369. Since the game is a closed source commercial app, there was not much hope of fixing the bugs inside the engine. It is absolutely a hack and should not be used with other games. It was made by creating a debug build of OpenAL, setting breakpoints in Visual Studio, and inspecting the audio buffers and hard coding some values based on what the UT2004 engine was doing vs what was expected inside OpenAL. The UT2004 engine was expecting and ignoring a certain buffer size, which is where the hardcoded '9' comes from. It was also using byte size instead of sample size when allocating a buffer, which is where the divide by 4 came from. It was bad coding in UT2004, which somehow worked in older versions of OpenAL but has been broken since. |
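The byte-versus-sample mix-up is easy to show with arithmetic. The capture buffer size passed to `alcCaptureOpenDevice` is counted in sample frames, so an app that passes a byte count asks for a buffer several times larger than intended. The helper below is hypothetical. A factor of 4 would fit, for example, 16-bit stereo or 32-bit mono, but the thread does not say which format UT2004 used.

```python
def bytes_to_frames(byte_count, channels, bytes_per_sample):
    """Convert a buffer size in bytes to a size in sample frames."""
    frame_size = channels * bytes_per_sample
    if byte_count % frame_size:
        raise ValueError("byte count is not a whole number of frames")
    return byte_count // frame_size


# Assumed example format: 16-bit stereo, i.e. 4 bytes per frame.
intended_frames = bytes_to_frames(4096, channels=2, bytes_per_sample=2)
```

With the assumed format, 4096 bytes correspond to 1024 frames, which is the divide by 4 described above.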
@zenakuten Was it still using bad code with the X-Fi/EAX 5.0 patch by Creative? |
Not sure, I never tested with Creative's build. The voice chat issue was only with the 64 bit build of UT2004. I'm not sure Creative released a 64 bit build. Voice chat worked fine with the 32 bit version. |
Oh, I thought that 3369 was the build number of Creative's patch; instead it's just the last one generically. Back to us though, a lightbulb turned on in my mind. And after browsing the code (to be sure it isn't the one of the EAX patch, but still), well.. it really seems like UT2004 did have voice spatialization ready already in the base game, provided that you enabled that ALAudio setting I mentioned in my previous message. Though I believe that the server only signals it can accept the extra information if the actor's "active room" is local. With all this said then (and assumed for the sake of the argument) it appears that 3D voip is independent from EAX and EAX voice. Could it really be that you can insert extra sound sources into the sound stage from the outside (and safely)? Fun fact: there's no mention whatsoever of EAX5.0 in the X-Fi patch (while there's a reference to a certain CISACTAudioDrv) |
Ok I feel like I'm a moron, and the last function of the "unreal datachannel implementation" (
|
UT has three voice chat modes - Public, Team, and Local. Admins always send to the Public channel. Public and Team are never spatialized since everyone is supposed to hear you regardless of in-game distance. Local is meant to play to those near you and volume should fall off as distance increases. |
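As a rough illustration of the three modes, the sketch below gives Public and Team a constant gain and lets Local fall off with distance. The falloff uses the inverse distance clamped formula from the OpenAL 1.1 distance models. Whether UT2004 uses this exact curve is not stated anywhere in the thread, and the parameter values are placeholders.

```python
SPATIALIZED_MODES = {"Local"}  # Public and Team ignore in-game distance


def inverse_distance_clamped(distance, ref_dist=1.0, rolloff=1.0, max_dist=100.0):
    """OpenAL-style inverse distance clamped attenuation."""
    d = min(max(distance, ref_dist), max_dist)
    return ref_dist / (ref_dist + rolloff * (d - ref_dist))


def voice_gain(mode, distance):
    """Gain applied to a voice stream for the given chat mode and distance."""
    if mode not in SPATIALIZED_MODES:
        return 1.0
    return inverse_distance_clamped(distance)
```

This covers attenuation only. Placing the voice in a direction relative to the listener would additionally need the speaker's position.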
AFAIU spatialization should also be positioning, not just attenuation. And yes, that distinction makes sense and I can guess it's the first part of my conditional. The question here though (be as it is that UT2004 is the only known game with anything remotely similar) is what EAX voice is.
Like.. It would be just so fucking easy to say "eax voice is just bragging that you can do in-game voip with Microphone Environment FX" (i.e. the thing that applies EAX effects to your own audio input). |
Easier to sell new hardware if it's couched in language that sounds like newer hardware is required, making people think they and their friends need to upgrade their audio card for it. That text blurb reeks of marketing speak, and is self-contradictory. It works by "feeding the microphone input into the EAX® hardware effects engine and at that point becomes another gaming sound element", "then transmitted along with the environment properties over the LAN", "their EAX ADVANCED HD 5.0 compliant card translates the environment properties and adds them to your voice in real-time!" So... which is it? Does it apply the effect to the microphone input on the speaker's end, which is then transmitted to other players, or are the environment properties sent over the LAN with the plain voice stream for the other player's card to apply the effect to your voice in real-time on their end? And for the other player, "their EAX ADVANCED HD 5.0 compliant card translates the environment properties and adds them to your voice in real-time", but it works with games that "support the 3D voice transmission and EAX® ADVANCED HD™ 4.0 or above". So... does it need EAX 5 or EAX 4? It really sounds like some marketers trying to up-sell newer hardware, without knowing (or not wanting to divulge) how it actually works; keep it vague and mysterious to make it seem like it's new magic added by new hardware that can't possibly work on what you already have. To be fair, EAX 4.0 is needed for multiple reverb environments. EAX 3 and earlier could only set a single reverb environment at a time, and each source can only choose to send to it at some gain level. The voice stream can't have a separate reverb for a different environment than the one the listener is in prior to EAX 4. Though even there, EAX 4 and 5 are limited to 4 distinct simultaneous effects, so there's still a limit to the number of voice streams that can apply their intended reverb. |
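The slot limit mentioned at the end can be sketched as a small allocation problem: voice streams that share an environment can share one reverb, but once four distinct environments are in use, further ones have nowhere to go. `assign_reverb_slots` and the environment names are hypothetical, and a real engine would likely pick slots by proximity to the listener rather than by arrival order.

```python
def assign_reverb_slots(stream_envs, max_slots=4):
    """Map each voice stream to a slot index, or None when no slot is left."""
    slot_for_env = {}
    assignment = {}
    for stream, env in stream_envs.items():
        if env not in slot_for_env:
            if len(slot_for_env) >= max_slots:
                assignment[stream] = None  # no free slot for this environment
                continue
            slot_for_env[env] = len(slot_for_env)
        assignment[stream] = slot_for_env[env]
    return assignment


streams = {
    "alice": "cave",
    "bob": "hall",
    "carol": "cave",
    "dave": "sewer",
    "erin": "arena",
    "frank": "forest",
}
slots = assign_reverb_slots(streams)
```

In this example alice and carol share slot 0, bob, dave and erin get slots 1 to 3, and frank ends up without a slot, so his voice could not get its intended reverb.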
I actually think that they aren't contradicting, because with an incredible lot of charity you may interpret the first sentence to just describe what happens to your voice for "local purposes" (could it be that "mic fx" doesn't so much interact with the game, as it rather inserts your own voice into the final reverb mix that is sent to the speakers?) while the second describes how it's transmitted remotely. While "environment properties" may or may not be a reference to the "Environmental Audio Extensions", and could mean pretty much anything (including literally just the position, or perhaps some vector from the player).
It's just nuts to try to understand whether an EAX-over-IP feature existed just in the mind of some overzealous marketer/journalist, or for real (perhaps in the much coveted and legendary EAX 5.0 SDK). If not any it could somehow explain the only somewhat clear technical detail of the setup which is playing over LAN (like, I could reckon they just hacked up some bullshit together, which needed far too much network bandwidth for 2006 DSL not to start lagging). But we don't have to care. Mumble already supports positional audio, and even if you had to constantly stream the EAX room setting (because, say, games are player-centric and they weren't coded to track anything more than the footsteps of others?), that in itself sounds fairly trivial. What is actually the super hardest nut to crack, is the other thing: microphone environment FX, hooking up the game scene.
Not sure if you would even want to sound like in a cave while you are in unrelated calls, but I guess they aren't even exclusive. |
Mumble [1] is a FOSS voice chat software. One particular feature of it is that it can be hooked up to games/applications and transmit positional data of a player's avatar so that all participants hear speech from other players as if it was coming from another player's avatar. Currently there exist two options that I'm aware of to make a game compatible with this feature:
I'm now wondering if it would be possible and make sense to integrate Mumble positional audio support into OpenAL Soft. As an audio library it already has access to the most important data. The Mumble protocol has fields for additional metadata which could for example be EFX parameters to play speech with EFX effects (although this would require support in the Mumble client, e.g. through an OpenAL backend which has been proposed but not included yet [3]). This would enable something similar to the EAX voice feature demonstrated here [4].
Besides positional data, contextual information is often used with Mumble to e.g. separate positional audio for competing teams in the same chat room. This would be data OpenAL Soft wouldn't have access to and thus wouldn't be able to provide. IMO this wouldn't be a dealbreaker, since many supported games don't support this feature anyway [5].
Integrating Mumble positional audio support in OpenAL Soft would however make positional audio in Mumble available to many games at once without further modification.
[1] https://www.mumble.info/
[2] https://wiki.mumble.info/wiki/Link
[3] mumble-voip/mumble#1933
[4] https://www.youtube.com/watch?v=30fTc5t5QNU
[5] https://wiki.mumble.info/wiki/Games#Supported_games