Support for Mumble positional audio #415

Closed
Hiradur opened this issue Apr 12, 2020 · 29 comments

Comments

@Hiradur

Hiradur commented Apr 12, 2020

Mumble [1] is FOSS voice chat software. One particular feature is that it can be hooked up to games/applications to transmit the positional data of a player's avatar, so that all participants hear speech from other players as if it were coming from the speaker's avatar. Currently there are two options that I'm aware of to make a game compatible with this feature:

  • Integrate the Mumble Link Plugin [2] into the source code -> requires code access
  • Find out the memory addresses of positional data and have Mumble pull the data from there -> requires time-consuming address hunting and potentially an update of the memory addresses with every game update

I'm now wondering if it would be possible and make sense to integrate Mumble positional audio support into OpenAL Soft. As an audio library it already has access to the most important data. The Mumble protocol has fields for additional metadata which could for example be EFX parameters to play speech with EFX effects (although this would require support in the Mumble client, e.g. through an OpenAL backend, which has been proposed but not included yet [3]). This would enable something similar to the EAX voice feature demonstrated here [4].

Besides positional data, contextual information is often used with Mumble, e.g. to separate positional audio for competing teams in the same chat room. This would be data OpenAL Soft wouldn't have access to and thus wouldn't be able to provide. IMO this wouldn't be a dealbreaker; many supported games don't support this feature anyway [5].

Integrating Mumble positional audio support in OpenAL Soft would however make positional audio in Mumble available to many games at once without further modification.

[1] https://www.mumble.info/
[2] https://wiki.mumble.info/wiki/Link
[3] mumble-voip/mumble#1933
[4] https://www.youtube.com/watch?v=30fTc5t5QNU
[5] https://wiki.mumble.info/wiki/Games#Supported_games

@kcat
Owner

kcat commented Apr 12, 2020

I'm not sure what OpenAL Soft would do for this to help Mumble integration. OpenAL only knows the position of active sound sources (or sound sources that the engine may play when necessary), but not game entities like players that may produce sound at some point. What would OpenAL need to do to help?

@Hiradur
Author

Hiradur commented Apr 13, 2020

A single Mumble client only needs to know the position of the avatar of the player or more precisely the listener's position.
The positions of other participants of a chat room are received from the other clients. The Mumble client then knows the position of its corresponding player as well as the positions of all other participants and uses this to render each participant's speech with positional audio.

If you look at the source code for the Mumble Link plugin [1] it may become a bit clearer.

[1] https://wiki.mumble.info/wiki/Link
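
One practical detail when handing a listener position from OpenAL to Mumble: OpenAL uses a right-handed coordinate system, while the Link documentation describes a left-handed one (X right, Y up, Z front, 1 unit = 1 meter). A minimal sketch of the conversion follows, assuming the game already works in meters. `Vec3`, `rh_to_lh` and `store_vec3` are hypothetical names for illustration and are not part of OpenAL or Mumble.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 3-component vector; not a type from OpenAL or Mumble. */
typedef struct { float x, y, z; } Vec3;

/* Convert a right-handed vector (OpenAL convention) to a left-handed one
 * (the convention described for Mumble Link) by mirroring the Z axis.
 * The same flip applies to positions and to direction vectors. */
Vec3 rh_to_lh(Vec3 v)
{
    Vec3 r = { v.x, v.y, -v.z };
    return r;
}

/* Copy a vector into a float[3], such as a position field in shared memory. */
void store_vec3(float dst[3], Vec3 v)
{
    assert(dst != NULL);
    dst[0] = v.x;
    dst[1] = v.y;
    dst[2] = v.z;
}
```

A game with a different unit scale would also need to multiply by its own units-to-meters factor, which an audio library cannot know on its own.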

@kcat
Owner

kcat commented Apr 13, 2020

I see. But still, it seems to want information OpenAL simply doesn't have:

// Identifier which uniquely identifies a certain player in a context (e.g. the ingame name).
wcsncpy(lm->identity, L"Unique ID", 256);
// Context should be equal for players which should be able to hear each other positional and
// differ for those who shouldn't (e.g. it could contain the server+port and team)
memcpy(lm->context, "ContextBlob\x00\x01\x02\x03\x04", 16);

OpenAL doesn't have any way to generate unique IDs for everyone in a given game session, nor a context blob to indicate who should hear the given player/listener as positional. More generally, I'm also seeing a lack of synchronization in the given example code. How is whoever reads the shared memory supposed to avoid reading a partial update?
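
For illustration only, one generic way a shared-memory reader can detect a partial update is a sequence counter (a "seqlock"). This is not something the Link example does, and the thread does not establish how Mumble handles the problem. The sketch below assumes C11 atomics; `SharedPos` is a hypothetical layout, not the real Link structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared block; NOT the real Mumble Link layout. */
typedef struct {
    atomic_uint seq;      /* even = stable, odd = a write is in progress */
    float position[3];
} SharedPos;

/* Writer: make seq odd, write the payload, then make seq even again.
 * Strictly, the payload should also use relaxed atomic accesses to avoid
 * a formal data race; plain memcpy keeps the sketch short. */
void shared_write(SharedPos *s, const float pos[3])
{
    assert(s != NULL);
    unsigned start = atomic_load_explicit(&s->seq, memory_order_relaxed);
    atomic_store_explicit(&s->seq, start + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    memcpy(s->position, pos, sizeof s->position);
    atomic_store_explicit(&s->seq, start + 2, memory_order_release);
}

/* Reader: returns true only if a consistent snapshot was copied to out.
 * On false, the caller should simply try again later. */
bool shared_try_read(SharedPos *s, float out[3])
{
    assert(s != NULL);
    unsigned before = atomic_load_explicit(&s->seq, memory_order_acquire);
    if (before & 1u)
        return false;                 /* writer is mid-update */
    memcpy(out, s->position, sizeof s->position);
    atomic_thread_fence(memory_order_acquire);
    unsigned after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    return before == after;           /* unchanged: snapshot is whole */
}
```

A scheme like this only works if both sides agree on it, so it could not be added to an existing shared-memory layout from one side alone.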

@Hiradur
Author

Hiradur commented Apr 14, 2020

As far as I know identity and context are optional. If you leave them empty the positional audio should still work. This is what I was talking about in

Besides positional data, contextual information is often used with Mumble, e.g. to separate positional audio for competing teams in the same chat room. This would be data OpenAL Soft wouldn't have access to and thus wouldn't be able to provide. IMO this wouldn't be a dealbreaker; many supported games don't support this feature anyway [5].

More generally, I'm also seeing a lack of synchronization in the given example code. How is whoever reads the shared memory supposed to avoid reading a partial update?

Good question, I don't know if this case is handled properly.

I realized that EFX support as I was talking about in the first post wouldn't be possible without a change to the protocol. For some reason I thought the context field could be used for this but having two players in different EFX environments would disable positional audio because the context field would no longer match. So EFX support would require a change to the Mumble protocol. There is a related issue about enhancing the positional audio feature [1].

I forgot to mention that there is a helper tool [2] to test the positional audio feature of Mumble.

[1] mumble-voip/mumble#3234
[2] https://github.com/mumble-voip/mumble-pahelper

@kcat
Owner

kcat commented Apr 14, 2020

As far as I know identity and context are optional. If you leave them empty the positional audio should still work.

From what the wiki says, the context is needed to know who on a server should hear you positionally. If some generic context is given, it would always match everyone else so everyone on the same Mumble server using OpenAL Soft would hear each other positionally, even when playing different games (or different instances or levels/maps of the same game).
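
To make the matching concrete, here is a small sketch of how a game could build a context and how two contexts compare. The "game|server|team" layout and the names `build_context`, `same_context` and `CONTEXT_MAX` are assumptions for illustration, loosely following the "server+port and team" hint in the Link example. They are not Mumble's actual format or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define CONTEXT_MAX 256  /* assumed capacity of the context field */

/* Build "game|server|team" into buf.
 * Returns the length written, or 0 if it does not fit. */
size_t build_context(char *buf, size_t cap,
                     const char *game, const char *server, int team)
{
    assert(buf != NULL && game != NULL && server != NULL);
    int n = snprintf(buf, cap, "%s|%s|%d", game, server, team);
    if (n < 0 || (size_t)n >= cap)
        return 0;
    return (size_t)n;
}

/* Two players hear each other positionally only if their contexts are
 * byte-for-byte identical. */
bool same_context(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

A library that cannot fill in the game, server or team would have to write the same constant for every application, and `same_context` would then return true for everyone.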

But either way, there's another issue that if an app has Mumble support itself while OpenAL Soft also tries to access Mumble, that would create a conflict. The only way it could possibly work is if the app knows about Mumble to tell OpenAL it doesn't use Mumble. That, on top of the apparent synchronization issue.

I realized that EFX support as I was talking about in the first post wouldn't be possible without a change to the protocol. For some reason I thought the context field could be used for this but having two players in different EFX environments would disable positional audio because the context field would no longer match.

For this kind of thing, it needs to be tied into the game logic. EFX is more than just applying a reverb to what the listener hears. An app generally may only use a preset reverb environment that it occasionally changes (if any at all), but it's entirely possible for it to use a few that it dynamically updates to create a more detailed audio scene. You also don't want to use a reverb per-source, but have a small set number of active reverbs (1 to 3 or 4 at most, associated with particular environments near the listener) that sources feed into, with filters to obstruct sounds given the map geometry and materials. OpenAL Soft doesn't know how sound sources apply to reverb environments except by what the app tells it to do at a given point in time, so the app would have to control the properties of the source that's playing the voice stream.

@Hiradur
Author

Hiradur commented Apr 15, 2020

From what the wiki says, the context is needed to know who on a server should hear you positionally. If some generic context is given, it would always match everyone else so everyone on the same Mumble server using OpenAL Soft would hear each other positionally, even when playing different games (or different instances or levels/maps of the same game).

A simple solution would be to only have players playing the same game and map in a particular chatroom. Not convenient but not too bad either.

But either way, there's another issue that if an app has Mumble support itself while OpenAL Soft also tries to access Mumble, that would create a conflict. The only way it could possibly work is if the app knows about Mumble to tell OpenAL it doesn't use Mumble.

This also came to my mind. One possible solution would be to deactivate Mumble positional audio by default and only turn it on with a setting in alsoft.ini. I think if a game has alsoft.ini next to its executable, this file takes precedence over the other alsoft configuration files. This mechanism could be used to enable Mumble positional audio for specific games.

For this kind of thing, it [EFX] needs to be tied into the game logic

I understand. Well, it was worth a try.

@mirh

mirh commented Jun 15, 2021

Positional audio "identification" is a game logic aspect, not an audio API one.
I'm still scrambling for a sense here.
(besides, there's an awfully small number of openal games too)

@Hiradur
Author

Hiradur commented Jun 26, 2021

To do positional audio, Mumble needs to know the player's position in the virtual world. This information is also used by OpenAL to determine the listener's position.
So I guess games hand off the same vectors to both OpenAL and Mumble and I figured if that information was redundant then OpenAL Soft might as well integrate Mumble positional audio support. This would eliminate the need to have to integrate Mumble positional audio support into every game or write Mumble plugins that extract the necessary information from the game's memory.

However, at least the following information would be missing in OpenAL:

  • listener's orientation
  • identity (as defined by Mumble)
  • context (as defined by Mumble)

Additionally, positional information would be incorrect if the listener's position doesn't equal the avatar's position (e.g. when the listener's position is bound to the camera in third person view or cinematics).

I'll close this issue since only the minimal feature set of Mumble's positional audio could possibly be supported and there might even be a discrepancy between the listener's position and the avatar's position.

Hiradur closed this as completed Jun 26, 2021
@mirh

mirh commented Aug 24, 2024

there might even be a discrepancy between the listener's position and the avatar's position.

Ironically enough, that's already a normal problem to handle even without the whole "voip" aspect (so I don't think that should be a problem more than it isn't already for normal audio output, which has in fact some knobs to try to put up with it).

Anyhow, just wanted to point out that I just realized that at least some old EAX game.. did use to have some kind of "predisposition" about voice transmission (even though how it worked is very unclear).
https://www.youtube.com/watch?v=30fTc5t5QNU
https://web.archive.org/web/20070601013325/http://www.soundblaster.com/eax/abouteax/eax5ahd/eax5_2.asp

@Hiradur
Author

Hiradur commented Sep 10, 2024

@mirh There is a similar video of Thief2x where EAX Voice has probably been used. I guess the Creative driver just applies currently active EAX effects to the input stream.

@mirh

mirh commented Sep 10, 2024

Yes, that totally seems to be it and it should have been compatible with every EAX game (you could control the strength with the "Mic environment FX" slider in the X-Fi control panel).

But then EAX voice is also this supposed “3D Voice Over IP” thing? And it really seems like on point to this feature request.
It's presumable that it could only have worked with the in-game voice chat (even though there's no hard evidence and they could have as well hacked.. uh, what was even in vogue back then? TS2 and ventrilo?) but to be sure in 2024 only mumble would make sense and could even hope to do it.

I couldn't find any report of such thing ever being used though, assuming it even ever shipped in the drivers in the first place.
Worse, there aren't exactly many EAX 5 multiplayer games that shipped after 2005 (UT2004, BF2, BF2142, Q4 and FSW?) unless they helped to implement the thing in some godforsaken asian MMO or they backported the feature to the eax 3/4 headers.
So.. @bibendovsky do you know anything about this?

@bibendovsky
Contributor

Yes, that totally seems to be it and it should have been compatible with every EAX game (you could control the strength with the "Mic environment FX" slider in the X-Fi control panel).

But then EAX voice is also this supposed “3D Voice Over IP” thing? And it really seems like on point to this feature request. It's presumable that it could only have worked with the in-game voice chat (even though there's no hard evidence and they could have as well hacked.. uh, what was even in vogue back then? TS2 and ventrilo?) but to be sure in 2024 only mumble would make sense and could even hope to do it.

I couldn't find any report of such thing ever being used though, assuming it even ever shipped in the drivers in the first place. Worse, there aren't exactly many EAX 5 multiplayer games that shipped after 2005 (UT2004, BF2, BF2142, Q4 and FSW?) unless they helped to implement the thing in some godforsaken asian MMO or they backported the feature to the eax 3/4 headers. So.. @bibendovsky do you know anything about this?

Could be an internal API. I don't see any mention of EAX Voice, VoIP or similar features in the specification.

@Hiradur
Author

Hiradur commented Sep 14, 2024

But then EAX voice is also this supposed “3D Voice Over IP” thing?

The way the 3D Voice over IP feature is described, it might actually refer to a game-engine feature decoupled from Creative's driver or EAX. They might mean the following: If a game has an internal 3D VoIP feature (like many do) and you additionally enable EAX Voice, the other players can hear your voice with EAX effects since they are simply applied to your microphone stream.

Should I reopen this issue and rename it "Add support for environmental voice processing similar to EAX Voice", since this would be something that OpenAL Soft (and DSOAL) could do? Actually, it might be better to create a new issue for this instead.

@mirh

mirh commented Sep 14, 2024

The feature seems to be stated separately, and they underline how it allows proper directionality of communications (whether "talk too loud your enemies will hear you" is true or just rhetorical is unclear, but at least it should work with squad mates).

It's true though that this might just be a totally game-internal thing then (and this may even be why, for once, they aren't suffixing the feature with the n-th ®).. But were that the case, shouldn't it already work with just EAX enabled in openal-soft/dsoal?
@ThreeDeeJay could you have some of your server minions try older multiplayer games with an X-Fi for a confirmation?

And maybe from that answer it should also depend how the far simpler EAX voice gets implemented (which for as much as cheap, trashy and exaggerated it may be.. it does sound as a cool idea eventually). Like, of course it shouldn't be too hard to implement inside of openal-soft capture.. but then of course the question is how could mumble even access it (can the game context be shared? can you implement an interface that can be universally hooked for every game? would you eventually still need a dedicated plugin for each one?)

@ThreeDeeJay
Contributor

@ThreeDeeJay could you have some of your server minions try older multiplayer games with an X-Fi for a confirmation?

@mirh I'd be nicer about the people doing favors 👀💦
but sure, I can ask around since we were talking about this just the other day.
I thought this feature was exclusive to EAX 5.0 games but Raven Shield is EAX 3.0 and Thief 2 is EAX 2.0 so is the point to find out whether this works in any EAX with voice chat or something more specific?

I really like the positional 3D voice chat idea, but tbh hearing myself would probably get annoying real fast so I'd rather just hear the subtle reverb/echo without the "direct sound", while teammates obviously hear both direct sound and reverberation.

I can't wait for Mumble to get OpenAL. The positional surround mix is atrocious (massive leaking that just seems like stereo repeat with subtle attenuation, which results in awful positioning) and the headphones mode just sounds like crossfeed. And it's not any better in-game either. Sector's Edge is the only game I know that pulled off the whole shebang (proximity, 3D HRTF, reverb, occlusion, etc).

On a side note, apparently UT2004 requires an OpenAL patch to fix voice chat, so I wonder if it could be fixed on OpenAL Soft's end. @kcat Have you looked into this by any chance?

@kcat
Owner

kcat commented Sep 15, 2024

I thought this feature was exclusive to EAX 5.0 games but Raven Shield is EAX 3.0 and Thief 2 is EAX 2.0 so is the point to find out whether this works in any EAX with voice chat or something more specific?

Depending on how the effects are being applied, it should be able to work with any EAX version. If all it's doing is using capture to record voices from users, it would simply play what it's capturing as any other sound. I don't know what "EAX Voice" is, whether it's some hardware feature or something added to some middleware, but the idea of playing back voice chatter in 3D with environmental effects is not much different than any other 3D sound. Just capture in mono, and stream it back out as a mono buffer (and stream it to other players for voice chat), which can be treated as any other mono sound. The only difference is the audio source being a capture device or a stream from another player instead of a file; all the same 3D effects can still apply. You could even filter out the direct path of the local player only, to only hear your own reverberation in game while other players get both the direct path and reverb of your voice. Though that would sound odd if you make recordings: your direct capture isn't being mixed into the recording, so the video would also only have the reverb and not your direct voice.
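
The "filter out the direct path of the local player only" idea can be pictured as two gains per voice stream: one for the dry (direct) path and one for the send into the reverb. The sketch below is a plain-C model with hypothetical names (`PathGains`, `voice_path_gains`, `mix_voice_sample`), not OpenAL code. In OpenAL with EFX, the equivalent controls would presumably be the source's direct filter and its auxiliary send filter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-stream gains: direct (dry) path and reverb send. */
typedef struct {
    float direct;
    float send;
} PathGains;

/* The local speaker's own voice keeps only the reverb send; every other
 * player's voice keeps both the direct path and the send. */
PathGains voice_path_gains(bool is_local_speaker)
{
    PathGains g;
    g.direct = is_local_speaker ? 0.0f : 1.0f;
    g.send = 1.0f;
    return g;
}

/* Combine one dry sample and one reverberated (wet) sample. */
float mix_voice_sample(PathGains g, float dry, float wet)
{
    assert(g.direct >= 0.0f && g.send >= 0.0f);
    return g.direct * dry + g.send * wet;
}
```

With these gains the local speaker hears only the wet part of their own voice, which is also why a local recording would contain the reverb but not the direct voice.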

On a side note, apparently UT2004 requires an OpenAL patch to fix voice chat, so I wonder if it could be fixed on OpenAL Soft's end. @kcat Have you looked into this by any chance?

Those are some odd changes. Allows modifying a buffer while it's attached to a source (out of spec, and pretty dangerous). Adds a buffer query to get a source ID that it's attached to, even though a buffer can be attached to multiple sources. Not sure where that comes from since it doesn't seem to be used... maybe an old query that was being contemplated but never made official, and UT2k4 used despite never being official? And also a couple hacks to always return 9 queued buffers for non-playing sources, and return all buffers processed for sources in an AL_INITIAL state. And for some reason reduces the requested number of capture samples to 1/4th (which only affects the minimum number of samples the device will be guaranteed to hold; it can still hold more, and it doesn't change how often the app can read the samples, it would only potentially make the device overrun quicker).

@mirh

mirh commented Sep 15, 2024

I thought this feature was exclusive to EAX 5.0 games but Raven Shield is EAX 3.0 and Thief 2 is EAX 2.0 so is the point to find out whether this works in any EAX with voice chat or something more specific?

The "Microphone Environment FX" part of eax voice is going to work with any game that has EAX reverb, that is already mentioned (on EAX4+ it should also support Multi-Environment® but I digress).

What I was really bewildered about was the other feature, which is the so called "3D voice over ip".
Of very interesting note that supposedly it may only have worked in LAN matches.

I can't wait for Mumble to get OpenAL.

I would hold your horses for that. While I guess that's one way things could be smoothed out, that doesn't seem required or even necessarily preferable.
Even though, now that I think of it, having some kind of openal-awareness could probably simplify writing plugins a lot (at least for those games that use it, that is).

The positional surround mix is atrocious

mumble-voip/mumble#1933
Mhh, ok, I see there's really a lot that they have to improve.

On a side note, apparently UT2004 requires an OpenAL patch to fix voice chat

That's interesting and might or might not be related.. A patch straight from creative certainly sounds like the place you could expect this. On the other hand I couldn't find much inside the v3369 ALaudio.dll, except for checking for ALC_EXT_CAPTURE (of course) and then the existence of UseSpatializedVoice and SpatializedVoiceRadius settings.

EDIT: uh, btw, are these quirks still a thing?

@ThreeDeeJay
Contributor

@kcat Then was that patch just a bunch of desperate (even some useless?) attempts to fix it or just the adequate solution to an insane OpenAL implementation? I'm not sure if this also affects other libraries, though. So could OpenAL Soft implement a proper fix or would that break other games, unless the patch only applied to UT2004.exe or a game_compat flag? I'm just worried we might need to stick to that old patch for proper OpenAL and avoid VC crashes in that game

@mirh I hope this applied the other players' reverb locally instead of transmitting the voice with the reverb baked-in, because I doubt possibly lossy compression would've done reverb quality any favors

@kcat
Owner

kcat commented Sep 15, 2024

@kcat Then was that patch just a bunch of desperate (even some useless?) attempts to fix it or just the adequate solution to an insane OpenAL implementation?

Don't know. I imagine at least some of it is trying to work around bugs in the game that rely on non-standard behavior. But how much of it is actually necessary, I can't say (I don't imagine reducing the sample count in alcCaptureOpenDevice is).

So could OpenAL Soft implement a proper fix or would that break other games, unless the patch only applied to UT2004.exe or a game_compat flag? I'm just worried we might need to stick to that old patch for proper OpenAL and avoid VC crashes in that game

The changes would very likely break other games, as it assumes a buffer queue size for non-playing sources, and returns incorrect processed buffer counts for AL_INITIAL sources. And opens up the possibility of invalid memory access if a buffer is loaded with new samples while in use.

@mirh

mirh commented Sep 15, 2024

@zenakuten

@zenakuten

The patch for UT2004 voice chat was highly specific to UT2004 patch 3369. Since the game is a closed source commercial app, there was not much hope of fixing the bugs inside the engine. It is absolutely a hack and should not be used with other games. It was made by creating a debug build of OpenAL, setting breakpoints in Visual Studio, and inspecting the audio buffers and hard coding some values based on what the UT2004 engine was doing vs what was expected inside OpenAL. The UT2004 engine was expecting and ignoring a certain buffer size, which is where the hardcoded '9' comes from. It was also using byte size instead of sample size when allocating a buffer, which is where the divide by 4 came from. It was bad coding in UT2004, which somehow worked in older versions of OpenAL, but has been broken since.
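
The byte-versus-sample mix-up is easy to picture: the size passed to alcCaptureOpenDevice counts sample frames, so a caller that passes a byte count asks for more than it intended. A minimal sketch of the conversion follows, with hypothetical helper names. The thread does not say which format UT2004 requested; 16-bit stereo is just one layout where a frame is 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one sample frame (one sample for every channel). */
size_t bytes_per_frame(unsigned channels, unsigned bits_per_sample)
{
    assert(channels > 0);
    assert(bits_per_sample > 0 && bits_per_sample % 8 == 0);
    return (size_t)channels * (bits_per_sample / 8);
}

/* Number of whole sample frames that fit in byte_count bytes. */
size_t bytes_to_frames(size_t byte_count, unsigned channels,
                       unsigned bits_per_sample)
{
    return byte_count / bytes_per_frame(channels, bits_per_sample);
}
```

For example, 4096 bytes of 16-bit stereo audio is 1024 frames, so passing 4096 where frames are expected requests four times the intended buffer.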

@ThreeDeeJay
Contributor

@zenakuten Was it still using bad code with the X-Fi/EAX 5.0 patch by Creative?
https://web.archive.org/web/20060716054224/http://images.soundblaster.com/downloads/SBXF_UTPATCH_US_3369.exe

@zenakuten

Not sure, I never tested with Creative's build. The voice chat issue was only with the 64 bit build of UT2004. I'm not sure Creative released a 64 bit build. Voice chat worked fine with the 32 bit version.

@mirh

mirh commented Sep 18, 2024

Oh, I thought that 3369 was the creative's patch build number, instead it's just the last one generically.
In that case fellas, not sure how to tell you.. But it's not like the sources of your closed commercial program are that hard to find. As far as voice is concerned I could for example notice that they aren't enabling PACK_SPEEX_TO_EIGHT_BYTES in win64 builds.


Back to us though, a lightbulb turned on in my mind. And after browsing the code (to be sure it isn't the one of the EAX patch, but still), well.. it really seems like UT2004 did have voice spatialization ready already in the base game, provided that you enabled that ALAudio setting I mentioned in my previous message.

Though I believe that the server only signals it can accept the extra information, if the actor's "active room" is local. Something that in turn I think should correspond to bAllowLocalBroadcast (let alone that somehow its "local voice chat channel to broadcast to all players in the immediate vicinity" description was translated as "enable spatialization on server" in korean). As you can see Engine.VoiceChatReplicationInfo also holds a lot of pretty pertinent entries.
Ok nevermind, it seems like the boolean that should have satisfied the check for the only place where VOICE_AllowSpatialization gets ever set on the server was never implemented. So, given it sounds so stupid to write all of this and then leaving it as dead code, I can only remotely assume/hope that the X-Fi patch was needed to finally unlock this power (note: LocalBroadcastRange and DefaultBroadcastRadius should still work in the local channels, even without "directionality").
This might also explain why I couldn't find anything caring for LAN play (except audio codec selection, but that shouldn't matter).
Somebody please help testing the hypothesis please >__>

With all this said then (and assumed for the sake of the argument) it appears that 3D voip is independent from eax and eax voice.
Conversely, the fact that EAX voice could integrate into some more or less pre-existing voip pipeline means they are really interacting with games as opposed to just keeping everything in-driver (and that they weren't joking when they said enemies could hear you because your voice is added inside the game). Like, they make it seem like they attach "effects primitives" to the transmission (of the normal voice) which are then rendered on each near supporting client.

Could it really be that you can insert in the sound stage extra sound sources from the outside (and safely)?
It seems so crazily advanced, and even altogether odd tbh (because we'd be talking about the client deciding what data to send to the server).
..or it could just be the mother of all vendor locks and marketing bullshit? (i.e. it's not that you need EAX4 because the "environmental properties" have to be decoded by the receiver and you need multi-environment not to disrupt the effects you are already listening to.. but it's just that creative decided and pretended so)
Seriously, test plz.

Fun fact: there's no mention whatsoever about EAX5.0 into the X-Fi patch (while there's a reference to a certain CISACTAudioDrv)

@mirh

mirh commented Dec 31, 2024

Ok I feel like I'm a moron, and the last function of the "unreal datachannel implementation" (UVoiceChannel::AllowVoiceTransmission) is actually concerned with sending the stream only if the other player is in the same active room, or if we are logged as admin. VOICE_AllowSpatialization is cleared for the Actor=AdminManager case, but then (among others) it is set back again if Actor->ActiveRoom->bLocal && Actor->Pawn && Dest->Pawn.

[ALAudio.ALAudioSubsystem]UseSpatializedVoice should presumably just be the only knob needed. Still, I wonder if they will carry anything that isn't just the normal microphone audio?

@zenakuten

UT has three voice chat modes - Public, Team, and Local. Admins always send to the Public channel. Public and Team are never spatialized since everyone is supposed to hear you regardless of in-game distance. Local is meant to play to those near you and volume should fall off as distance increases.
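
The thread does not say which falloff curve UT2004 uses for the Local channel. As one familiar example of volume falling off with distance, this sketch implements the inverse distance clamped model from the OpenAL 1.1 specification (OpenAL's default distance model). The function name is hypothetical.

```c
#include <assert.h>

/* Gain under OpenAL's "inverse distance clamped" model:
 *   d    = clamp(distance, ref_dist, max_dist)
 *   gain = ref_dist / (ref_dist + rolloff * (d - ref_dist))
 */
float inverse_distance_clamped_gain(float distance, float ref_dist,
                                    float max_dist, float rolloff)
{
    assert(ref_dist > 0.0f && max_dist >= ref_dist && rolloff >= 0.0f);
    if (distance < ref_dist)
        distance = ref_dist;   /* no boost when closer than ref_dist */
    if (distance > max_dist)
        distance = max_dist;   /* no further attenuation past max_dist */
    return ref_dist / (ref_dist + rolloff * (distance - ref_dist));
}
```

This covers attenuation only. The direction a voice appears to come from is a separate step that depends on the positions of the speaker and the listener.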

@mirh

mirh commented Jan 1, 2025

AFAIU spatialization should also be positioning, not just attenuation. And yes, that distinction makes sense and I can guess it's the first part of my conditional.

The question here though (be as it is that UT2004 is the only known game with anything remotely similar) is what EAX voice is.

EAX® Voice allows you to literally become part of the game! Using a microphone connected to an EAX ADVANCED HD 5.0 compliant audio device in any game title, you can speak and hear your voice with the same effects as the environment your character is in. Furthermore if the game title supports 3D Voice Over IP then other players in the game will be able to hear your voice as you do and coming from the correct direction! But be careful, this effect is so realistic if you talk too loud your enemies will hear you. Add Multi-Environment® supports and they'll even know exactly which room you're hiding in!

EAX® Voice works by feeding the microphone input into the EAX® hardware effects engine and at that point becomes another gaming sound element which you can hear. It is then transmitted along with the environment properties over the LAN (3D Voice Over IP not currently supported over internet connections) and when it reaches another players PC system their EAX ADVANCED HD 5.0 compliant card translates the environment properties and adds them to your voice in real-time! In order for this feature to work the game must support the 3D voice transmission and EAX® ADVANCED HD™ 4.0 or above.

Like.. It would be just so fucking easy to say "eax voice is just bragging that you can do in-game voip with Microphone Environment FX" (i.e. the thing that applies EAX effects to your own audio input).
But the way it is worded here seems just SO specifically to exclude something so simple, even going as far as to suggest some kind of decoding would happen on other clients.

@kcat
Owner

kcat commented Jan 1, 2025

Like.. It would be just so fucking easy to say "eax voice is just bragging that you can do in-game voip with Microphone Environment FX" (i.e. the thing that applies EAX effects to your own audio input).
But the way it is worded here seems just SO specifically to exclude something so simple, even going as far as to suggest some kind of decoding would happen on other clients.

Easier to sell new hardware if it's couched in language that sounds like newer hardware is required, making people think they and their friends need to upgrade their audio card for it. That text blurb reeks of marketing speak, and is self-contradictory. It works by "feeding the microphone input into the EAX® hardware effects engine and at that point becomes another gaming sound element", "then transmitted along with the environment properties over the LAN", "their EAX ADVANCED HD 5.0 compliant card translates the environment properties and adds them to your voice in real-time!" So... which is it? Does it apply the effect to the microphone input on the speaker's end, which is then transmitted to other players, or are the environment properties sent over the LAN with the plain voice stream for the other player's card to apply the effect to your voice in real-time on their end?

And for the other player, "their EAX ADVANCED HD 5.0 compliant card translates the environment properties and adds them to your voice in real-time", but it works with games that "support the 3D voice transmission and EAX® ADVANCED HD™ 4.0 or above". So... does it need EAX 5 or EAX 4? It really sounds like some marketers trying to up-sell newer hardware, without knowing (or not wanting to divulge) how it actually works; keep it vague and mysterious to make it seem like it's new magic added by new hardware that can't possibly work on what you already have.

To be fair, EAX 4.0 is needed for multiple reverb environments. EAX 3 and earlier could only set a single reverb environment at a time, and each source can only choose to send to it at some gain level. The voice stream can't have a separate reverb for a different environment than the one the listener is in prior to EAX 4. Though even there, EAX 4 and 5 are limited to 4 distinct simultaneous effects, so there's still a limit to the number of voice streams that can apply their intended reverb.
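
The slot limit can be illustrated with a small allocation sketch: voice streams in the same environment share one reverb, and once four distinct environments are in use, further streams get none. Everything here (`assign_reverb_slots`, the integer environment ids, first-come-first-served order) is a hypothetical illustration, not how EAX or OpenAL Soft assigns effects.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EFFECT_SLOTS 4   /* simultaneous effect limit cited for EAX 4 and 5 */
#define NO_SLOT (-1)

/* env_ids[i] is the (hypothetical) environment id of voice stream i.
 * slot_out[i] receives the slot index for that stream, or NO_SLOT when
 * all slots are already taken by other environments.
 * Returns the number of slots in use. */
int assign_reverb_slots(const int *env_ids, size_t count, int *slot_out)
{
    int slot_env[MAX_EFFECT_SLOTS];
    int used = 0;

    assert(env_ids != NULL && slot_out != NULL);
    for (size_t i = 0; i < count; ++i) {
        int slot = NO_SLOT;
        /* Reuse a slot that already holds this environment. */
        for (int s = 0; s < used; ++s) {
            if (slot_env[s] == env_ids[i]) {
                slot = s;
                break;
            }
        }
        /* Otherwise claim a free slot, if any remain. */
        if (slot == NO_SLOT && used < MAX_EFFECT_SLOTS) {
            slot_env[used] = env_ids[i];
            slot = used++;
        }
        slot_out[i] = slot;
    }
    return used;
}
```

A real mixer would likely reserve a slot for the listener's own environment and prioritize by distance. The sketch ignores both.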

@mirh

mirh commented Jan 2, 2025

I actually think that they aren't contradicting, because with an incredible lot of charity you may interpret the first sentence to just describe what happens to your voice for "local purposes" (could it be that "mic fx" doesn't so much interact with the game, as it rather inserts your own voice into the final reverb mix that is sent to the speakers?) while the second describes how it's transmitted remotely.

While "environment properties" may or may not be a reference to the "Environmental Audio Extensions", and could mean pretty much anything (including literally just the position, or perhaps some vector from the player).

Developers of LAN based multiplayer games can include options for players voice streams to be transmitted in 3D and linked to their player character in the game. With the combination of localized 3D voice streams and EAX voice effects multiplayer games can take on a whole new level of realism and excitement.

After watching their entire featurette video in the X-Fi demo disc though (aka the one on their youtube channel I never finished), my expectations really deflated. Ok so, like, let's just go to the core of our issue. We aren't really here for history.

It's just nuts to try to understand whether an EAX-over-IP feature existed just in the mind of some overzealous marketer/journalist, or for real (perhaps in the much coveted and legendary EAX 5.0 SDK). If anything, it could somehow explain the only somewhat clear technical detail of the setup, which is playing over LAN (like, I could reckon they just hacked up some bullshit together, which needed far too much network bandwidth for 2006 DSL not to start lagging). But we don't have to care.

Mumble already supports positional audio, and even if you had to constantly stream the EAX room setting (because, say, games are player-centric and they weren't coded to track anything more than the footsteps of others?), that in itself sounds fairly trivial.

What is actually the super hardest nut to crack, is the other thing: microphone environment FX, hooking up the game scene.
We know now that was all done by the device driver back in the days, but since we don't control it (nor the game) it's a whole new can of worms. In fact, depending on one's objectives and even coding skills you could:

  • extend mumble plugins (I suppose this could be slightly easier and more flexible)
  • write an APO (if even possible at all, this could even work in discord or skype)

Not sure if you would even want to sound like in a cave while you are in unrelated calls, but I guess they aren't even exclusive.
Regardless, both would need OpenAL to expose some kind of interface to the game's internal audio representation that could easily be accessed from the outside (idk how IPC could work in either case).
