ADR 253: Exposing GLTF internals #273
Conversation
In the Explorer Team we discussed this some time ago and raised many concerns about the original proposal. I will try to recap the main ones:
It's too heavy a set of data to provide as-is. Any GLTF can contain an unbounded number of such entries.
What we think is more reasonable to do as a first iteration:
As much for .gltf as for .glb, the nodes, meshes, textures, and materials structure is already provided in JSON format; you don't need to know any hierarchy to provide this information, it would just be piping the already available data, not recollecting it. About the uncontrollable allocations and huge memory footprint, I think it's rather extreme to believe that about a bunch of strings. Say 1000 nodes, 100 textures, 100 meshes, and 100 materials (random numbers), each name about 60 characters long; that is on the order of 100 KB of string data.
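A back-of-the-envelope calculation with those illustrative figures (all numbers here are the assumptions from this comment, not measurements):

```typescript
// Rough estimate of the string payload using the illustrative figures above.
const counts = { nodes: 1000, textures: 100, meshes: 100, materials: 100 }
const avgNameLength = 60 // assumed average characters per name

const totalNames = Object.values(counts).reduce((a, b) => a + b, 0) // 1,300 names
const utf8Bytes = totalNames * avgNameLength // ~78,000 bytes (~78 KB) as UTF-8
const utf16Bytes = utf8Bytes * 2 // ~156 KB if held as JavaScript strings (UTF-16)

console.log({ totalNames, utf8Bytes, utf16Bytes })
```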
I think there is some confusion here: although building the entire tree is a possibility, it is not even suggested. A node is different from a Mesh or Texture, as explained in the GLTF specification. You can modify the nodes without needing to modify the resource, so caching assets in memory is not affected by modifying nodes. Also I don't see [...]
Maybe there is a performance point here: you could gain a bit by not processing animations of hidden nodes or mesh-modified ones. But an animation that includes a hidden/modified MeshRenderer should not affect the animation itself; it simply isn't applied. It's worth noting that a skinned mesh is a single node and its bones are other nodes, so the only way to break it is to load a skinned mesh outside of a container (it doesn't make sense to use the Animator component on a different entity from the GltfContainer).
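As a reminder of the expected setup, here is a minimal SDK7-style sketch; the asset path and clip name are hypothetical, and the exact AnimatorState fields may differ:

```typescript
import { engine, GltfContainer, Animator, Transform } from '@dcl/sdk/ecs'
import { Vector3 } from '@dcl/sdk/math'

// The Animator lives on the same entity as the GltfContainer; hiding or
// modifying an internal node should not break the clip, it simply isn't
// applied to that node.
const robot = engine.addEntity()
Transform.create(robot, { position: Vector3.create(8, 0, 8) })
GltfContainer.create(robot, { src: 'models/robot.glb' }) // hypothetical path
Animator.create(robot, {
  states: [{ clip: 'walk', playing: true, loop: true }]
})
```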
This feature is a corollary of how the GLTF is exposed. A GLTF is a scene description, not a monolithic asset. It has Meshes, Textures and Materials. When you load a GLTF, it has dependencies, and these dependencies can be shared between multiple GLTFs; how do you deal with that?
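To illustrate what that scene description looks like, here is a hypothetical, heavily trimmed glTF excerpt (expressed as a TypeScript object); the index references show how nodes, meshes, materials, and textures depend on and share each other:

```typescript
// Hypothetical, trimmed glTF JSON: two nodes share one mesh, and both
// materials of that mesh share the same texture.
const gltfExcerpt = {
  nodes: [
    { name: 'Tree_A', mesh: 0 },
    { name: 'Tree_B', mesh: 0 } // the same mesh reused by another node
  ],
  meshes: [
    { name: 'TreeMesh', primitives: [{ material: 0 }, { material: 1 }] }
  ],
  materials: [
    { name: 'Bark', pbrMetallicRoughness: { baseColorTexture: { index: 0 } } },
    { name: 'Leaves', pbrMetallicRoughness: { baseColorTexture: { index: 0 } } }
  ],
  textures: [{ name: 'Atlas', source: 0 }],
  images: [{ uri: 'atlas.png' }]
}
```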
I think the implementation and support can be applied in stages, but each point is too linked to the others to split them apart. The full specification needs to be discussed as a whole, and then we can plan how to support it in each client. Reference: [...]
What we need to ensure from the protocol side is that decisions are not taken in isolation. Features like this, while great and flexible from the creator side, ultimately have long-lasting impacts on all clients. What we would really love to see from the protocol side is that any feature introduced is benchmarked in memory, CPU and GPU, and not in specific isolated cases. While the 100 KB estimation seems fair, it doesn't include any of the marshaling or JavaScript runtimes that are required to actually translate this into a realistic scenario at runtime, nor does it include the myriad of other sub-systems that must run in parallel to ensure the whole protocol runs efficiently with many scenes active at once, avatars, UI...

This should include, but not be limited to, CPU time and runtime memory allocations, so we can at least understand fair limits for the systems that will need to be introduced. We should evaluate the performance of a content creator continually changing the affected components, and consider limitations on a per-component basis depending on the impact discovered. We have to build the protocol with the knowledge that all these newly introduced systems can and will run in parallel.

I would love it if the protocol squad would help us identify the inefficiencies and scalability issues of the protocol itself, and provide documentation on how this can work. A clear benchmark and understanding of all features, new and old, alongside documentation of implementation details to ensure performance is vital, as we want to future-proof the new client as much as possible. Thank you!
I think documenting to such a degree is not likely to be helpful. Any benchmark will by nature be implementation-dependent: one could imagine the node paths being a custom JS object which calls out to C++ where the strings are compressed, for an extreme example. In this case, the fact that it is off by default should be enough, and if an explorer chooses to limit the allowance for a scene, that is up to them. This is the case with every resource. If the feature is heavily used (implying it is useFUL) and performance proves to be an issue, then it can be addressed. Nothing anywhere prevents a scene from creating or calling out to an external service which could provide the same info. We can’t control JS memory usage explicitly, so the only answer is tools to manage it in total and kill/manage bad scenes appropriately.

I also think putting implementations alongside the protocol would be a mistake. Performance can always be aggressively managed when required, and the protocol is not an implementation; it’s a data transfer abstraction, and one that is used in an intra-process context where the control available to the client is very high (throttling, discarding intermediate LWW messages, etc.). Blocking potentially useful features out of fear of performance inefficiency is fundamentally the wrong approach, I think. Where it makes sense, we can provide additional tools that ease the worst of it (like tweens for transforms). Suggesting that something like an array of strings associated with a (likely megabytes-large) GLTF asset should be restricted and scrutinized on the same level as transforms doesn't seem productive.
Unfortunately, we can't iterate on the protocol by forbidding features later: once something is used by a scene it's set in stone, and we don't have a fast deprecation process. Normally it's much safer and more straightforward to start with a smaller scope and expand it if needed, rather than cutting it back while it's already in use. What we are saying here is that we already anticipate problems in the proposed design with the Decentraland Foundation's reference client, based on a proven history (4+ years) of development, including the old explorer.
Yes, and we've been investing a lot of resources into ensuring the best UX possible, taking into consideration that the current complexity and flexibility of the protocol is already high. The proposed design will increase both further; we anticipate we will need to invest more, and the final UX will be worse (we will need to manage performance more aggressively by further throttling, skipping frames, and budgeting). Since providing it won't help UX, and consequently players, I can't see how content creators can benefit from it.
are you saying you can't pass strings to javascript without it being a performance bottleneck?
i would say that's up to the scene to manage. if it can be done performantly in some clients and not in others, the creator should tailor the scene appropriately - particularly if it allows experiences that can't be built otherwise. it's like targeting different browsers with different capabilities. and anyway, the impact should always be limited to the scene with the issue - i think you have that kind of process separation working well already in unity?

i guess the key question for me is: are we trying to design a protocol that allows and exposes functionality, and then perhaps finding the explorer technology needs to catch up / risking potential performance issues in the short term for scenes that actually want to make use of that functionality? or are we designing a protocol that's deliberately limited to what unity can do without too much effort, so that creators can't use it to make badly performing scenes in the unity client as it is today?
I think this debate on how to continue contributing to the protocol is necessary and productive. However, I would like to move forward with the development of this ADR and be clear about which points should remain and which should not (and WHY). I would classify them into agree, to discuss, and no-go. As a guide, I've left a list below that I think we can fill in:
We’re not saying that just passing strings to JavaScript alone causes performance issues. The challenge comes from the combined load of multiple systems working together — scenes, avatars, dynamic assets, and more, all using memory, CPU, and GPU resources. In a shared world, it’s not just about the performance of one scene but about how everything together affects the overall experience, especially on less powerful hardware. We saw this before with the old explorer, where not having limits led to a poor experience for users with average devices. We want to avoid that happening again.

Our main goal is to build an open, engaging environment that keeps good performance across different hardware as we add new features. This design approach isn’t meant to limit creativity; it’s to make sure the protocol works with our platform’s goals: creating a high-quality experience that runs well on a variety of devices. When we add a feature to the protocol, it sets a basic expectation across all clients, so we need to make sure it allows creators freedom while keeping the experience stable and smooth on a range of hardware.
Going back to the specifics of the Exposing GLTF internals subject, our initial approach would be simpler than the one proposed in this ADR, specifically:
We'll be working on an implementation with that minimal approach to experiment with the main functionality goals:
DIFFERENCES
SIMILARITIES
this will mean all the node transforms need to be updated by the explorer every frame if the gltf is moving. if the gltf node is a child of the container, it only needs to be updated when the node moves within the container, not if the gltf itself moves. is there a good reason to use a reference rather than the scene hierarchy?
Let me try to get this straight then, comparing both approaches: Connecting [...]
In our proposal, the transform of the node provided to the scene is relative to the node’s parent. So if the node is parented directly to the gltf container, its transform is relative to that container. If it is parented to another intermediate node (which must be a parent in the gltf hierarchy), the transform is relative to that parent node. To my mind this is natural and intuitive, and requires less “messaging maintenance” as well.
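A minimal sketch of how that could look from a scene, assuming the GltfNode component shape discussed in this ADR (the model path, node path, and exact field names are illustrative, not final):

```typescript
import { engine, GltfContainer, GltfNode, Transform } from '@dcl/sdk/ecs'
import { Vector3 } from '@dcl/sdk/math'

// Entity holding the whole glTF.
const building = engine.addEntity()
Transform.create(building, { position: Vector3.create(8, 0, 8) })
GltfContainer.create(building, { src: 'models/building.glb' }) // hypothetical path

// Entity mapped to an internal node. Its Transform is relative to its parent
// (here the container itself), so moving the whole building needs no extra
// per-frame updates for this node.
const door = engine.addEntity()
GltfNode.create(door, { path: 'root/door' }) // hypothetical node path
Transform.create(door, { parent: building })
```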
Uhmm, taking for example this path/structure in a GLTF: "gltfroot/childA/childB/childC/childD", each element being a child of the previous one in the path. In the proto-squad proposal, if you want to get the [...]. In the foundation proposal you just create a [...]. I'll see if I can ask the content team tomorrow about both approaches, see what they would prefer and why, and let you know.
According to the ADR you can pick the node you want, and the Transform received is relative to the [...]. You can make the whole tree and have it relative to the closest parent, or you can just parent it to the [...]. Having [...]
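For the gltfroot/childA/childB/childC/childD example above, the two options would look roughly like this (same hypothetical GltfNode shape and field names as in the previous sketch):

```typescript
import { engine, GltfContainer, GltfNode, Transform } from '@dcl/sdk/ecs'

const container = engine.addEntity()
GltfContainer.create(container, { src: 'models/example.glb' }) // hypothetical path

// Option A: map only the deep node and parent it directly to the container;
// its Transform is then expressed relative to the container.
const childD = engine.addEntity()
GltfNode.create(childD, { path: 'gltfroot/childA/childB/childC/childD' })
Transform.create(childD, { parent: container })

// Option B: rebuild the intermediate chain so each Transform stays relative
// to its closest glTF parent (more entities, but it mirrors the glTF hierarchy).
let parent = container
for (const path of [
  'gltfroot/childA',
  'gltfroot/childA/childB',
  'gltfroot/childA/childB/childC',
  'gltfroot/childA/childB/childC/childD'
]) {
  const node = engine.addEntity()
  GltfNode.create(node, { path })
  Transform.create(node, { parent })
  parent = node
}
```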
Another reason to have it relative to the GltfContainer/another GltfNode, I can imagine, is the ability to modify it in the Editor. Imagine you drag and drop a Gltf into the scene and want to move a building; you could expand the [...]. Instead, if it is parented to a non-parent of the GltfContainer, it would feel at least weird to me.
another benefit that occurred to me: using the hierarchy is nicer if you [...]
Uhmm, so the creator can choose between those 2 possibilities? That wasn't clear to me in the ADR, I thought the [...]. Now I'm starting to see some sense in having those 2 options and not being forced to create the whole hierarchy... Then technically we could support all the usages for the creators if we allowed:
I do agree that forcing creators to parent the [...]. Let me see how I can present this to the content team and get back to you.
In the Specification part, you have these two points:
IMO, since the binding for parenting is already in the Transform (with the parent field), this belongs at the protocol level; if the concern is its usage in the SDK, it'd be more beneficial to add the [...]. Implementing the GltfNode as a child of any entity is even more complex than what we proposed in the ADR, or than what you initially did. It makes it meaningless: if the parent (outside the GltfContainer) is moving, you'd need to attach the movement or keep updating the Transform for the GltfNode. As the component name indicates, the GltfContainer should contain its entities. External usage is another chapter, which we've already tackled by referencing the [...].
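For the external-usage case, the existing parent field on Transform already lets an entity outside the glTF follow a mapped internal node; a hypothetical sketch (same illustrative GltfNode shape, made-up paths and names):

```typescript
import { engine, GltfContainer, GltfNode, Transform } from '@dcl/sdk/ecs'

const character = engine.addEntity()
GltfContainer.create(character, { src: 'models/character.glb' }) // hypothetical path

// Map an internal node (e.g. a hand bone) to an entity.
const hand = engine.addEntity()
GltfNode.create(hand, { path: 'root/armature/hand_R' }) // hypothetical node path
Transform.create(hand, { parent: character })

// An external entity, not part of the glTF, simply follows that node
// through the existing Transform.parent binding.
const sword = engine.addEntity()
Transform.create(sword, { parent: hand })
GltfContainer.create(sword, { src: 'models/sword.glb' })
```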
Discussion
Again, I bring up this item list as a reference for what we're discussing, so it's easier to find:
GLTF (point 9)
To address the concerns about data transfer and memory impact, I've analyzed all GLTF files currently deployed in Genesis City (46,527 files) to get concrete numbers about the actual string data that would be transferred when [...]. The data shows that the typical case is significantly smaller than initially assumed.
Median case (50th percentile):
Even in the 90th percentile (worst 10% of cases):
GLTF Analysis Statistics
Node Path Statistics
Asset Name Statistics
Mesh Names
Material Names
Skin Names
Animation Names
Regarding memory management:
Remember that this data transfer is:
For perspective, even in the 90th percentile case, we're talking about a few kilobytes of string data compared to potentially megabytes of mesh and texture data that the GLTF already contains. If anyone wants to review the methodology and raw data, the complete analysis with detailed statistics is available here. This data suggests that the string transfer overhead is negligible in the context of GLTF loading, especially given that it's an opt-in feature that creators will only use when needed.
@aixaCode (point 9)
The impact of passing strings is what is being discussed here as a potential performance bottleneck. Of course all of the features we add coexist with the entire complexity of the explorer and the platform; we are aware of that. I've profiled the old explorer in several situations and I know the challenges.
We're totally aligned on this, and we're willing to do performance testing to make sure we're not impacting the experience. How can we establish the baseline and methodology to give this process more rigor? We've already made a potential implementation and made it available to the community to test and give feedback.
@pravusjif (point 1)
I've prepared the codebase to add the [...]. This [...]
Preview: https://adr-gltf-extension.adr-cvq.pages.dev/adr/ADR-253
Protocol modifications PR: decentraland/protocol#229
In our research, we've prepared example scenes and an entry in our book:
To test it, you first need to follow the section
Developing with Bevy explorer;
after that it's just one command in your scene, and then npm start.
(instructions for rolling back the changes are included as well)
Some demonstration of what it allows:
0804.mp4
Please raise any questions and challenge the ADR, so we can build a solid specification.