Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ADR 253: Exposing GLTF internals #273

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

Conversation

leanmendoza
Copy link
Contributor

@leanmendoza leanmendoza commented Nov 1, 2024

Preview: https://adr-gltf-extension.adr-cvq.pages.dev/adr/ADR-253
Protocol modifications PR: decentraland/protocol#229

In our research, we've prepared example scenes and an entry in our book:

Some demonstration of what it allows:

0804.mp4

Please, raise any question and challenge the ADR to build a solid specification.

@leanmendoza leanmendoza requested a review from a team as a code owner November 1, 2024 10:08
@leanmendoza leanmendoza marked this pull request as draft November 1, 2024 10:08
@leanmendoza leanmendoza removed the request for review from LautaroPetaccio November 1, 2024 10:08
Copy link

cloudflare-workers-and-pages bot commented Nov 1, 2024

Deploying adr with  Cloudflare Pages  Cloudflare Pages

Latest commit: c2ccb5a
Status: ✅  Deploy successful!
Preview URL: https://9d1eff12.adr-cvq.pages.dev
Branch Preview URL: https://adr-gltf-extension.adr-cvq.pages.dev

View logs

@leanmendoza leanmendoza changed the title wip ADR 253: Extend GLTF functionality in runtime 7 Nov 1, 2024
@leanmendoza leanmendoza marked this pull request as ready for review November 1, 2024 11:04
@leanmendoza leanmendoza changed the title ADR 253: Extend GLTF functionality in runtime 7 ADR 253: Exposing GLTF internals Nov 1, 2024
@mikhail-dcl
Copy link
Contributor

In Explorer Team we discussed it some time ago and revealed many concerns for the original proposals.

I will try to recap the majority of them:

Modify GltfLoadingState: New properties (animationNames, materialNames, meshNames, nodePaths) make it easier to access GLTF’s internal components. These properties let developers quickly reference animation clips, material names, mesh names, and node paths, simplifying animations, interactions, or modifications.

It's too heavy data to provide per se. Every possible GLTF can contain an unlimited number of such entries.

  • It's a lot of data to collect on the client side. It will lead to performance bottlenecks to traverse the whole hierarchy
  • Propagation of huge footprint will lead to uncontrollable allocations: it will be applied to any client with the managed "renderer" and the native "JS VM", including Explorer Alpha
    • Even with the current payload we already experience issues, propagating this data will make things worse
    • Client will need to store this data for CRDT - it's a huge memory footprint, otherwise avoidable
  • The structure of GLTF may be very complex, after propagation JS code itself may require much time to handle it
    • A fraction of the hierarchy might be really needed for a content creator

Add GltfNode and GltfNodeState Components: The new GltfNode and GltfNodeState components let developers map GLTF nodes to scene entities, allowing independent modification of a node’s Transform, Material, and MeshRenderer. These components stay in sync with GLTF animations, supporting dynamic transformations, event-based interactions, and resource re-use.

  • if we expose the whole hierarchy there is a vulnerability of modifying every single node (every frame) by iterating over them from the scene that will bloat marshalling as well as lead to uncontrollable performance and allocations.
  • Altering nodes themselves will lead to impossibility to re-use assets by caching in memory, e.g. when the model is shared, scene is loaded/unloaded currently we maintain a pool of objects to avoid expensive instantiation. It’s still possible to overcome but will lead to extra complexity to save and restore the initial state.
  • Maintaining a big number of entities may lead to performance deterioration per se (considering that we don't have any limitations for the GLTF complexity)

Modify MeshRenderer and MeshCollider: a gltf mesh reference added to these components to support mesh reuse across different entities, enabling interactive development with a single model.

  • Modification of MeshRenderer (including deletion of nodes) can easily lead to broken animations, it's not clear how to validate it

Using GLTF as a Mesh/Material Resource
This specification further allows the use of resources from a GLTF by setting the gltf field within Material, MeshRenderer, or MeshCollider. It is recommended to first load the GltfContainer and retrieve available meshes, materials, and textures, ensuring that these resources are loaded into memory before being used.

  • it leads to significant complications of Resources Management as "GLTF" is no longer a "whole" but should be broken down into multiple components
    • An individual GLTF can't be unloaded anymore (e.g. if the corresponding entity is destroyed) as its internals can be referenced by other entities
    • We anticipate additional challenges during implementation
    • Most likely Explorer Alpha won't be able to provide full support of it due to the ways we optimize the content

What we think is more reasonable to do as a first iteration:

  • Instead of exposing every single node let content creators reference nodes by the known <path/to/the/node>. Thus, at least amount of data being transferred is manually defined and restricted to the explicit needs of the content creators.
  • Apply extra-behavior additively
    • Instead of modifying nodes make it possible to access them in a read-only mode. Thus, it will be possible to attach other entities as a child, control Visibility, etc
    • This way resources management won't be broken
    • There will be no problems with skinning

@leanmendoza
Copy link
Contributor Author

leanmendoza commented Nov 1, 2024

It's too heavy data to provide per se. Every possible GLTF can contain an unlimited number of such entries.

* It's a lot of data to collect on the client side. It will lead to performance bottlenecks to traverse the whole hierarchy

* Propagation of huge footprint will lead to uncontrollable allocations: it will be applied to any client with the managed "renderer" and the native "JS VM", including Explorer Alpha
  
  * Even with the current payload we already experience issues, propagating this data will make things worse
  * Client will need to store this data for CRDT - it's a huge memory footprint, otherwise avoidable

* The structure of GLTF may be very complex, after propagation JS code itself may require much time to handle it
  
  * A fraction of the hierarchy might be really needed for a content creator

We can specify the internalFeedback as the default in false. This way, you ensure the creator only requires it when necessary. Not having an option to retrieve this data is a shot in the foot, repeating the same pattern we've done until today.

As much for .gltf as for .glb: nodes, meshes, textures, and materials structure is provided in JSON format, you don't need to know any hierarchy to provide this information; it would just be pipe the already available data. You don't need to recollect it.

About the uncontrollable allocations and huge memory footprint, I think it's far extremist to believe that about a bunch of strings. Let's say, 1000 nodes, 100 textures, 100 meshes, and 100 materials (random number), and each name 60 length, like abcdefg/hijklmn/opqrstuvw/xyzabcdef/ghijklmnop/qrstuvwxyz123, and rounding it to 64, you have less than a 100kb memory. Maybe the 100 textures of meshes are the real point to worry about here. I mean, can you provide some actual data to analyze where the issue is evident to see it?

Add GltfNode and GltfNodeState Components: The new GltfNode and GltfNodeState components let developers map GLTF nodes to scene entities, allowing independent modification of a node’s Transform, Material, and MeshRenderer. These components stay in sync with GLTF animations, supporting dynamic transformations, event-based interactions, and resource re-use.

* if we expose the whole hierarchy there is a vulnerability of modifying every single node (every frame) by iterating over them from the scene that will bloat marshalling as well as lead to uncontrollable performance and allocations.

* Altering nodes themselves will lead to impossibility to re-use assets by caching in memory, e.g. when the model is shared, scene is loaded/unloaded currently we maintain a pool of objects to avoid expensive instantiation. It’s still possible to overcome but will lead to extra complexity to save and restore the initial state.

* Maintaining a big number of entities may lead to performance deterioration per se (considering that we don't have any limitations for the GLTF complexity)

I think there is some confusion here: although building the entire tree is a possibility, it is not even suggested.
Once your GltfContainer is loaded, you can add a GltfNode to either attach more things or make granular modifications to it. You may want to modify not one but several, but I don't see the use case where you want to re-build the entire tree.

A node is different from a Mesh or Texture, as explained in the GLTF specification. You can modify the nodes without needing to modify the resource, so caching assets in memory is not affected by modifying nodes.

Also I don't see uncontrollable performance and allocations, you are pointing to existing resources and reusing them.

Modify MeshRenderer and MeshCollider: a gltf mesh reference added to these components to support mesh reuse across different entities, enabling interactive development with a single model.

* Modification of `MeshRenderer` (including deletion of nodes) can easily lead to broken animations, it's not clear how to validate it

Maybe there is a performance point here, you could gain a bit by not processing animations of hidden nodes or mesh-modified ones. But the animation which includes hidden/modified MeshRenderer should not affect the animation itself, just not applied.

It's worth noting here that skinned meshes are a single node, and the bones are other nodes. So the only way to break it is to load a skinned mesh outside of a container (it doesn't make using Animator component in a different entity of GltfContainer)

Using GLTF as a Mesh/Material Resource
This specification further allows the use of resources from a GLTF by setting the gltf field within Material, MeshRenderer, or MeshCollider. It is recommended to first load the GltfContainer and retrieve available meshes, materials, and textures, ensuring that these resources are loaded into memory before being used.

* it leads to significant complications of Resources Management as "GLTF" is no longer a "whole" but should be broken down into multiple components
  
  * An individual GLTF can't be unloaded anymore (e.g. if the corresponding entity is destroyed) as its internals can be referenced by other entities
  * We anticipate additional challenges during implementation
  * Most likely Explorer Alpha won't be able to provide full support of it due to the ways we optimize the content

This feature is a corollary of how the GLTF is exposed. The GLTF is a scene description, not a whole. It has Meshes, Textures and Materials. When you load a GLTF, it has dependencies, and these dependencies can be shared between multiple GLTF, how do you deal with that?

What we think is more reasonable to do as a first iteration:

* Instead of exposing every single node let content creators reference nodes by the known <path/to/the/node>. Thus, at least amount of data being transferred is manually defined and restricted to the explicit needs of the content creators.

* Apply extra-behavior additively
  
  * Instead of modifying nodes make it possible to access them in a read-only mode. Thus, it will be possible to attach other entities as a child, control `Visibility`, etc
  * This way resources management won't be broken
  * There will be no problems with skinning

I think the implementation and support can be applied by stage, but each point is pretty linked from the other to split it. The full specification needs to be discussed as a whole, and then we can plan how to support it each client.

Reference:

@m3taphysics
Copy link

About the uncontrollable allocations and huge memory footprint, I think it's far extremist to believe that about a bunch of strings. Let's say, 1000 nodes, 100 textures, 100 meshes, and 100 materials (random number), and each name 60 length, like abcdefg/hijklmn/opqrstuvw/xyzabcdef/ghijklmnop/qrstuvwxyz123, and rounding it to 64, you have less than a 100kb memory. Maybe the 100 textures of meshes are the real point to worry about here. I mean, can you provide some actual data to analyze where the issue is evident to see it?

What we need ensure from the protocol side is that decisions cannot be taken in isolation. Features like this while great and flexible from the creator side ultimately have long lasting impacts on all clients. What we would really love to see from the protocol side is that any feature introduced is both benchmarked in Memory, CPU and GPU and not in specific isolated cases. While the 100kb estimation seems fair, it doesn't include any of the marshaling or JavaScript runtimes that are required to actually translate this into a realistic scenario at runtime, nor does it include all of the other myriad of sub-systems that must be run in parallel to ensure all the protocol runs efficiently with many scenes active at once, avatars, UI...

This should include but not be limited to CPU time & runtime memory allocations so we can at least understand fair limits of the said systems which will need to be introduced, we should evaluate the performance of a content creator continually changing the affected components and consider any limitations on a per component basis depending on the impact discovered. We have to build the protocol with the knowledge that all these newly introduced systems can and will be run in parallel.

I would love it if the protocol squad would helps us identifying the in-efficiencies and scalability issues of the protocol itself and potential documentation on how this can work. A clear benchmark and understanding of all features new (and old) along side documentation which represents implementation details to ensure performance is vital as we want to future-proof the new client as much as possible.

Thank you!

@robtfm
Copy link

robtfm commented Nov 5, 2024

I think documenting to such a degree is not likely to be helpful. Any benchmark will by nature be implementation dependent - one could imagine the node paths being a custom JS object which calls out to c++ where the strings are compressed, for an extreme example.

In this case, the fact it is default off should be enough, and if an explorer chooses to limit the allowance for a scene that is up to them. This is the case with every resource. If the feature is heavily used (implying it is useFUL) and performance proves to be an issue then it can be addressed. Nothing anywhere prevents a scene from creating / calling out to an external service which could provide the same info. We can’t control JS memory usage explicitly, so the only answer is tools to manage it in total and kill/manage bad scenes appropriately.

I also think putting implementations alongside the protocol would be a mistake. Performance can always be aggressively managed when required, and the protocol is not an implementation, it’s a data transfer abstraction, and one that is used in an intra-process context where the control for the client is very high (throttling, discarding intermediate LWW messages, etc). Blocking potentially useful features for the fear of performance efficiency is fundamentally the wrong approach I think.

Where it makes sense, we can provide additional tools that can ease the worst of it (like tweens for transforms). Suggesting that something like an array of strings associated with a (likely megabytes large) GLTF asset should be restricted and scrutinized on the same level as transforms doesn't seem productive.

@mikhail-dcl
Copy link
Contributor

mikhail-dcl commented Nov 6, 2024

This is the case with every resource. If the feature is heavily used (implying it is useFUL) and performance proves to be an issue then it can be addressed

Unfortunately, we can't iterate on the protocol to forbid certain features as when it is used by a scene it's set in stone and we don't have a fast process of deprecation. Normally, it's much safer and more straightforward to start with a smaller scope, and expand it if needed rather than cutting it on the go while it's already being used.

What we are saying here is that we already anticipate problems in the proposed design with the reference client of Decentraland Foundation based on the proven history (4+ years) of development including the old explorer.

Performance can always be aggressively managed when required, and the protocol is not an implementation, it’s a data transfer abstraction, and one that is used in an intra-process context where the control for the client is very high (throttling, discarding intermediate LWW messages, etc).

yes, and we've been investing lots of resources into ensuring the best UX possible, taking into consideration the current complexity and flexibility of the protocol that is already high. Providing the proposed design will lead to the further increase of those characteristics, we anticipate we will need to invest more, and the final UX will be worse (we will need to manage performance more aggressively by further throttling/skipping frames/budgeting). Providing it won't help UX and players consequently, I can't see how content creators can benefit from it.

@leanmendoza leanmendoza closed this Nov 6, 2024
@leanmendoza leanmendoza reopened this Nov 6, 2024
@robtfm
Copy link

robtfm commented Nov 6, 2024

What we are saying here is that we already anticipate problems in the proposed design with the reference client of Decentraland Foundation based on the proven history (4+ years) of development including the old explorer.

are you saying you can't pass strings to javascript without it being a performance bottleneck?

... the final UX will be worse (we will need to manage performance more aggressively by further throttling/skipping frames/budgeting). Providing it won't help UX and players consequently, I can't see how content creators can benefit from it.

i would say that's up to the scene to manage. if it can be done performantly in some clients and not in others, the creator should tailor the scene appropriately - particularly if it allows experiences that can't be built otherwise. it's like targetting different browsers with different capabilities. and anyway, the impact should always be limited to the scene with the issue - i think you have that kind of process separation working well already in unity?

i guess the key question for me is, are we trying to design a protocol that allows and exposes functionality, and then perhaps finding the explorer technology needs to catch up / risking potential performance issues in the short term for scenes that actually want to make use of that functionality, or are we designing a protocol that's deliberately limited to what unity can do without too much effort, so that creators can't use it to make badly performing scenes in the unity client as it is today?

@leanmendoza
Copy link
Contributor Author

leanmendoza commented Nov 6, 2024

I think this debate on how to continue contributing to the protocol is necessary and productive.

However, I would like to move forward in the development of this ADR and be clear about which points should remain and which should not (and WHY). I would particularly classify them into agree, to discuss, and no-go.

As a guide, I left the list I think we can fill:

  1. Add GltfNode to be able to reference an internal node by parenting to a GltfContainer (or another GltfNode)
  2. Be able to add PointerEvents and Visibility in the GltfNode entity
  3. Be able to add more parented entities to this GltfNode (attach other entities)
  4. Expose MeshRenderer, MeshCollider and Material of the referenced node with a new gltf property in all cases
  5. Be able to modify MeshRenderer, MeshCollider and Material of the referenced node.
  6. Expose Transform relative to the GltfContainer (or GltfNode)
  7. Be able to modify Transform relative to the GltfContainer (either directly or with a Tween). If an internal animation forces a Transform, the modification is ignored.
  8. Be able to build MeshRenderer, MeshCollider and Material referencing internal resources.
  9. Expose nodes, meshes, materials, textures and animations in GltfLoadingState only when specified (explicitly internalFeedback=true).

@aixaCode
Copy link
Contributor

aixaCode commented Nov 6, 2024

are you saying you can't pass strings to javascript without it being a performance bottleneck?

We’re not saying that just passing strings to JavaScript alone causes performance issues. The challenge comes from the combined load of multiple systems working together — scenes, avatars, dynamic assets, and more, all using memory, CPU, and GPU resources. In a shared world, it’s not just about the performance of one scene but about how everything together affects the overall experience, especially on less powerful hardware. We saw this before with the old explorer, where not having limits led to a poor experience for users with average devices. We want to avoid that happening again.

Our main goal is to build an open, engaging environment that keeps good performance across different hardware as we add new features. This design approach isn’t meant to limit creativity; it’s to make sure the protocol works with our platform’s goals: creating a high-quality experience that runs well on a variety of devices. When we add a feature to the protocol, it sets a basic expectation across all clients, so we need to make sure it allows creators freedom while keeping the experience stable and smooth on a range of hardware.

@pravusjif
Copy link
Member

Going back to the specifics of Exposing GLTF internals subject, our initial approach would be simpler than the one proposed on this ADR, exactly:

* (new) GltfNode
    -> gltfEntity
    -> path
* (new) GltfNodeState
    -> state
    -> error
* MeshCollider
    -> (new) GltfMesh
* MeshRenderer
    -> (new) GltfMesh

We'll be working on an implementation with that minimal approach to experiment with the main functionality goals:

  • Being able to enparent entities to GLTF nodes
  • Being able to manipulate components on the GltfNode entities themselves (may not be working on 1st iteration)

DIFFERENCES

  • We propose to have the target Gltf in which to look for the GltfNode as an entity reference in the GltfNode component (instead of having the GltfNode entity as a child of the GltfContainer entity to be able to look into that one)
  • We won’t be implementing anything related to reading the GLTF structure and its resources and communicating that to the scene (no GltfContainerLoadingState new resources properties, nor allowing the Gltf resources to be used on other entities)
  • We won’t be creating a new type of Material for GLTFs, we will be populating the entity mapping the Gltf loaded material props into the SDKs existent Material component
  • We will be introducing the GltfMesh new property for MeshRenderer and MeshCollider components but won’t be including properties to reference a mesh from another Gltf

SIMILARITIES

  • Addition of GltfNode and GltfNodeState components to map sdk created entities to GLTFs sub nodes.
  • The Explorer will recognize certain Gltf properties and map them to SDK components (Transform, Material, MeshRenderer, etc.) that the scene will be able to read
  • Allow using a GltfNode entity to manipulate its Transform, Material and other components as well as adding new ones (technically this can be iteration 2 but I’d like to try a solution on 1st iteration).

@robtfm
Copy link

robtfm commented Nov 6, 2024

the GltfNode as an entity reference

this will mean all the node transforms need to be updated by the explorer every frame if the gltf is moving. if the gltf node is a child of the container, it only needs to be updated when the node moves within the container, not if the gltf itself moves. is there a good reason to use a reference rather than the scene hierarchy?

@pravusjif
Copy link
Member

this will mean all the node transforms need to be updated by the explorer every frame if the gltf is moving

is there a good reason to use a reference rather than the scene hierarchy?

Let me try to get it straight then, comparing both approaches:

Connecting GltfNode to the GltfContainer entity it belongs through parenting (proto-squad proposal):

  • The values on the GltfNode entity Transform will be local (relative to its nearest node parent)
  • The Transform wouldn't be modified if that node is not moved inside the GLTFs structure, relative to the GLTF itself and not to the global world position of the GLTF
  • This approach offers the automatic "local transform" value to the scene, and if the creators want to get the "global transform" value they have to manually calculate it using the node entity transform values + the GLTF transform values (+ probably the transform value of every gltf node in between, since there could be more in there each with their own relative transform value to their own parent; Unless the Explorer always updates the GltfNodeentity transform values relative to the root node and not the real node parents, not sure how the proto-squad is handling that)

Connecting GltfNode to the GltfContainer entity it belongs through entity reference (foundation proposal):

  • The values on the GltfNode entity Transform will be global as a parentless entity (unless it's later parented to another entity), same as if it were a separate GltfContainer entity.
  • The Transform of the GltfEntity would get updated if the node entity is moved and also if the GLTF it belongs to is moved
  • This approach offers the automatic "global transform" value to the scene, and if creators want to get the "local transform" value they would have to calculate it either relative to the GltfContainer entity transform or to their immediate parent gltfnode (would need a different GltfNode mapped to the parent node)

I still feel more inclined towards the entity reference approach, what do the rest think?

I'm thinking maybe manipulating the scale particularily can be a problem but it's something I'll have to test.

@robtfm
Copy link

robtfm commented Nov 6, 2024

In our proposal, the transform of the node provided to the scene is relative to the node’s parent.

So if the node is parented directly to the gltf container, its transform is relative to that container.

If it is parented to another intermediate node (which must be a parent in the gltf hierarchy) the transform is relative to that parent node.

To my mind this is natural and intuitive, and requires less “messaging maintenance” as well.

@pravusjif
Copy link
Member

Uhmm taking for example having this path/structure in a GLTF: "gltfroot/childA/childB/childC/childD" being each element a child of the previous one in the path.

In the proto-squad proposal, if you want to get the GltfNode for the childD you have to create a GltfNode for each parent in the middle (one for childA, another for childB another for childC and the last one for childD).

In the foundation proposal you just create a GltfNode with the "gltfroot/childA/childB/childC/childD" path and that's it.

I'll see if I can ask tomorrow the content team about both approaches and see what they would prefer and why and let you know.

@leanmendoza
Copy link
Contributor Author

Uhmm taking for example having this path/structure in a GLTF: "gltfroot/childA/childB/childC/childD" being each element a child of the previous one in the path.

In the proto-squad proposal, if you want to get the GltfNode for the childD you have to create a GltfNode for each parent in the middle (one for childA, another for childB another for childC and the last one for childD).

In the foundation proposal you just create a GltfNode with the "gltfroot/childA/childB/childC/childD" path and that's it.

I'll see if I can ask tomorrow the content team about both approaches and see what they would prefer and why and let you know.

According to the ADR you can pick the node you want and the Transform received is relative to the parent one.

You can make the whole tree and have the closest parent relative, or you can just parent it to the GltfContainer and make the transform relative to the root scene. The GLTF is a scene container, I think it makes sense to keep the consistency for future approaches.

Having global or local Transform is an SDK implementation issue. A globalPosition accessor is totally possible without needing the utils library; it's just not been prioritized.

@leanmendoza
Copy link
Contributor Author

Another reason to have it relative to the GltfContainer/another GltfNode, I can imagine, is the ability to modify it in the Editor.

Imagine you drag and drop a Gltf in the scene and want to move a building; you could expand the GltfContainer and select the building and move it (the state would only change the modifications of the modified node; it's not necessary to re-build the whole tree).

Instead, if it is a non-parent of the GltfContainer, I would feel it at least weird

@robtfm
Copy link

robtfm commented Nov 7, 2024

another benefit that occured to me: using the hierarchy is nicer if you engine.removeEntityWithChildren(container) - then all your spawned nodes will be cleaned up automatically. otherwise you'd need to iterate the withComponent(GltfNode) and despawn the ones matching the despawned container manually (or leave them dangling).

@pravusjif
Copy link
Member

You can make the whole tree and have the closest parent relative, or you can just parent it to the GltfContainer and make the transform relative to the root scene.

Uhmm, so the creator can chose between those 2 possibilities? that wasn't clear for me in ADR, I thought the GltfNode that was created as a child of the GltfContainer entity had to be the 1st real node child in the path, but then from what you say I could create a GltfNode with the path "gltfroot/childA/childB/childC/childD" and set its Transform as a child of the GltfContainer entity and then my GltfNode entity Transform values would get updated to be relative to the root node even if there are several nodes between my target and the root node, right? so I wouldn't be forced to set up a GltfNode entity for "gltfroot/childA", "gltfroot/childA/childB" and "gltfroot/childA/childB/childC".

Now I'm starting to see some sense in having those 2 options and not be forced to create the whole hierarchy...

Then technicallly we could support all the usages for the creators if we allowed:

  • Create a GltfNode entity with the node path and the target gltf entity reference props (foundation proposed structure of the GltfNode component)
  • If the GltfNode entity Transform has any parent at any point, its Transform values would get updated relative to the parent as any other entity (regardless of the parent being the GltfContainer entity, another GltfNode entity or any entity actually), otherwas if it's prentless, it behaves as a normal entity without parent with global/scene-relative values.

I do agree that forcing creators to parent the GltfNode entity to either a GltfContainer or a GltfNode would make the engine.removeEntityWithChildren(container) remove automatically the child entities with GltfNode. However if the enforcing of the parenting is for that, it's also affecting the usage design of the feature to facilitate the implementation on the explorer-side. Technically every time a GltfNode component is configured on an entity, in the Explorer side when the target Gltf is looked up for grabbing the node, at that point the GltfNode entity can be saved/referenced to be automatically cleared or reset or whatever when the actual GltfContainer is removed.

Let me see how I can present this to the content team and get back to you.

@leanmendoza
Copy link
Contributor Author

Uhmm, so the creator can chose between those 2 possibilities? that wasn't clear for me in ADR, I thought the GltfNode that was created as a child of the GltfContainer entity had to be the 1st real node child in the path, but then from what you say I could create a GltfNode with the path "gltfroot/childA/childB/childC/childD" and set its Transform as a child of the GltfContainer entity and then my GltfNode entity Transform values would get updated to be relative to the root node even if there are several nodes between my target and the root node, right? so I wouldn't be forced to set up a GltfNode entity for "gltfroot/childA", "gltfroot/childA/childB" and "gltfroot/childA/childB/childC".

Now I'm starting to see some sense in having those 2 options and not be forced to create the whole hierarchy...

Then technicallly we could support all the usages for the creators if we allowed:

* Create a `GltfNode` entity with the **node path** and the **target gltf entity reference** props (foundation proposed structure of the `GltfNode` component)

* If the `GltfNode` entity `Transform` has any parent at any point, its `Transform` values would get updated relative to the parent as any other entity (regardless of the parent being the `GltfContainer` entity, another `GltfNode` entity or any entity actually), otherwas if it's prentless, it behaves as a normal entity without parent with global/scene-relative values.

I do agree that forcing creators to parent the GltfNode entity to either a GltfContainer or a GltfNode would make the engine.removeEntityWithChildren(container) remove automatically the child entities with GltfNode. However if the enforcing of the parenting is for that, it's also affecting the usage design of the feature to facilitate the implementation on the explorer-side. Technically every time a GltfNode component is configured on an entity, in the Explorer side when the target Gltf is looked up for grabbing the node, at that point the GltfNode entity can be saved/referenced to be automatically cleared or reset or whatever when the actual GltfContainer is removed.

Let me see how I can present this to the content team and get back to you.

In the Specification part, you have these two points:

  1. A GltfNode component MUST be added to an entity that is either a direct child of a GltfContainer entity or another entity with a GltfNode.
  2. Transform: MUST match the position of the GLTF node relative to its parent.

IMO, since the binding for parenting is already in the Transform (with the parent field), having one more type to reference the link is wasteful.

This is at the protocol level, if the concern is its usage in the SDK, it'd be more beneficial to add the globalTransform property with a proper getter and setter, and it would work for every entity (not only for GltfNode). We already have a patched Transform component there and it's a question of prioritizing code features.

Implementing the GltfNode as a child of any entity is even more complex than we proposed in the ADR or as you initially did. It turns it meaningless; if the parent (outside the GltfContainer) is moving, you'd need to attach the movement or keep updating the Transform for the GltfNode.

As the component name indicates, the GltfContainer should contain its entities. External usage is another chapter, which we've already tackled by referencing the gltfSrc to reuse resources provided by the Gltf (like meshes, textures, materials, etc.).

@leanmendoza
Copy link
Contributor Author

Discussion

Again, I bring this item list to reference what we're discussing, so it's easier to find it:

  1. Add GltfNode to be able to reference an internal node by parenting to a GltfContainer (or another GltfNode)
  2. Be able to add PointerEvents and Visibility in the GltfNode entity
  3. Be able to add more parented entities to this GltfNode (attach other entities)
  4. Expose MeshRenderer, MeshCollider and Material of the referenced node with a new gltf property in all cases
  5. Be able to modify MeshRenderer, MeshCollider and Material of the referenced node.
  6. Expose Transform relative to the GltfContainer (or GltfNode)
  7. Be able to modify Transform relative to the GltfContainer (either directly or with a Tween). If an internal animation forces a Transform, the modification is ignored.
  8. Be able to build MeshRenderer, MeshCollider and Material referencing internal resources.
  9. Expose nodes, meshes, materials, textures and animations in GltfLoadingState only when specified (explicitly internalFeedback=true).

GLTF (point 9)

To address the concerns about data transfer and memory impact, I've analyzed all GLTF files currently deployed in Genesis City (46,527 files) to get concrete numbers about the actual string data that would be transferred when internalFeedback=true.

The data shows that the mean case is significantly smaller than initially assumed:

Median case (50th percentile):

  • Only 2 nodes with average path length of 32 characters
  • 2 meshes with ~9 character names
  • 1 material with ~12 character name
  • No skins or animations in most files

Even in the 90th percentile (worst 10% of cases):

  • 46 nodes
  • 8 meshes
  • 6 materials
  • 1 skin
  • 1 animation
Details

GLTF Analysis Statistics

  • Date: November 11th, 2024
  • GLTF count: 46527 files (48 with errors and dismissed < 0.1%)
  • Places: all the scenes deployed in Genesis City

Node Path Statistics

  • Length Statistics

    • Mean: 67.46 characters
    • Median: 32 characters
    • Standard Deviation: 73.52
    • Percentiles:
      • 20th: 13 characters
      • 50th: 32 characters
      • 80th: 135 characters
      • 90th: 194 characters
    • Segment Means:
      • Top 10%: 231.02 characters
      • Middle 80%: 54.57 characters
  • Count Statistics

    • Mean: 19.45 nodes
    • Median: 2 nodes
    • Standard Deviation: 118.39
    • Percentiles:
      • 20th: 1 node
      • 50th: 2 nodes
      • 80th: 9 nodes
      • 90th: 46 nodes
    • Segment Means:
      • Top 10%: 154.69 nodes
      • Middle 80%: 4.85 nodes

Asset Name Statistics

Mesh Names

  • Length Statistics

    • Mean: 12.04 characters
    • Median: 9 characters
    • Standard Deviation: 9.37
    • Percentiles: 8/9/12/20 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 35.79 characters
      • Middle 80%: 9.85 characters
  • Count Statistics

    • Mean: 8.08 meshes
    • Median: 2 meshes
    • Standard Deviation: 49.34
    • Percentiles: 1/2/3/8 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 64.67 meshes
      • Middle 80%: 1.90 meshes

Material Names

  • Length Statistics

    • Mean: 12.49 characters
    • Median: 12 characters
    • Standard Deviation: 6.02
    • Percentiles: 8/12/16/20 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 25.23 characters
      • Middle 80%: 11.92 characters
  • Count Statistics

    • Mean: 3.11 materials
    • Median: 1 material
    • Standard Deviation: 8.15
    • Percentiles: 1/1/3/6 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 15.98 materials
      • Middle 80%: 1.80 materials

Skin Names

  • Length Statistics

    • Mean: 11.72 characters
    • Median: 11 characters
    • Standard Deviation: 7.29
    • Percentiles: 8/11/12/15 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 28.97 characters
      • Middle 80%: 10.23 characters
  • Count Statistics

    • Mean: 0.17 skins
    • Median: 0 skins
    • Standard Deviation: 1.86
    • Percentiles: 0/0/0/1 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 1.63 skins
      • Middle 80%: 0.01 skins

Animation Names

  • Length Statistics

    • Mean: 12.72 characters
    • Median: 10 characters
    • Standard Deviation: 9.29
    • Percentiles: 5/10/18/24 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 33.80 characters
      • Middle 80%: 11.27 characters
  • Count Statistics

    • Mean: 0.78 animations
    • Median: 0 animations
    • Standard Deviation: 6.88
    • Percentiles: 0/0/1/1 (20th/50th/80th/90th)
    • Segment Means:
      • Top 10%: 6.52 animations
      • Middle 80%: 0.16 animations

Regarding memory management:

  • The string transfer is a simple serialization operation (copy strings from GLTF to scene)
  • A pre-allocated buffer per scene based for component serialization would mitigate runtime allocations (I think you've already got one)
  • Given the median case uses ~100 bytes and even the 90th percentile would be just a few KB, the buffer overhead per scene would be minimal

Remember that this data transfer is:

  1. Optional (default internalFeedback=false)
  2. One-time only (during loading)
  3. Already present in the GLTF JSON structure
  4. Small compared to the actual mesh/texture data
  5. Can be optimized with pre-allocated buffers

For perspective, even in the 90th percentile case, we're talking about a few kilobytes of string data compared to potentially megabytes of mesh and texture data that the GLTF already contains.

If anyone wants to review the methodology and raw data, the complete analysis with detailed statistics is available here.

This data suggests that the string transfer overhead is negligible in the context of GLTF loading, especially given that it's an opt-in feature that creators will only use when needed.

@aixaCode (point 9)

are you saying you can't pass strings to javascript without it being a performance bottleneck?

We’re not saying that just passing strings to JavaScript alone causes performance issues. The challenge comes from the combined load of multiple systems working together — scenes, avatars, dynamic assets, and more, all using memory, CPU, and GPU resources. In a shared world, it’s not just about the performance of one scene but about how everything together affects the overall experience, especially on less powerful hardware. We saw this before with the old explorer, where not having limits led to a poor experience for users with average devices. We want to avoid that happening again.

The impact of passing strings is what is being discussed here as a potential performance bottleneck. Of course all of the features we add coexist with the entire complexity of the explorer and the platform. We are aware of that. I've touched the old-explorer in several situations with profilers and I know the challenges.
In the Godot Explorer, we're trying to maximize the users' experience and are always looking for ways to improve performance. We have a real limitation as we're targeting mostly low-end devices like mobile as the entry door.

Our main goal is to build an open, engaging environment that keeps good performance across different hardware as we add new features. This design approach isn’t meant to limit creativity; it’s to make sure the protocol works with our platform’s goals: creating a high-quality experience that runs well on a variety of devices. When we add a feature to the protocol, it sets a basic expectation across all clients, so we need to make sure it allows creators freedom while keeping the experience stable and smooth on a range of hardware.

We're totally aligned on this, and we're willing to do performance testing to make sure we're not impacting the experience. How can we set the bases and methodology to give this process more rigor? We've already made a potential implementation and made it available to the community to test and give feedback.

@pravusjif (point 1)

I've prepared the codebase to add the globalPosition getter and setter to the Transform here: decentraland/js-sdk-toolchain#1032. Also, as performance is something we have to monitor continuously, I've pushed back this change to have at least some data from the impact of adding new things to the SDK: decentraland/js-sdk-toolchain#1033.

This globalPosition is a feature currently incompletely addressed in the utils library. It's already a challenge because of the dissociation between real Transform and the data we have in the scene (the networking entities, avatar attach, or Billboard). Using an external component unparented of the original GltfContainer adds even more dissociation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants