MSC4248: Pull-based presence #4248

Draft. Wants to merge 10 commits into base: main.


nexy7574

@turt2live added labels on Dec 27, 2024: proposal (A matrix spec change proposal), client-server (Client-Server API), kind:maintenance (MSC which clarifies/updates existing spec), needs-implementation (This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.)


I wonder if we could instead have servers signal that they don't want presence updates (for those that turn it off), as well as not sending presence to servers we haven't recently interacted with (i.e. we don't have a message from them in the last 50 messages in a room's timeline).

I worry that making it pull-based would ruin performance, as you'll be dealing with a large, continuous volume of incoming requests rather than the spurious outgoing burst.


For Synapse, might it be worthwhile to limit outgoing presence to the result of `select * from destinations where failure_ts is null;` (i.e. servers we know are online)?
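A minimal sketch of that filter, run against an in-memory SQLite table rather than a real Synapse database. Only the `destinations` table name and the `failure_ts` column come from the comment above; the `destination` column, the helper name `reachable_destinations`, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the table named in the comment. Only two columns
# are modelled, and every row here is made up.
conn.execute(
    "CREATE TABLE destinations (destination TEXT PRIMARY KEY, failure_ts INTEGER)"
)
conn.executemany(
    "INSERT INTO destinations VALUES (?, ?)",
    [
        ("alive.example", None),            # no recorded failure
        ("dead.example", 1735257600000),    # has a failure timestamp
        ("also-alive.example", None),
    ],
)


def reachable_destinations(db: sqlite3.Connection) -> list:
    """Return servers with no recorded failure, i.e. candidates for presence."""
    rows = db.execute(
        "SELECT destination FROM destinations "
        "WHERE failure_ts IS NULL ORDER BY destination"
    )
    return [row[0] for row in rows]
```

Presence would then only be queued for the servers this helper returns.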

Author

@nexy7574 nexy7574 Dec 27, 2024


In this model, servers can respond with 403 to indicate that they do not federate their presence, and remote servers should not request it again (at least, not for a very long time). This removes the need for a separate signal in the first place.

As for performance, the requests would not be continuous. Servers would configure how often they request presence, how long it is cached locally, and so on, and as such would distribute the requests over time.
That "spurious outgoing burst" is more like a constant flat line for lower-end servers (often single-user or cloud-based), since they will be continuously sending presence updates to potentially tens of thousands of other servers, most of which will not be interested in the slightest. This is, as noted, a waste of bandwidth, CPU, and other resources, which usually makes it futile for smaller, lower-end servers to enable presence, and simply wasteful for larger, higher-end servers.

At least with a pull-based model, the ability to bulk-fetch presence would be much lighter on the origin server than constantly hammering out new EDUs, especially when the homeserver can return as little as an empty object when there have been no presence changes.

I have yet to get a chance to properly test anything similar to this, but I know that my servers can handle hundreds of thousands of inbound federation requests per minute just fine. I'm sure a few thousand extra presence requests would do no harm (compared to the literally devastating effects of sending out thousands of EDUs instead).
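The requester's side of this model (a 403 treated as a long-lived opt-out, presence cached locally between polls, and an empty object when nothing has changed) could be sketched roughly as follows. This is an illustration under stated assumptions, not the MSC's wire format: the `fetch` callable stands in for the federation request, and `POLL_INTERVAL_S`, `FORBIDDEN_BACKOFF_S`, and `PresencePuller` are hypothetical names and values.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# Hypothetical tuning knobs; the discussion only says these are configurable.
POLL_INTERVAL_S = 60             # how often to re-request from a server
FORBIDDEN_BACKOFF_S = 7 * 86400  # how long to leave a server alone after 403

# fetch(server) -> (http_status, {user_id: presence_state}). It stands in for
# the real federation request, whose endpoint and body are not defined here.
Fetch = Callable[[str], Tuple[int, Dict[str, str]]]


@dataclass
class PresencePuller:
    fetch: Fetch
    now: Callable[[], float] = time.monotonic
    cache: Dict[str, str] = field(default_factory=dict)
    next_poll: Dict[str, float] = field(default_factory=dict)

    def poll(self, server: str) -> bool:
        """Poll one server if it is due. Returns True if a request was made."""
        t = self.now()
        if t < self.next_poll.get(server, float("-inf")):
            return False  # cached presence is still considered fresh
        status, changes = self.fetch(server)
        if status == 403:
            # The server does not federate presence: back off for a long time.
            self.next_poll[server] = t + FORBIDDEN_BACKOFF_S
            return True
        if status == 200:
            # An empty object means "no changes", so the cache is left as-is.
            self.cache.update(changes)
        self.next_poll[server] = t + POLL_INTERVAL_S
        return True
```

Spreading the polls over time then comes down to calling `poll` for each known server from a periodic job; a server that answered 403 costs nothing until its back-off expires.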

Author


> For Synapse, might it be worthwhile to limit outgoing presence to the result of `select * from destinations where failure_ts is null;` (i.e. servers we know are online)?

This could indeed be an optimisation, but then what about when dead servers come back? They have then missed out on the previous presence by nature of EDUs. Pull-based presence would mean they can request it when they come back and have the most up-to-date presence immediately, rather than needing to wait for the next presence update.


> What about dead servers coming back?

Wouldn't this be solved by them being marked as online when they process a new PDU?

Additionally, in terms of load distribution, presence doesn't need to be sent immediately. You could have a background task that does a small amount of concurrent pushing (i.e. try 10 servers at a time) instead of trying to send it to all servers at once.
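A rough sketch of that kind of background pusher, using an `asyncio.Semaphore` to cap the number of in-flight sends at 10. The `send` coroutine is a placeholder for whatever actually delivers the presence EDU, and `push_presence` is a hypothetical name.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def push_presence(
    servers: Iterable[str],
    send: Callable[[str], Awaitable[None]],
    max_concurrent: int = 10,
) -> List[str]:
    """Send to every server, at most max_concurrent at a time.

    Returns the servers whose send raised, so a caller can retry them later.
    """
    sem = asyncio.Semaphore(max_concurrent)
    failed: List[str] = []

    async def worker(server: str) -> None:
        async with sem:  # waits here while max_concurrent sends are in flight
            try:
                await send(server)
            except Exception:
                failed.append(server)

    await asyncio.gather(*(worker(s) for s in servers))
    return failed
```

The total work is unchanged; the cap only bounds how many sends are in flight at once.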

Author


The "small background task" doesn't scale in any desirable way here, and if we're scheduling outgoing sends, what's the point of even having the EDU? At that point, there's even less urgency about keeping presence up to date.

@nexy7574 nexy7574 marked this pull request as ready for review December 28, 2024 00:53
@nexy7574 nexy7574 marked this pull request as draft December 30, 2024 14:02
@nexy7574
Author

I've heard some very good arguments for both EDU and pull-based presence, so I'm going to rework this MSC so that EDU presence complements pull-based presence (or vice versa, whichever floats your boat) rather than pull-based presence replacing EDUs.

This alternative has a lot of privacy implications and would only further exacerbate the frequent complaint regarding membership updates in the timeline (i.e. "Why are avatar changes always sent to every room individually")
5 participants