Rewrite streaming API to use websocket #383

Merged: 29 commits into master from improve-streaming-apis, Dec 20, 2023

PattaFeuFeu
Collaborator

@PattaFeuFeu PattaFeuFeu commented Dec 9, 2023

Description

In this PR, I improve the streaming API:

  • Replace long-lived HTTP connections with WebSocket connections via okhttp
  • Implement missing stream types and request parameters
  • Update documentation

Closes #87.

Streaming URL

Some servers may use different streaming URLs than the regular instance URL. For that reason, I need to parse the Instance info during MastodonClient build.

I have seen #114 but decided not to go that route here yet. We should rewrite the MastodonClient builder as part of that ticket though!

Instead, if useStreamingApi is called as part of the builder, I retrieve the instance information—depending on API version—and get the streaming URL from it. For that, I did not reuse the Instance data classes for each version, nor the (fallback) code to retrieve the Instance info’s instance version:

private fun getInstanceVersion(): String {
    return try {
        getInstanceVersionViaServerInfo()
    } catch (error: BigBoneClientInstantiationException) {
        // fall back to retrieving from Mastodon API itself
        try {
            getInstanceVersionViaApi()
        } catch (instanceException: InstanceVersionRetrievalException) {
            throw BigBoneClientInstantiationException(
                message = "Failed to get instance version of $instanceName",
                cause = if (instanceException.cause == instanceException) {
                    instanceException.initCause(error)
                } else {
                    instanceException
                }
            )
        }
    }
}

I opted for raw JSON parsing to hopefully keep the memory footprint a bit lower. I’m not sure if it impacts anything, though. I haven’t tested that.

There’s definitely room for improvement in how the instance info is retrieved based on API version. Since we will likely remove support for v1 (Mastodon below version 4) very soon anyway, I have not spent more time on fixing this.

okhttp shortcomings

okhttp doesn’t like ws or wss schemes so we need to replace them with http and https. Otherwise, HttpUrl will immediately fail as soon as we supply the websocket version to the scheme portion of its builder. 👀
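As a rough illustration of that replacement, here is a minimal sketch. The helper name `toHttpScheme` and the example URLs are assumptions made for this sketch and are not taken from the PR; it only rewrites the scheme prefix and leaves everything else untouched.

```kotlin
// Hypothetical helper (not from the PR): rewrites a WebSocket URL so that its
// scheme is one that okhttp's HttpUrl accepts (ws -> http, wss -> https).
val WEBSOCKET_SCHEME = Regex("^ws(s?)://")

fun toHttpScheme(streamingUrl: String): String =
    // "$1" re-inserts the optional "s", so wss:// becomes https://.
    streamingUrl.replaceFirst(WEBSOCKET_SCHEME, "http\$1://")
```

URLs that already use http or https pass through unchanged, so the helper can be applied unconditionally to whatever streaming URL the instance reports.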

EventType polymorphism

The event payload we receive for each web socket message can look as follows:

https://docs.joinmastodon.org/methods/streaming/#events-11

{
  "stream": [
    "public"
  ],
  "event": "update",
  "payload": "{\"id\":\"108913983692647032\",\"created_at\":\"2022-08-30T21:38:22.000Z\",\"in_reply_to_id\":\"108913981098896721\",\"in_reply_to_account_id\":\"1081104\",\"sensitive\":false,\"spoiler_text\":\"\",\"visibility\":\"public\",\"language\":\"en\",\"uri\":\"https://fosstodon.org/users/tobtobxx/statuses/108913983628474640\",\"url\":\"https://fosstodon.org/@tobtobxx/108913983628474640\",\"replies_count\":0,\"reblogs_count\":0,\"favourites_count\":0,\"edited_at\":null,\"content\":\"<p>And now I can't exit the inner nvim because I mapped escape to the parent vim instance 😂</p>\",\"reblog\":null,\"account\":{\"id\":\"1081104\",\"username\":\"tobtobxx\",\"acct\":\"tobtobxx@fosstodon.org\",\"display_name\":\"TobTobXX\",\"locked\":false,\"bot\":false,\"discoverable\":true,\"group\":false,\"created_at\":\"2020-01-10T00:00:00.000Z\",\"note\":\"<p>Young tech enthusiast. Likes software (and also general, just not work-) minimalsim. Constantly trying to escape big-tech software.<br>Other hobbies include making music, stargazing, math and recently chess, but there's a lot that piques my interest and a lot left to learn out there.</p><p>„Of course, every house is constructed by someone, but the one who constructed all things is God.“ (Hebrews 3:4 [nwt18])</p>\",\"url\":\"https://fosstodon.org/@tobtobxx\",\"avatar\":\"https://files.mastodon.social/cache/accounts/avatars/001/081/104/original/230a8d0fb54e249b.png\",\"avatar_static\":\"https://files.mastodon.social/cache/accounts/avatars/001/081/104/original/230a8d0fb54e249b.png\",\"header\":\"https://static-cdn.mastodon.social/headers/original/missing.png\",\"header_static\":\"https://static-cdn.mastodon.social/headers/original/missing.png\",\"followers_count\":150,\"following_count\":216,\"statuses_count\":2447,\"last_status_at\":\"2022-08-30\",\"emojis\":[],\"fields\":[{\"name\":\"📍 Lives in:\",\"value\":\"Switzerland (CET: UTC+1 or CEST: UTC+2)\",\"verified_at\":null},{\"name\":\"🔑 GPG  key:\",\"value\":\"EA23 42C5 3EBF 2A2D 985C  416A 12AC 3D47 52E2 FA2E\",\"verified_at\":null}]},\"media_attachments\":[],\"mentions\":[],\"tags\":[],\"emojis\":[],\"card\":null,\"poll\":null}"
}

event, which should always be present, holds the event type. Depending on the event type, payload may or may not be present; when it is, it carries the data for that event. For delete, for example, it holds the ID of the deleted status. For update, it contains the JSON representation of the newly created Status.

I tried to use kotlinx.serialization’s polymorphism feature but failed: the payload arrives as an escaped string that may contain either JSON or a plain string, so I would have needed to unescape it first, and I did not find a way to get that working. 😞

For that reason, I introduced the RawStreamEvent class and the ParsedStreamEvent sealed interface with implementations for each event type. For each web socket message, before invoking the callback, I call the RawStreamEvent.toStreamEvent extension function, which takes the string payload and parses it into one of the ParsedStreamEvent-implementing data objects/classes, depending on the event type.
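To make the two-step idea concrete, here is a minimal sketch of the raw-to-parsed mapping. Only the names RawStreamEvent, ParsedStreamEvent and toStreamEvent come from the description above; the fields, the implementing classes (StatusCreated, StatusDeleted, FiltersChanged, Unknown) and the handling of unknown events are assumptions for illustration. To stay dependency-free, the sketch keeps the status JSON as a string instead of deserializing it into a Status.

```kotlin
// Raw envelope as received over the web socket; payload is still an escaped string.
data class RawStreamEvent(
    val stream: List<String>,
    val event: String,
    val payload: String? = null
)

// Hypothetical implementations; the PR defines one per Mastodon event type.
sealed interface ParsedStreamEvent {
    data class StatusCreated(val statusJson: String) : ParsedStreamEvent
    data class StatusDeleted(val deletedStatusId: String) : ParsedStreamEvent
    object FiltersChanged : ParsedStreamEvent
    data class Unknown(val event: String, val payload: String?) : ParsedStreamEvent
}

// Picks the target type from the event name, then interprets the payload accordingly.
fun RawStreamEvent.toStreamEvent(): ParsedStreamEvent {
    val body = payload
    return when {
        // filters_changed carries no payload at all.
        event == "filters_changed" -> ParsedStreamEvent.FiltersChanged
        // update: payload is the JSON of a Status (kept as a string in this sketch).
        event == "update" && body != null -> ParsedStreamEvent.StatusCreated(statusJson = body)
        // delete: payload is just the ID of the deleted status.
        event == "delete" && body != null -> ParsedStreamEvent.StatusDeleted(deletedStatusId = body)
        // Anything unrecognised or malformed is passed along untouched.
        else -> ParsedStreamEvent.Unknown(event, body)
    }
}
```

Because the interface is sealed, a caller's `when` over the parsed event can be checked for exhaustiveness by the compiler, which is the main benefit over handing callers the raw string.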

Type of Change

  • New feature
  • Documentation

Breaking Changes

With the replacement came loads of breaking changes. I’ve replaced the previous Handler and Shutdownable with a leaner callback and an extension of Closeable, so now the signatures, while similar, are actually quite different for callers.
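For callers, the shape of the new pattern is roughly "pass a callback, get back something closeable". The sketch below illustrates only that shape with an in-memory stand-in; EventCallback, InMemoryStream and emit are hypothetical names, and the actual signatures in the PR may differ.

```kotlin
import java.io.Closeable

// Hypothetical callback type: invoked once per received event.
fun interface EventCallback {
    fun onEvent(event: String)
}

// In-memory stand-in for a stream handle; a real one would wrap the web socket.
class InMemoryStream(private val callback: EventCallback) : Closeable {
    private var open = true

    // Delivers an event to the callback unless the stream was closed.
    fun emit(event: String) {
        if (open) callback.onEvent(event)
    }

    // Stops further delivery; a real implementation would close the socket here.
    override fun close() {
        open = false
    }
}

// Collects events through the callback, then closes the stream via use {}.
fun collectDemoEvents(): List<String> {
    val received = mutableListOf<String>()
    val stream = InMemoryStream { event -> received.add(event) }
    stream.use {
        it.emit("update")
        it.emit("delete")
    }
    stream.emit("ignored after close")
    return received
}
```

Returning a Closeable lets callers rely on Kotlin's `use` to release the connection, instead of having to remember a separate shutdown call.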

It’s no longer necessary (or even possible) to call useStreamingApi when building the MastodonClient.

How Has This Been Tested?

I have tried to get okhttp’s mockwebserver and also fabric8io’s improved variant to run, but failed, unfortunately.

Initially, I had wanted to implement tests that would emulate a server serving a websocket connection so that I could test multiple different scenarios.

As that didn’t work out, I implemented the tests using mockk like we do with all other endpoint tests.

Mandatory Checklist

  • I ran gradle check and there were no errors reported
  • I have performed a self-review of my code
  • I have added tests that prove my fix is effective or that my feature works
  • All tests pass locally with my changes
  • I have added KDoc documentation to all public methods

Optional Things To Check

The items below are some more things to check before asking other people to review your code.

  • In case you worked on a new feature: Did you also implement the reactive endpoint (bigbone-rx)?
  • Did you also update the documentation in the /docs folder (e.g. API Coverage page)?

@PattaFeuFeu PattaFeuFeu added the documentation, enhancement, and breaking labels Dec 9, 2023
@PattaFeuFeu PattaFeuFeu self-assigned this Dec 9, 2023

codecov bot commented Dec 9, 2023

Codecov Report

Attention: 161 lines in your changes are missing coverage. Please review.

Comparison is base (7865f1b) 48.33% compared to head (eb70dda) 50.20%.

Additional details and impacted files
@@             Coverage Diff              @@
##             master     #383      +/-   ##
============================================
+ Coverage     48.33%   50.20%   +1.86%     
- Complexity      541      565      +24     
============================================
  Files           140      143       +3     
  Lines          3906     3928      +22     
  Branches        259      253       -6     
============================================
+ Hits           1888     1972      +84     
+ Misses         1825     1740      -85     
- Partials        193      216      +23     
Files Coverage Δ
.../main/kotlin/social/bigbone/api/entity/Reaction.kt 7.69% <ø> (ø)
.../social/bigbone/api/entity/streaming/StreamType.kt 100.00% <100.00%> (ø)
.../entity/streaming/StreamingAnnouncementReaction.kt 0.00% <0.00%> (ø)
...ial/bigbone/api/entity/streaming/WebSocketEvent.kt 22.22% <22.22%> (ø)
...ial/bigbone/api/entity/streaming/RawStreamEvent.kt 0.00% <0.00%> (ø)
...ain/kotlin/social/bigbone/rx/RxStreamingMethods.kt 42.50% <42.85%> (+42.50%) ⬆️
...tlin/social/bigbone/api/method/StreamingMethods.kt 61.53% <59.18%> (+61.53%) ⬆️
.../bigbone/api/entity/streaming/ParsedStreamEvent.kt 10.00% <10.00%> (ø)
...e/src/main/kotlin/social/bigbone/MastodonClient.kt 34.19% <22.82%> (-4.29%) ⬇️

@PattaFeuFeu PattaFeuFeu force-pushed the improve-streaming-apis branch 4 times, most recently from 23e7ecc to 0fb76ce Compare December 16, 2023 01:46
@PattaFeuFeu PattaFeuFeu force-pushed the improve-streaming-apis branch 3 times, most recently from 0525c33 to 9df9c23 Compare December 17, 2023 01:49
@PattaFeuFeu PattaFeuFeu marked this pull request as ready for review December 17, 2023 02:45
@PattaFeuFeu PattaFeuFeu changed the title Improve streaming APIs Rewrite streaming API to use websocket Dec 17, 2023
@PattaFeuFeu PattaFeuFeu force-pushed the improve-streaming-apis branch 2 times, most recently from fdafe91 to 5af79ca Compare December 17, 2023 13:34
@PattaFeuFeu PattaFeuFeu force-pushed the improve-streaming-apis branch from 5af79ca to 47eef95 Compare December 17, 2023 13:36
Collaborator

@bocops bocops left a comment

Looks great already, much better than before. I've added some comments, mostly regarding documentation.

@PattaFeuFeu
Collaborator Author

@bocops In the stream method in MastodonClient, I had added a check for whether streamingUrl is null. If it is, I fall back to our usual URL schema containing the scheme, instanceName, and port plus the stream URL.

Should we nudge users toward using useStreamingApi? The way the Builder is set up currently, streamingUrl would only be null if useStreamingApi hadn’t been called.
If it was called, streamingUrl should not be null as we fall back to the usual instance-based URL anyway.

Should we throw an error if users call stream without having built the client with useStreamingApi?
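The two options under discussion can be sketched side by side. Both function names and their parameters are hypothetical and only mirror the description above (scheme, instanceName, port, streamingUrl); the streaming path itself would still be appended by the caller.

```kotlin
// Option 1 (current behaviour as described): fall back to the instance-based URL
// when no dedicated streaming URL was retrieved.
fun resolveStreamingUrl(
    streamingUrl: String?,
    scheme: String,
    instanceName: String,
    port: Int
): String = streamingUrl ?: "$scheme://$instanceName:$port"

// Option 2 (the stricter alternative): refuse to stream when the client was
// built without retrieving a streaming URL.
fun requireStreamingUrl(streamingUrl: String?): String =
    checkNotNull(streamingUrl) {
        "Streaming URL unknown; build the client with useStreamingApi first"
    }
```

Option 1 keeps working for servers that stream from their regular host but fails late, at connection time, for servers that do not; option 2 fails early with a message that tells the caller what to change.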

@PattaFeuFeu PattaFeuFeu requested a review from bocops December 18, 2023 13:55
Collaborator

@bocops bocops left a comment

Looks good! :)

@PattaFeuFeu
Collaborator Author

@bocops Any opinion on my comment above? 🤓

@bocops
Collaborator

bocops commented Dec 19, 2023

I wonder if useStreamingApi() isn't just an unnecessary hoop we're making library users jump through? All it currently does is to increase the read timeout, but that would also affect all non-streaming calls performed by the same client, potentially decreasing performance.

I've not used the streaming API so far, so I don't know if increasing this value is strictly necessary. Do you?

If it is, we should keep a function like this, but probably(?) also advise users against using the same client to access streaming and non-streaming APIs? In this case, we could throw an error as you suggest.

If it is not necessary, we could also remove the function, but advise users to use setReadTimeoutSeconds() if necessary?

@PattaFeuFeu
Collaborator Author

> I wonder if useStreamingApi() isn't just an unnecessary hoop we're making library users jump through?

Probably, yes.

> All it currently does is to increase the read timeout, but that would also affect all non-streaming calls performed by the same client, potentially decreasing performance.

Not entirely true: We also get the streaming URL from the Instance API on MastodonClient instantiation. If users do not call useStreamingApi, the stream method would only work for servers that use the same URL for websocket streaming as they do for all other API matters.

So if we were to remove the useStreamingApi method from the builder, the only downside would be that users that won’t use the streaming API would also have one more API call to get the streaming URL. Shouldn’t be that much of an issue, though.

> I've not used the streaming API so far, so I don't know if increasing this value is strictly necessary. Do you?

I don’t know either, no. But I can find out. 👍

Ideally, I’d like to find a solution that works in all cases for all users. In the worst case, we could also use per-call settings that would differ from our usual settings, and apply them only to those stream calls.

@bocops
Collaborator

bocops commented Dec 19, 2023

Sorry, I was quickly looking at the current state instead of your changes, and thus missed the URL retrieval. I understand that you don't want to tackle #114 here, but it might still be useful to think about how MastodonClient.Builder might eventually look.

If, in the long term, we do want to get instance information whenever a client is built, then the streaming URL will always be available without additional overhead, and there would again be less need for a useStreamingApi() call. In that case, it might be sensible to just always get that URL now and worry about optimization later, and not force users to use a call that might be gone soon.

Owner

@andregasser andregasser left a comment

Thanks, this is a real improvement! 👍

- Gets the streaming URL for every MastodonClient instance
- Sets the pingInterval instead of a timeout, but only for stream calls
@PattaFeuFeu
Collaborator Author

@andregasser @bocops Having read square/okhttp#3197 and square/okhttp#3227 and https://square.github.io/okhttp/3.x/okhttp/index.html?okhttp3/OkHttpClient.Builder.html#pingInterval-long-java.util.concurrent.TimeUnit-, I think we should be able to go without the read timeout increase and instead set the pingInterval. I’ve done that in commit 91ba533. Please check that one out. 😊
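A minimal sketch of what "pingInterval only for stream calls" could look like with okhttp, assuming okhttp 4.x on the classpath. The extension name forStreaming and the 30-second default are assumptions for illustration and are not taken from commit 91ba533.

```kotlin
import java.util.concurrent.TimeUnit
import okhttp3.OkHttpClient

// Hypothetical helper: derives a client for web socket calls from the regular one.
// newBuilder() reuses the original client's connection pool and dispatcher, and the
// original client's own settings (including its read timeout) stay untouched.
fun OkHttpClient.forStreaming(pingIntervalSeconds: Long = 30L): OkHttpClient =
    newBuilder()
        // Send web socket pings at this interval so a dead connection is noticed.
        .pingInterval(pingIntervalSeconds, TimeUnit.SECONDS)
        .build()
```

Deriving a second client this way keeps regular API calls on their default configuration while only the streaming calls get the ping interval.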

@andregasser
Owner

> I think we should be able to go without the read timeout increase and instead set the pingInterval. ... Please check that one out. 😊

I think this is a good way to go 👍 It's simpler and cleaner. Looking forward to doing some manual testing with this.

@PattaFeuFeu PattaFeuFeu merged commit 1750db8 into master Dec 20, 2023
5 checks passed
@PattaFeuFeu PattaFeuFeu deleted the improve-streaming-apis branch December 20, 2023 09:52
Labels
breaking Incompatible with previous versions documentation Improvements or additions to documentation or sample code enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Refactor Streaming API
3 participants