Skip to content
This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

_handleServerNotification on Connection.ts has an unhandled error. #3053

Open
jpablogn opened this issue Aug 2, 2024 · 4 comments
Open

_handleServerNotification on Connection.ts has an unhandled error. #3053

jpablogn opened this issue Aug 2, 2024 · 4 comments
Labels
1.x Pertains to the v1.x line on the maintenance/v1.x branch bug Something isn't working

Comments

@jpablogn
Copy link

jpablogn commented Aug 2, 2024

My app uses connect.onAccountChange() and connect.removeAccountChangeListener() and is keeps open 100-200 subscritions.

Main libraries:

  • @solana/web3.js ^1.91.7
  • rpc-websockets ^7.5.1

The app crashes when it tries to use subscribe/unsubscribe using onAccountChange()/removeAccountChangeListener() and the Websocket is not with readyState===1.

Fail when trying to subscribe (it goes in an infinite loop with all the accounts handled at that moment):

accountSubscribe error for argument [
  'F3TuARNBKV6U7uA4DvLZUUgmcAAEtpDeMyC36Zb1CPnR',
  { encoding: 'base64', commitment: 'confirmed' }
] Tried to call a JSON-RPC method `accountSubscribe` but the socket was not `CONNECTING` or `OPEN` (`readyState` was 3)

Fail when trying to unsubscribe (it goes in an infinite loop with all the accounts handled at that moment):

accountUnsubscribe error: Tried to call a JSON-RPC method `accountUnsubscribe` but the socket was not `CONNECTING` or `OPEN` (`readyState` was 2)

And finally crashes:

<--- Last few GCs --->

[1996:0000021E09A48EE0] 19434569 ms: Mark-Compact 4047.4 (4134.3) -> 4032.6 (4135.3) MB, 1414.30 / 0.00 ms  (average mu = 0.253, current mu = 0.245) allocation failure; scavenge might not succeed
[1996:0000021E09A48EE0] 19436455 ms: Mark-Compact 4048.4 (4135.3) -> 4033.8 (4136.6) MB, 1735.41 / 0.00 ms  (average mu = 0.169, current mu = 0.080) allocation failure; scavenge might not succeed


<--- JS stacktrace --->

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
----- Native stack trace -----

 1: 00007FF6430BC81B node::SetCppgcReference+17979
 2: 00007FF643026674 DSA_meth_get_flags+89316
 3: 00007FF643AA4871 v8::Isolate::ReportExternalAllocationLimitReached+65
 4: 00007FF643A8DFC8 v8::Function::Experimental_IsNopFunction+1336
 5: 00007FF6438EFA70 v8::Platform::SystemClockTimeMillis+659328
 6: 00007FF6438FBCF3 v8::Platform::SystemClockTimeMillis+709123
 7: 00007FF6438F9654 v8::Platform::SystemClockTimeMillis+699236
 8: 00007FF6438EC790 v8::Platform::SystemClockTimeMillis+646304
 9: 00007FF643901E0A v8::Platform::SystemClockTimeMillis+733978
10: 00007FF643902687 v8::Platform::SystemClockTimeMillis+736151
11: 00007FF643910F7F v8::Platform::SystemClockTimeMillis+795791
12: 00007FF6435D1CA5 v8::CodeEvent::GetFunctionName+116773

As far as WebSocket.readyState is not visible, I have tried to catch the error in different ways but I can't catch it.

Option 1: try-catch

const removeListener = async (id:number) =>{

    let ok:boolean = false;
    let n_times:number = 1;

    do{

        try{
            await connection.removeAccountChangeListener(id);
            ok = true;
        }catch(error){
            ok = false;
        }            

        if (!ok){
            await delay(1000); // custom function to delay 1s
            console.log(`Try: ${n_times}, Nop: ${id}`);
            n_times++;
        }else{
            console.log(`Try: ${n_times}, Disconnected: ${id}`);
        }

    }while(!ok);

}

Option 2: catch chained

const removeListener = async (id:number) =>{

    let ok:boolean = false;
    let n_times:number = 1;

    do{
        ok = true;
        await connection.removeAccountChangeListener(id).catch(error){
            ok = false;
        }            

        if (!ok){
            await delay(1000); // custom function to delay 1s
            console.log(`Try: ${n_times}, Nop: ${id}`);
            n_times++;
        }else{
            console.log(`Try: ${n_times}, Disconnected: ${id}`);
        }

    }while(!ok);

}

These errors are in file "rpc-websocket.ts" call and notify reject funcions:

      new Error(
        'Tried to call a JSON-RPC method `' +
          args[0] +
          '` but the socket was not `CONNECTING` or `OPEN` (`readyState` was ' +
          readyState +
          ')',
      ),

But I think the problem could came from 'connection.ts' which has this piece of code with a no controlled catch, because I can't catch the error:

  /**
   * @internal
   */
  private _handleServerNotification<
    TCallback extends SubscriptionConfig['callback'],
  >(
    serverSubscriptionId: ServerSubscriptionId,
    callbackArgs: Parameters<TCallback>,
  ): void {
    const callbacks =
      this._subscriptionCallbacksByServerSubscriptionId[serverSubscriptionId];
    if (callbacks === undefined) {
      return;
    }
    callbacks.forEach(cb => {
      try {
        cb(
          // I failed to find a way to convince TypeScript that `cb` is of type
          // `TCallback` which is certainly compatible with `Parameters<TCallback>`.
          // See https://github.com/microsoft/TypeScript/issues/47615
          // @ts-ignore
          ...callbackArgs,
        );
      } catch (e) {
        console.error(e);
      }
    });
  }

Few questions open en Solana Stack Exchange:
https://solana.stackexchange.com/questions/15643/accountunsubscribe-and-accountsubscribe-error-tried-to-call-a-json-rpc-method-w
https://solana.stackexchange.com/questions/15698/solana-web3-js-it-is-not-possible-to-detect-websocket-error

Could it be a bug?
How can I handle or catch this error?
How can I check websocket.readyStatus to avoid subscribe/unsubscribe when it is not in OPEN or CONNECTING?

@jpablogn jpablogn added the bug Something isn't working label Aug 2, 2024
@solana-labs solana-labs deleted a comment Aug 6, 2024
@steveluscher
Copy link
Contributor

Subscribe errors are caught here: https://github.com/solana-labs/solana-web3.js/blob/maintenance/v1.x/src/connection.ts#L6255-L6270

Unsubscribe errors are caught here: https://github.com/solana-labs/solana-web3.js/blob/maintenance/v1.x/src/connection.ts#L6310-L6322

In neither place are these errors surfaced to your application.

It's possible that a check for readyState could be added to isCurrentConnectionStillActive(). Could you try that, and send a PR if it works?

All things considered, this subscriptions API and implementation are terminally flawed. We've completely rewritten it in 2.0, and I encourage you to migrate your app now. https://x.com/anza_xyz/status/1819093864394764674

@jpablogn
Copy link
Author

jpablogn commented Aug 7, 2024

Yes, in fact that is the problem, but I don't know how to solve it.

If the program arrives to the line where is written "// TODO: Maybe add an 'errored' state or a retry limit?" (lines: 6.316 or 6.265) it goes to an infinite loop that crash the javascript stack.

I have modified rpc-websocket.ts to have visibility of readyState like this:

readyState(){
    // websocket CONNECTING (0) or OPEN (1)
    return ((this.underlyingSocket?.readyState === 1) || (this.underlyingSocket?.readyState === 0));
}

I have tried three different things without success:
Case 1: Include this piece of code in line 6.193:

if(!this._rpcWebSocket.readyState()){
      this._rpcWebSocket.connect();
      return;
}

Case 2. Add it to isCurrentConnectionStillActive() like this:

const activeWebSocketGeneration = this._rpcWebSocketGeneration;
    const isCurrentConnectionStillActive = () => {
      return ((activeWebSocketGeneration === this._rpcWebSocketGeneration) && (this._rpcWebSocket.readyState()));
 };

Case 3. Add this piece of code in lines 6.316 and 6.265:

if (!this._rpcWebSocket.readyState()){
   return;
}

Running the test npm run test:test-with-live-validator gives following results:

  1. Source code (without changes):
  490 passing (1m)
  3 pending
  7 failing
  1. Case 1:
  310 passing (1m)
  3 pending
  62 failing
  1. Case 2:
  476 passing (1m)
  3 pending
  21 failing
  1. Case 3:
  474 passing (1m)
  3 pending
  23 failing

Running npm run test:unit:node gives following results:

  1. Source code (without changes):
  471 passing (1s)
  1. Case 1:
  Tests not finished
  1. Case 2:
 Test not finished
  1. Case 3:
 Test not finished

Adding a return just in line 6311 passes all test successfully but I don't know the real effect (as far as the original code also passes al of them but fails in produccion when doing a websocket call when readyState != 1):

if (e instanceof Error) {
   console.error(`${unsubscribeMethod} error:`, e.message);
   return;
}

I will test it over a real server because I don't have the knowledge to do the test.

By the way I think lines 6.297-6.299 are repeated and can be deleted:
https://github.com/solana-labs/solana-web3.js/blob/maintenance/v1.x/src/connection.ts#L6297-L6304

Using a totaly refactored library means investing lots of hours in learning how it works (take into account that there are not many examples published to use as reference), for a big project may have sense, for an small one may be it is faster to do your own small library instead of trying to learn how such a complex one works (talking from my ignorant point of view).

Thank you.

@jpablogn
Copy link
Author

jpablogn commented Aug 9, 2024

This in not working fine in my code, it just gives back undefined.... :)

readyState(){
    // websocket CONNECTING (0) or OPEN (1)
    return this.underlyingSocket?.readyState;
}

But even in that case, this code:

} catch (e) {
      if (e instanceof Error) {
        console.error(`${unsubscribeMethod} error:`, e.message);
      }

      // TODO: Maybe add an 'errored' state or a retry limit?
      this._setSubscription(hash, {...subscription, state: 'subscribed'});    

      console.log(`${`-----> hash: ${aux}, ws.connected: ${this._rpcWebSocketConnected}, ` +
        `ws.readyState: ${this._rpcWebSocket.readyState()}, ` +
        `isActive: ${isCurrentConnectionStillActive()}`);    

      await this._updateSubscriptions();
}

prints: true, undefined, true

@steveluscher steveluscher added the 1.x Pertains to the v1.x line on the maintenance/v1.x branch label Oct 24, 2024
@jpablogn
Copy link
Author

jpablogn commented Nov 22, 2024 via email

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
1.x Pertains to the v1.x line on the maintenance/v1.x branch bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants