Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add controller-isleader xweb component to respond with 200 if the controller in question is a raft leader #2581

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

nenkoru
Copy link

@nenkoru nenkoru commented Dec 9, 2024

closes(backref) #2582

@nenkoru
Copy link
Author

nenkoru commented Dec 10, 2024

moved from /sys/health to a /controller/health with changes made to make it more generic

@plorenz
Copy link
Member

plorenz commented Dec 13, 2024

Hi @nenkoru , before going too deep with this PR, we require a CLA before accepting contributions. Information on that is here: https://openziti.io/policies/CONTRIBUTING.html#code-contributions
Let us know if you're ok with submitting that.

@nenkoru
Copy link
Author

nenkoru commented Dec 17, 2024

Hi @nenkoru , before going too deep with this PR, we require a CLA before accepting contributions. Information on that is here: https://openziti.io/policies/CONTRIBUTING.html#code-contributions Let us know if you're ok with submitting that.

Hey, @plorenz!
Sent a signed ICLA to [email protected] yesterday.
Can't wait for processing to complete!

@plorenz
Copy link
Member

plorenz commented Dec 18, 2024

Hi @nenkoru , before going too deep with this PR, we require a CLA before accepting contributions. Information on that is here: https://openziti.io/policies/CONTRIBUTING.html#code-contributions Let us know if you're ok with submitting that.

Hey, @plorenz! Sent a signed ICLA to [email protected] yesterday. Can't wait for processing to complete!

CLA looks good 👍

Having looked at it again, I think it looks overall. The only 2 thing I'd like changed are:

  1. A flag to disable the 429 return so by default the results will be backwards compabible
  2. This should replace the existing health checker, rather than add to it. I'd rather have one flexible endpoint than multiple, since they are largely the same.

Let me know if those make sense.

@nenkoru
Copy link
Author

nenkoru commented Dec 18, 2024

Hi @nenkoru , before going too deep with this PR, we require a CLA before accepting contributions. Information on that is here: https://openziti.io/policies/CONTRIBUTING.html#code-contributions Let us know if you're ok with submitting that.

Hey, @plorenz! Sent a signed ICLA to [email protected] yesterday. Can't wait for processing to complete!

CLA looks good 👍

Having looked at it again, I think it looks overall. The only 2 thing I'd like changed are:

  1. A flag to disable the 429 return so by default the results will be backwards compabible
  2. This should replace the existing health checker, rather than add to it. I'd rather have one flexible endpoint than multiple, since they are largely the same.

Let me know if those make sense.

I think it gets complicated with moving this logic to a health-checks handler.
We need that controller env within a healtcheck, so we have to pass it into the structure of health-check, make it depend on the controller/env.
But router.go https://github.com/openziti/ziti/blob/main/router/router.go#L274 uses the same health-check and we have to pass the env, but it's impossible due to different types (controller vs router)

We might also opt for moving this heath-check to the actual gosundheit custom check https://github.com/openziti/ziti/blob/main/controller/health.go#L15, and try to find the result of the check within that health-check loop at https://github.com/openziti/ziti/blob/main/common/health/handler.go#L99 and decide whether to return 429 status code there.

Or we could just make health-checks specific for controller.

Thoughts?

@plorenz
Copy link
Member

plorenz commented Dec 19, 2024

Or we could just make health-checks specific for controller.

Yes, this was the direction I was thinking. Leave the router health check as it is and replace the use of the common health checker with this, controller specific one.

So the only changes required would be to

  1. remove the old of the existing health checker for the controller
  2. ensure that the new health check code is backwards compatible (uses the same paths, returns, etc, unless an indicator (request param or path) says that we should do the 429 return if not leader

@nenkoru
Copy link
Author

nenkoru commented Dec 24, 2024

Or we could just make health-checks specific for controller.

Yes, this was the direction I was thinking. Leave the router health check as it is and replace the use of the common health checker with this, controller specific one.

So the only changes required would be to

  1. remove the old of the existing health checker for the controller
  2. ensure that the new health check code is backwards compatible (uses the same paths, returns, etc, unless an indicator (request param or path) says that we should do the 429 return if not leader

Made changes accordingly. Naming was hard, but I named this as enableRaftControllerCheck. This flag makes it to do two things: Return 429 if not a raft leader and raft is enabled & Return isLeader and isRaftEnabled within raft. Making this essentially a feature flag.

Eg response:

< HTTP/1.1 429 Too Many Requests
< Content-Type: application/json
< Date: Tue, 24 Dec 2024 21:47:33 GMT
< Content-Length: 408
<
{
    "data": {
        "checks": [
            {
                "details": null,
                "healthy": true,
                "id": "bolt.read",
                "lastCheckDuration": "17.334µs",
                "lastCheckTime": "2024-12-24T21:47:24Z"
            }
        ],
        "healthy": true
    },
    "meta": {},
    "raft": {
        "isLeader": false,
        "isRaftEnabled": true
    }
}

Also I skimmed the zititest but couldn't an appropriate place to put tests for this behaviour.

I guess after tests created I could unmark this PR as being a draft and we could continue iterating on this!
Tell me what you think.

@plorenz
Copy link
Member

plorenz commented Jan 7, 2025

Hi @nenkoru

Sorry for the late response, I was out for a couple of weeks at the end of the year.

It's looking pretty good. I only have a couple of questions.

First, can you make sure your commits are signed, please? (see https://docs.github.com/en/authentication/managing-commit-signature-verification/signing-commits).

Second, how would you feel about making the enableRaftControllerCheck based off either a query param or path difference? That would allow someone to use the same health check to monitor overall health as well as for detecting leadership changes. I can see someone wanting to do both.
Or were you thinking they'd run the health check at two different endpoints, one for each behavior?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants