Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose prometheus metrics at ScaledJob like ScaledObject #4817

Closed
wants to merge 29 commits into from

Conversation

yoongon
Copy link
Contributor

@yoongon yoongon commented Jul 24, 2023

Provide a description of what has been changed
Expose prometheus metrics at ScaledJob like ScaledObject
Refer to : #4798

Checklist

Fixes #
#4798

Relates to #

@yoongon yoongon requested a review from a team as a code owner July 24, 2023 11:40
@github-actions
Copy link

Thank you for your contribution! 🙏 We will review your PR as soon as possible.

🏖️ Over the summer, the response time will be longer than usual due to maintainers taking time off so please bear with us.

While you are waiting, make sure to:

Learn more about:

@yoongon yoongon force-pushed the feature/scaler-prometheus branch from b182ddb to 9a45df4 Compare July 24, 2023 11:43
@yoongon yoongon changed the title Feature/scaler prometheus Expose prometheus metrics at ScaledJob like ScaledObject Jul 24, 2023
Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution!

As the goal is adding new metrics, we should create new counters here for them and update those counters instead of reusing already existing counters for ScaledObjects. Otherwise, It could be confusing

@yoongon
Copy link
Contributor Author

yoongon commented Jul 27, 2023

Thanks for the feedback. I will update the code again.

@yoongon yoongon force-pushed the feature/scaler-prometheus branch from 90e34f5 to 175badd Compare July 30, 2023 07:22
@yoongon
Copy link
Contributor Author

yoongon commented Jul 30, 2023

Thanks for the contribution!

As the goal is adding new metrics, we should create new counters here for them and update those counters instead of reusing already existing counters for ScaledObjects. Otherwise, It could be confusing

@JorTurFer
=> I updated the code with new counters for ScaledJob. Could you check again? Thanks.

@yoongon yoongon requested a review from JorTurFer July 31, 2023 00:14
Copy link
Member

@JorTurFer JorTurFer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't rename the already existing metrics because that's a breaking change and our deprecation policy requires 2 version between deprecation and removal.
I'm thinking in 2 different option for this:

  • Use a new label to inform if it's ScaledObject or ScaledJobs on metrics where the name isn't clear, such as 'keda_scaler_active' or keda_scaler_metrics_value, and duplicating the metrics with clear name, for example keda_scaled_object_errors with keda_scaled_job_errors
  • Duplicate all the metrics for the ScaledJobs

In both cases, we should maintain current metrics, but I'm not sure about which is the best approach for new metrics, I don't have any strong opinion but personally I prefer the labels option. WDYT @kedacore/keda-core-contributors ?

In any case, we have to add e2e test coverage to these metrics: https://github.com/kedacore/keda/blob/main/tests/sequential/prometheus_metrics/prometheus_metrics_test.go

Comment on lines 70 to 78
scaledJobScalerMetricsValue = prometheus.NewGaugeVec(
prometheus.GaugeOpts{
Namespace: DefaultPromMetricsNamespace,
Subsystem: "scaler",
Name: "metrics_latency",
Name: "scaledjob_metrics_value",
Help: "Metric Value used for HPA",
},
scaledJobmetricLabels,
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This metrics isn't needed. The ScaledJob doesn't generate an HPA

@yoongon
Copy link
Contributor Author

yoongon commented Aug 1, 2023

We can't rename the already existing metrics because that's a breaking change and our deprecation policy requires 2 version between deprecation and removal. I'm thinking in 2 different option for this:

  • Use a new label to inform if it's ScaledObject or ScaledJobs on metrics where the name isn't clear, such as 'keda_scaler_active' or keda_scaler_metrics_value, and duplicating the metrics with clear name, for example keda_scaled_object_errors with keda_scaled_job_errors
  • Duplicate all the metrics for the ScaledJobs

In both cases, we should maintain current metrics, but I'm not sure about which is the best approach for new metrics, I don't have any strong opinion but personally I prefer the labels option. WDYT @kedacore/keda-core-contributors ?

In any case, we have to add e2e test coverage to these metrics: https://github.com/kedacore/keda/blob/main/tests/sequential/prometheus_metrics/prometheus_metrics_test.go

You are right. I should have taken a look, because It conflicts with document(https://keda.sh/docs/2.11/operate/prometheus/).
.Thanks for the catch. I will take a look at better solution.

@yoongon
Copy link
Contributor Author

yoongon commented Aug 1, 2023

  • Use a new label to inform if it's ScaledObject or ScaledJobs on metrics where the name isn't clear, such as 'keda_scaler_active' or keda_scaler_metrics_value, and duplicating the metrics with clear name, for example keda_scaled_object_errors with keda_scaled_job_errors

@JorTurFer
I have a question for the above comment. Are you suggesting these steps?

1. Preserve existing metrics such as **keda_scaler_metrics_value** for ScaledObject
2. Creating new metrics like **keda_scaler_scaled_object_metrics_value**  and **keda_scaler_scaled_jobe_metrics_value**
3. Deprecate **keda_scaler_metrics_value**  in the future.

Do I understand right?

@JorTurFer
Copy link
Member

JorTurFer commented Aug 24, 2023

I have a question for the above comment. Are you suggesting these steps?

1. Preserve existing metrics such as **keda_scaler_metrics_value** for ScaledObject
2. Creating new metrics like **keda_scaler_scaled_object_metrics_value**  and **keda_scaler_scaled_jobe_metrics_value**
3. Deprecate **keda_scaler_metrics_value**  in the future.

Do I understand right?

Sorry for the slow response (the summer break xD)

No, I meant that metrics like keda_scaler_metrics_value (which doesn't say anything related with ScaledObject) can be extended using labels for adding the extra information about if it's from ScaledObject or ScaledJob, duplicate them it's not necessary.

For labels whose have scaledobject in the naming such us keda_scaled_object_errors, we can duplicate the for scaledjobs , for example keda_scaled_object_errors and keda_scaled_job_errors. Or maybe unifying both in something like keda_scalable_object_errors, using labels to clarify if the metric is for so or sj.

For example, this metric is the same for both, splitting them based on type a label: https://github.com/kedacore/keda/blob/main/pkg/prommetrics/prommetrics.go#L124-L132

@yoongon yoongon force-pushed the feature/scaler-prometheus branch from d664edf to 9459287 Compare August 26, 2023 07:51
@yoongon
Copy link
Contributor Author

yoongon commented Aug 26, 2023

@JorTurFer
Thanks for the feedback. It becomes much clear for me.
I updated. PLTAL.
Thanks!

@yoongon yoongon force-pushed the feature/scaler-prometheus branch from b6167c2 to e571d99 Compare August 26, 2023 08:10
JorTurFer and others added 16 commits August 26, 2023 18:26
…acore#4839)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Yoon Park <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Yoon Park <[email protected]>
Co-authored-by: Jorge Turrado <[email protected]>
Co-authored-by: Jorge Turrado <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
* Add testing strategy document

Signed-off-by: Jorge Turrado <[email protected]>
Signed-off-by: Jorge Turrado <[email protected]>

* Update TESTING.md

Signed-off-by: Jorge Turrado <[email protected]>

* Update TESTING.md

Signed-off-by: Jorge Turrado <[email protected]>

* add extra info to tests

Signed-off-by: Jorge Turrado <[email protected]>

---------

Signed-off-by: Jorge Turrado <[email protected]>
Signed-off-by: Jorge Turrado <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
Signed-off-by: SpiritZhou <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
`s/deployment/statefulset` in `pkg/scaling/resolver/scaling_resolver.go`
for the StatefulSet scaling resolver error message.

Signed-off-by: David Morrison <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update CHANGELOG.md

Co-authored-by: Jorge Turrado Ferrero <[email protected]>
Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update

Signed-off-by: SpiritZhou <[email protected]>

* Update CHANGELOG.md

Signed-off-by: SpiritZhou <[email protected]>

---------

Signed-off-by: SpiritZhou <[email protected]>
Co-authored-by: Jorge Turrado Ferrero <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
Signed-off-by: Yoon Park <[email protected]>
@yoongon yoongon force-pushed the feature/scaler-prometheus branch from 0fd9310 to 1e7c6b1 Compare August 26, 2023 09:27
@semgrep-app
Copy link

semgrep-app bot commented Aug 26, 2023

Semgrep found 8 third-party-action-not-pinned-to-commit-sha findings:

  • .github/workflows/pr-e2e.yml: L209, L191, L118, L75
  • .github/workflows/pr-e2e-creator.yml: L26, L17
  • .github/workflows/pr-e2e-checker.yml: L26, L17

An action sourced from a third-party repository on GitHub is not pinned to a full length commit SHA. Pinning an action to a full length commit SHA is currently the only way to use an action as an immutable release. Pinning to a particular SHA helps mitigate the risk of a bad actor adding a backdoor to the action's repository, as they would need to generate a SHA-1 collision for a valid Git object payload.

Ignore this finding from third-party-action-not-pinned-to-commit-sha.

@yoongon
Copy link
Contributor Author

yoongon commented Aug 26, 2023

Using GitHub sync break the changes.
I close this one and will file a new PR.

@yoongon yoongon closed this Aug 26, 2023
@yoongon
Copy link
Contributor Author

yoongon commented Aug 26, 2023

I opened a new PR here.
#4913

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.