Update spec to include open telemetry
benjamin-confino committed Apr 17, 2024
1 parent 475b3ac commit 5ab92d7
Showing 2 changed files with 56 additions and 30 deletions.
78 changes: 48 additions & 30 deletions spec/src/main/asciidoc/metrics.asciidoc
@@ -1,5 +1,5 @@
//
// Copyright (c) 2018-2020 Contributors to the Eclipse Foundation
// Copyright (c) 2018-2024 Contributors to the Eclipse Foundation
//
// See the NOTICE file(s) distributed with this work for additional
// information regarding copyright ownership.
@@ -18,12 +18,16 @@
// Contributors:
// Andrew Rouse
// Jan Bernitt
// Benjamin Confino

== Integration with MicroProfile Metrics
== Integration with MicroProfile Metrics and MicroProfile Telemetry

When Microprofile Fault Tolerance and Microprofile Metrics are used together, metrics are automatically added for each of
When MicroProfile Fault Tolerance is used together with MicroProfile Metrics or MicroProfile Telemetry, metrics are automatically added for each of
the methods annotated with a `@Retry`, `@Timeout`, `@CircuitBreaker`, `@Bulkhead` or `@Fallback` annotation.

If all three of MicroProfile Fault Tolerance, MicroProfile Metrics, and MicroProfile Telemetry are used together, then MicroProfile Fault Tolerance
exports metrics to both MicroProfile Metrics and MicroProfile Telemetry.

=== Names

The automatically added metrics follow a consistent pattern which includes the fully qualified name of the annotated method.
@@ -33,7 +37,9 @@ is non-portable and may vary between implementations. For portable behavior, mon

=== Scope

Metrics added by this specification will appear in the `base` MicroProfile Metrics scope.
In MicroProfile Metrics, metrics added by this specification will appear in the `base` MicroProfile Metrics scope.

In MicroProfile Telemetry, the generation of metrics is scoped to individual applications.

=== Registration

@@ -44,11 +50,12 @@ Policies that have been disabled through configuration do not cause registration

Implementations must ensure that if any of these annotations are present on a method, then the following metrics are added only once for that method.

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.invocations.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the method was called
| Tags
@@ -59,11 +66,12 @@ a| * `method` - the fully qualified method name

=== Metrics added for `@Retry`

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.retry.calls.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the retry logic was run. This will always be once per method call.
| Tags
@@ -72,11 +80,12 @@ a| * `method` - the fully qualified method name
* `retryResult` = `[valueReturned\|exceptionNotRetryable\|maxRetriesReached\|maxDurationReached]` - the reason that last attempt to call the method was not retried
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.retry.retries.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the method was retried
| Tags
@@ -85,23 +94,25 @@ a| * `method` - the fully qualified method name

=== Metrics added for `@Timeout`

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.timeout.calls.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the timeout logic was run. This will usually be once per method call, but may be zero times if the circuit breaker prevents execution or more than once if the method is retried.
| Tags
a| * `method` - the fully qualified method name
* `timedOut` = `[true\|false]` - whether the method call timed out
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.timeout.executionDuration`

| Type | `Histogram`
| Type in mpMetrics | `Histogram`
| Type in mpTelemetry | `LongHistogram`
| Unit | Nanoseconds
| Description | Histogram of execution times for the method
| Tags
@@ -110,11 +121,12 @@ a| * `method` - the fully qualified method name

=== Metrics added for `@CircuitBreaker`

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.circuitbreaker.calls.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the circuit breaker logic was run. This will usually be once per method call, but may be more than once if the method call is retried.
| Tags
@@ -125,11 +137,12 @@ a| * `method` - the fully qualified method name
** `circuitBreakerOpen` - the method did not run because the circuit breaker was in open or half-open state
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.circuitbreaker.state.total`

| Type | `Gauge<Long>`
| Type in mpMetrics | `Gauge<Long>`
| Type in mpTelemetry | `ObservableLongGauge`
| Unit | Nanoseconds
| Description | Amount of time the circuit breaker has spent in each state
| Tags
@@ -138,7 +151,7 @@ a| * `method` - the fully qualified method name
| Notes | Although this metric is a `Gauge`, its value increases monotonically.
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.circuitbreaker.opened.total`

@@ -151,57 +164,62 @@ a| * `method` - the fully qualified method name

=== Metrics added for `@Bulkhead`

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.bulkhead.calls.total`

| Type | `Counter`
| Type in mpMetrics | `Counter`
| Type in mpTelemetry | `LongCounter`
| Unit | None
| Description | The number of times the bulkhead logic was run. This will usually be once per method call, but may be zero times if the circuit breaker prevented execution or more than once if the method call is retried.
| Tags
a| * `method` - the fully qualified method name
* `bulkheadResult` = `[accepted\|rejected]` - whether the bulkhead allowed the method call to run
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.bulkhead.executionsRunning`

| Type | `Gauge<Long>`
| Type in mpMetrics | `Gauge<Long>`
| Type in mpTelemetry | `ObservableLongGauge`
| Unit | None
| Description | Number of currently running executions
| Tags
a| * `method` - the fully qualified method name
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.bulkhead.executionsWaiting`

| Type | `Gauge<Long>`
| Type in mpMetrics | `Gauge<Long>`
| Type in mpTelemetry | `ObservableLongGauge`
| Unit | None
| Description | Number of executions currently waiting in the queue
| Tags
a| * `method` - the fully qualified method name
| Notes | Only added if the method is also annotated with `@Asynchronous`
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.bulkhead.runningDuration`

| Type | `Histogram`
| Type in mpMetrics | `Histogram`
| Type in mpTelemetry | `LongHistogram`
| Unit | Nanoseconds
| Description | Histogram of the time that method executions spent running
| Tags
a| * `method` - the fully qualified method name
|===

[cols="1,5"]
[cols="2,4"]
|===
| Name | `ft.bulkhead.waitingDuration`

| Type | `Histogram`
| Type in mpMetrics | `Histogram`
| Type in mpTelemetry | `LongHistogram`
| Unit | Nanoseconds
| Description | Histogram of the time that method executions spent waiting in the queue
| Tags
@@ -213,7 +231,7 @@ a| * `method` - the fully qualified method name
=== Notes

Future versions of this specification may change the definitions of the metrics which are added to take advantage of
enhancements in the MicroProfile Metrics specification.
enhancements in the MicroProfile Metrics or MicroProfile Telemetry specification.

If more than one annotation is applied to a method, the metrics associated with each annotation will be added for that method.
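
Note (illustrative, not part of this commit): the "Type in mpMetrics" / "Type in mpTelemetry" rows added above pair up in only three ways. The Java sketch below collects those pairs in a small lookup, which may help when checking an implementation against the tables. The class `FtMetricTypes`, the enum `Kind`, and the method `kindOf` are hypothetical names chosen for this sketch; they are not defined by MicroProfile Fault Tolerance, Metrics, or Telemetry. `ft.circuitbreaker.opened.total` is left out because its type rows are collapsed in this diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative lookup only. FtMetricTypes, Kind and kindOf are hypothetical names,
// not part of MicroProfile Fault Tolerance, Metrics, or Telemetry.
final class FtMetricTypes {

    /** Type pairs as written in the "Type in mpMetrics" / "Type in mpTelemetry" rows. */
    enum Kind {
        COUNTER("Counter", "LongCounter"),
        HISTOGRAM("Histogram", "LongHistogram"),
        GAUGE("Gauge<Long>", "ObservableLongGauge");

        final String mpMetricsType;
        final String mpTelemetryType;

        Kind(String mpMetricsType, String mpTelemetryType) {
            this.mpMetricsType = mpMetricsType;
            this.mpTelemetryType = mpTelemetryType;
        }
    }

    /** Metric names whose type rows are visible in this diff, in document order. */
    static final Map<String, Kind> METRICS = new LinkedHashMap<>();

    static {
        METRICS.put("ft.invocations.total", Kind.COUNTER);
        METRICS.put("ft.retry.calls.total", Kind.COUNTER);
        METRICS.put("ft.retry.retries.total", Kind.COUNTER);
        METRICS.put("ft.timeout.calls.total", Kind.COUNTER);
        METRICS.put("ft.timeout.executionDuration", Kind.HISTOGRAM);
        METRICS.put("ft.circuitbreaker.calls.total", Kind.COUNTER);
        METRICS.put("ft.circuitbreaker.state.total", Kind.GAUGE);
        METRICS.put("ft.bulkhead.calls.total", Kind.COUNTER);
        METRICS.put("ft.bulkhead.executionsRunning", Kind.GAUGE);
        METRICS.put("ft.bulkhead.executionsWaiting", Kind.GAUGE);
        METRICS.put("ft.bulkhead.runningDuration", Kind.HISTOGRAM);
        METRICS.put("ft.bulkhead.waitingDuration", Kind.HISTOGRAM);
    }

    /** Returns the type pair for a metric name, or fails if the name is not in the map. */
    static Kind kindOf(String metricName) {
        Kind kind = METRICS.get(metricName);
        if (kind == null) {
            throw new IllegalArgumentException("Not listed in this diff: " + metricName);
        }
        return kind;
    }

    public static void main(String[] args) {
        // Print one line per metric: name, mpMetrics type, mpTelemetry type.
        METRICS.forEach((name, kind) ->
            System.out.println(name + ": " + kind.mpMetricsType + " -> " + kind.mpTelemetryType));
    }
}
```

For example, `FtMetricTypes.kindOf("ft.timeout.executionDuration").mpTelemetryType` yields `LongHistogram`, matching the `ft.timeout.executionDuration` table.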

8 changes: 8 additions & 0 deletions spec/src/main/asciidoc/relationship.asciidoc
@@ -68,3 +68,11 @@ The MicroProfile Metrics specification provides a way to monitor microservice in
* When `Timeout` is used, you would like to know how many times the method timed out.

Because of this requirement, when MicroProfile Fault Tolerance and MicroProfile Metrics are used together, metrics are automatically added for each of the methods annotated with a `@Retry`, `@Timeout`, `@CircuitBreaker`, `@Bulkhead` or `@Fallback` annotation.

=== Relationship to MicroProfile Telemetry
The MicroProfile Telemetry specification provides a way to monitor microservice invocations. It is also important to find out how Fault Tolerance policies are operating, e.g.

* When `Retry` is used, it is useful to know how many times a method was called and succeeded after retrying at least once.
* When `Timeout` is used, you would like to know how many times the method timed out.

Because of this requirement, when MicroProfile Fault Tolerance and MicroProfile Telemetry are used together, metrics are automatically added for each of the methods annotated with a `@Retry`, `@Timeout`, `@CircuitBreaker`, `@Bulkhead` or `@Fallback` annotation.
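
Note (illustrative, not part of this commit): the spec text in this commit says that when both MicroProfile Metrics and MicroProfile Telemetry are present, Fault Tolerance exports metrics to both. One way to picture that rule is a fan-out that forwards each recorded value to every available backend. The sketch below uses plain Java only; `FanOutDemo`, `CounterSink`, `RecordingSink`, `FanOutSink`, and the method name `com.example.MyBean.call` are all hypothetical and do not correspond to any MicroProfile or OpenTelemetry API. How a real implementation wires its backends is not specified here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every type in this class is a hypothetical stand-in,
// not an API from MicroProfile Fault Tolerance, Metrics, or Telemetry.
final class FanOutDemo {

    /** Minimal stand-in for "increment a counter tagged with the method name". */
    interface CounterSink {
        void increment(String metricName, String method);
    }

    /** Keeps increments in memory so the fan-out behaviour can be inspected. */
    static final class RecordingSink implements CounterSink {
        final String backend;
        final List<String> seen = new ArrayList<>();

        RecordingSink(String backend) {
            this.backend = backend;
        }

        @Override
        public void increment(String metricName, String method) {
            seen.add(backend + ":" + metricName + "{method=" + method + "}");
        }
    }

    /** Forwards each increment to every backend that is present. */
    static final class FanOutSink implements CounterSink {
        private final List<CounterSink> delegates;

        FanOutSink(List<CounterSink> delegates) {
            this.delegates = List.copyOf(delegates);
        }

        @Override
        public void increment(String metricName, String method) {
            for (CounterSink delegate : delegates) {
                delegate.increment(metricName, method);
            }
        }
    }

    public static void main(String[] args) {
        RecordingSink metrics = new RecordingSink("mpMetrics");
        RecordingSink telemetry = new RecordingSink("mpTelemetry");

        // Both backends present: the same increment reaches both.
        CounterSink sink = new FanOutSink(List.of(metrics, telemetry));
        sink.increment("ft.invocations.total", "com.example.MyBean.call");

        System.out.println(metrics.seen);
        System.out.println(telemetry.seen);
    }
}
```

If only one backend is available, the same `FanOutSink` can be built with a single delegate, which corresponds to the "Metrics or Telemetry" case in the spec text.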
