feat(ourlogs): Allow log ingestion behind a flag #4448

Open
wants to merge 28 commits into
base: master
Changes from 4 commits
5fb547f
feat(ourlogs): Allow log ingestion behind a flag
k-fish Jan 15, 2025
97b7590
Update consts file
k-fish Jan 15, 2025
6dc83ef
Update changelog
k-fish Jan 15, 2025
3f6f6f9
Re-add flag to processing and filter
k-fish Jan 15, 2025
817a9fd
Update data category names
k-fish Jan 16, 2025
87f4882
feat(ourlogs): Add data categories for log ingestion
k-fish Jan 16, 2025
20e75e4
Add changelog
k-fish Jan 16, 2025
177d523
Merge branch 'feat/ourlogs/add-data-categories' into feat/ourlogs/ing…
k-fish Jan 16, 2025
7cbe5c9
Use enum
k-fish Jan 17, 2025
413338c
Default to stricter with pii on any user provided field
k-fish Jan 16, 2025
99b1d60
Remove extra drop
k-fish Jan 17, 2025
f92a726
Update relay-ourlogs/src/lib.rs
k-fish Jan 17, 2025
568bde8
Remove extra code for flag in process
k-fish Jan 17, 2025
077d50e
Add scrubbing
k-fish Jan 17, 2025
240b92b
Wrong type
k-fish Jan 17, 2025
153c0bc
Pass payload through as raw bytes
k-fish Jan 17, 2025
12127e1
Fix default topic test error
k-fish Jan 17, 2025
7c2bd4d
Fix serializing back out into AnyValue type format
k-fish Jan 17, 2025
bbe569d
Fix enforcing rate limits
k-fish Jan 17, 2025
1ab81d3
Update relay-server/src/services/processor/ourlog.rs
k-fish Jan 17, 2025
06b30de
Update relay-server/src/services/processor/ourlog.rs
k-fish Jan 17, 2025
5c01989
Update relay-ourlogs/src/ourlog.rs
k-fish Jan 17, 2025
0748d2c
Update relay-server/src/services/store.rs
k-fish Jan 17, 2025
188f0ed
Remove as_str
k-fish Jan 17, 2025
535022c
Add outcomes only after kafka produce
k-fish Jan 17, 2025
57f578f
Add integration test just for OTelLog for now
k-fish Jan 17, 2025
484e540
Clean up optional imports
k-fish Jan 17, 2025
b2c1613
Lint
k-fish Jan 17, 2025
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -5,6 +5,7 @@
**Internal**

- Updates performance score calculation on spans and events to also store cdf values as measurements. ([#4438](https://github.com/getsentry/relay/pull/4438))
- Allow log ingestion behind a flag, only for internal use currently. ([#4448](https://github.com/getsentry/relay/pull/4448))

## 24.12.2

16 changes: 16 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -43,6 +43,7 @@ relay-kafka = { path = "relay-kafka" }
relay-log = { path = "relay-log" }
relay-metrics = { path = "relay-metrics" }
relay-monitors = { path = "relay-monitors" }
relay-ourlogs = { path = "relay-ourlogs" }
relay-pattern = { path = "relay-pattern" }
relay-pii = { path = "relay-pii" }
relay-profiling = { path = "relay-profiling" }
3 changes: 2 additions & 1 deletion py/sentry_relay/consts.py
@@ -8,7 +8,6 @@


class DataCategory(IntEnum):
# begin generated
DEFAULT = 0
ERROR = 1
TRANSACTION = 2
@@ -32,6 +31,8 @@ class DataCategory(IntEnum):
REPLAY_VIDEO = 20
UPTIME = 21
ATTACHMENT_ITEM = 22
LOG_COUNT = 23
LOG_BYTES = 24
UNKNOWN = -1
# end generated

13 changes: 13 additions & 0 deletions relay-base-schema/src/data_category.rs
@@ -92,6 +92,15 @@ pub enum DataCategory {
Uptime = 21,
/// Counts the number of individual attachments, as opposed to the number of bytes in an attachment.
AttachmentItem = 22,
/// LogCount
///
/// This is the category for logs for which we store the count of log events for users for measuring
/// missing breadcrumbs, and the count of logs for rate limiting purposes.
LogCount = 23,
/// LogBytes
///
/// This is the category for logs for which we store log event total bytes for users.
LogBytes = 24,
//
// IMPORTANT: After adding a new entry to DataCategory, go to the `relay-cabi` subfolder and run
// `make header` to regenerate the C-binding. This allows using the data category from Python.
@@ -120,6 +129,8 @@ impl DataCategory {
"transaction_indexed" => Self::TransactionIndexed,
"monitor" => Self::Monitor,
"span" => Self::Span,
"log_count" => Self::LogCount,
"log_bytes" => Self::LogBytes,
"monitor_seat" => Self::MonitorSeat,
"feedback" => Self::UserReportV2,
"user_report_v2" => Self::UserReportV2,
@@ -152,6 +163,8 @@ impl DataCategory {
Self::TransactionIndexed => "transaction_indexed",
Self::Monitor => "monitor",
Self::Span => "span",
Self::LogCount => "log_count",
Self::LogBytes => "log_bytes",
Self::MonitorSeat => "monitor_seat",
Self::UserReportV2 => "feedback",
Self::MetricBucket => "metric_bucket",
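The two hunks above add the new categories to both directions of the name mapping. As a rough illustration of why both arms are needed, here is a minimal stand-in that round-trips the log categories. The method names `from_name` and `name`, the `"unknown"` wire name, and the reduced variant set are assumptions made for this sketch; the hunks do not show the function signatures.

```rust
/// Simplified stand-in for `DataCategory`, limited to the variants relevant here.
/// The numeric values are copied from the diff; everything else is illustrative.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(i8)]
pub enum DataCategory {
    LogCount = 23,
    LogBytes = 24,
    Unknown = -1,
}

impl DataCategory {
    /// Parses a wire name. Names this sketch does not know map to `Unknown`.
    pub fn from_name(name: &str) -> Self {
        match name {
            "log_count" => Self::LogCount,
            "log_bytes" => Self::LogBytes,
            _ => Self::Unknown,
        }
    }

    /// Returns the wire name. The "unknown" string is an assumption of this sketch.
    pub fn name(self) -> &'static str {
        match self {
            Self::LogCount => "log_count",
            Self::LogBytes => "log_bytes",
            Self::Unknown => "unknown",
        }
    }
}
```

If only one of the two match arms were added, a category would serialize to a name that parses back as `Unknown`, or the other way around, so a round-trip check over all variants is a cheap guard.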
17 changes: 15 additions & 2 deletions relay-cabi/include/relay.h
@@ -3,7 +3,7 @@
#ifndef RELAY_H_INCLUDED
#define RELAY_H_INCLUDED

/* Generated with cbindgen:0.26.0 */
/* Generated with cbindgen:0.27.0 */

/* Warning, this file is autogenerated. Do not modify this manually. */

@@ -142,6 +142,19 @@ enum RelayDataCategory {
* Counts the number of individual attachments, as opposed to the number of bytes in an attachment.
*/
RELAY_DATA_CATEGORY_ATTACHMENT_ITEM = 22,
/**
* LogCount
*
* This is the category for logs for which we store the count of log events for users for measuring
* missing breadcrumbs, and the count of logs for rate limiting purposes.
*/
RELAY_DATA_CATEGORY_LOG_COUNT = 23,
/**
* LogBytes
*
* This is the category for logs for which we store log event total bytes for users.
*/
RELAY_DATA_CATEGORY_LOG_BYTES = 24,
/**
* Any other data category not known by this Relay.
*/
@@ -679,4 +692,4 @@ struct RelayStr normalize_cardinality_limit_config(const struct RelayStr *value)
*/
struct RelayStr relay_normalize_global_config(const struct RelayStr *value);

#endif /* RELAY_H_INCLUDED */
#endif /* RELAY_H_INCLUDED */
3 changes: 3 additions & 0 deletions relay-cogs/src/lib.rs
@@ -117,6 +117,8 @@ pub enum AppFeature {
Transactions,
/// Errors.
Errors,
/// Logs.
Logs,
/// Spans.
Spans,
/// Sessions.
@@ -159,6 +161,7 @@ impl AppFeature {
Self::Transactions => "transactions",
Self::Errors => "errors",
Self::Spans => "spans",
Self::Logs => "our_logs",
Self::Sessions => "sessions",
Self::ClientReports => "client_reports",
Self::CheckIns => "check_ins",
8 changes: 8 additions & 0 deletions relay-config/src/config.rs
@@ -617,6 +617,8 @@ pub struct Limits {
/// The maximum payload size for a profile
pub max_profile_size: ByteSize,
/// The maximum payload size for a log.
pub max_log_size: ByteSize,
/// The maximum payload size for a span.
pub max_span_size: ByteSize,
/// The maximum payload size for a statsd metric.
pub max_statsd_size: ByteSize,
@@ -677,6 +679,7 @@ impl Default for Limits {
max_api_file_upload_size: ByteSize::mebibytes(40),
max_api_chunk_upload_size: ByteSize::mebibytes(100),
max_profile_size: ByteSize::mebibytes(50),
max_log_size: ByteSize::mebibytes(1),
max_span_size: ByteSize::mebibytes(1),
max_statsd_size: ByteSize::mebibytes(1),
max_metric_buckets_size: ByteSize::mebibytes(1),
@@ -2206,6 +2209,11 @@ impl Config {
self.values.limits.max_check_in_size.as_bytes()
}

/// Returns the maximum payload size of a log in bytes.
pub fn max_log_size(&self) -> usize {
self.values.limits.max_log_size.as_bytes()
}

/// Returns the maximum payload size of a span in bytes.
pub fn max_span_size(&self) -> usize {
self.values.limits.max_span_size.as_bytes()
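For orientation, the new `max_log_size` default of `ByteSize::mebibytes(1)` works out to 1,048,576 bytes. The sketch below shows how a limit like this might be applied to an incoming payload. The helper `log_fits` is hypothetical and not part of this diff, and the inclusive bound is an assumption.

```rust
/// Default log payload limit from the diff, `ByteSize::mebibytes(1)`, in bytes.
pub const DEFAULT_MAX_LOG_SIZE: usize = 1024 * 1024;

/// Hypothetical helper: returns true if the payload is within the limit.
/// Treating the limit as inclusive is an assumption of this sketch.
pub fn log_fits(payload: &[u8], max_log_size: usize) -> bool {
    payload.len() <= max_log_size
}
```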
6 changes: 5 additions & 1 deletion relay-dynamic-config/src/feature.rs
@@ -102,7 +102,11 @@ pub enum Feature {
/// Serialized as `organizations:ingest-spans-in-eap`
#[serde(rename = "organizations:ingest-spans-in-eap")]
IngestSpansInEap,

/// Enable log ingestion for our log product (this is not internal logging).
///
/// Serialized as `organizations:ourlogs-ingestion`.
#[serde(rename = "organizations:ourlogs-ingestion")]
OurLogsIngestion,
/// This feature has graduated and is hard-coded for external Relays.
#[doc(hidden)]
#[serde(rename = "projects:profiling-ingest-unsampled-profiles")]
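The PR title describes ingestion as being "behind a flag". The gating code itself is not part of the files shown here, so the following is only a sketch of the general shape such a check could take. `filter_logs` and `serialized_name` are hypothetical names, and the two-variant `Feature` enum is a stand-in for the real one; only the serialized strings are taken from the diff.

```rust
use std::collections::HashSet;

/// Two of the flags from the diff; the real `Feature` enum has many more variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum Feature {
    IngestSpansInEap,
    OurLogsIngestion,
}

impl Feature {
    /// Serialized names, copied from the `serde(rename)` attributes in the diff.
    pub fn serialized_name(self) -> &'static str {
        match self {
            Self::IngestSpansInEap => "organizations:ingest-spans-in-eap",
            Self::OurLogsIngestion => "organizations:ourlogs-ingestion",
        }
    }
}

/// Hypothetical gate: log items pass through only when the flag is enabled.
/// Without the flag, every log item is dropped.
pub fn filter_logs<T>(items: Vec<T>, enabled: &HashSet<Feature>) -> Vec<T> {
    if enabled.contains(&Feature::OurLogsIngestion) {
        items
    } else {
        Vec::new()
    }
}
```

Dropping by default keeps the feature invisible to organizations that have not been opted in, which matches the changelog note that ingestion is for internal use only at this stage.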
2 changes: 2 additions & 0 deletions relay-event-schema/src/processor/attrs.rs
@@ -46,6 +46,7 @@ pub enum ValueType {
Message,
Thread,
Breadcrumb,
OurLog,
Span,
ClientSdkInfo,

@@ -84,6 +85,7 @@ relay_common::derive_fromstr_and_display!(ValueType, UnknownValueTypeError, {
ValueType::Message => "message",
ValueType::Thread => "thread",
ValueType::Breadcrumb => "breadcrumb",
ValueType::OurLog => "ourlog",
ValueType::Span => "span",
ValueType::ClientSdkInfo => "sdk",
ValueType::Minidump => "minidump",
1 change: 1 addition & 0 deletions relay-event-schema/src/processor/traits.rs
@@ -108,6 +108,7 @@ pub trait Processor: Sized {
process_method!(process_breadcrumb, crate::protocol::Breadcrumb);
process_method!(process_template_info, crate::protocol::TemplateInfo);
process_method!(process_header_name, crate::protocol::HeaderName);
process_method!(process_ourlog, crate::protocol::OurLog);
process_method!(process_span, crate::protocol::Span);
process_method!(process_trace_context, crate::protocol::TraceContext);
process_method!(process_native_image_path, crate::protocol::NativeImagePath);
2 changes: 2 additions & 0 deletions relay-event-schema/src/protocol/mod.rs
@@ -18,6 +18,7 @@ mod mechanism;
mod metrics;
mod metrics_summary;
mod nel;
mod ourlog;
mod relay_info;
mod replay;
mod request;
@@ -54,6 +55,7 @@ pub use self::mechanism::*;
pub use self::metrics::*;
pub use self::metrics_summary::*;
pub use self::nel::*;
pub use self::ourlog::*;
pub use self::relay_info::*;
pub use self::replay::*;
pub use self::request::*;
138 changes: 138 additions & 0 deletions relay-event-schema/src/protocol/ourlog.rs
@@ -0,0 +1,138 @@
use relay_protocol::{Annotated, Empty, FromValue, IntoValue, Object, Value};

use crate::processor::ProcessValue;
use crate::protocol::{SpanId, TraceId};

#[derive(Clone, Debug, Default, PartialEq, Empty, FromValue, IntoValue, ProcessValue)]
#[metastructure(process_func = "process_ourlog", value_type = "OurLog")]
pub struct OurLog {
/// Time when the event occurred.
#[metastructure(required = true, trim = false)]
pub timestamp_nanos: Annotated<u64>,
Member

Other data types accept both unix timestamps and formatted date strings, although they do not have nanosecond precision:

/// Timestamp when the span was ended.
#[metastructure(required = true, trim = false)]
pub timestamp: Annotated<Timestamp>,

Are nanos required for logs?

Member Author

OTel defines them as nanos, the consumers will consume nanos, and they are stored as nanos. We may need breadcrumbs to switch from floats to nanos, but otherwise I think we are leaning towards keeping the same format throughout instead of having slightly different intermediate formats.
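To make the precision point in this thread concrete, here is a small sketch. It splits a nanosecond timestamp into seconds and a sub-second remainder, and shows that a detour through float seconds cannot preserve every nanosecond value, since an `f64` has only 53 bits of mantissa. Both function names are hypothetical and nothing here is taken from the PR.

```rust
/// Splits a Unix timestamp in nanoseconds into whole seconds and the
/// sub-second remainder in nanoseconds.
pub fn split_nanos(timestamp_nanos: u64) -> (u64, u32) {
    const NANOS_PER_SEC: u64 = 1_000_000_000;
    (
        timestamp_nanos / NANOS_PER_SEC,
        (timestamp_nanos % NANOS_PER_SEC) as u32,
    )
}

/// Converts to float seconds and back. For present-day timestamps the value
/// exceeds 2^53, so the result can differ from the input.
pub fn roundtrip_via_f64_seconds(timestamp_nanos: u64) -> u64 {
    let seconds = timestamp_nanos as f64 / 1e9;
    (seconds * 1e9).round() as u64
}
```

With the timestamp from the test payload, `split_nanos(1544712660300000000)` yields `(1544712660, 300000000)`. An odd value such as `1544712660300000001` does not survive `roundtrip_via_f64_seconds`, because every `f64` of that magnitude is an even integer.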


/// Time when the event was observed.
#[metastructure(required = true, trim = false)]
pub observed_timestamp_nanos: Annotated<u64>,

/// The ID of the trace the log belongs to.
#[metastructure(required = false, trim = false)]
pub trace_id: Annotated<TraceId>,
/// The Span id.
///
#[metastructure(required = false, trim = false)]
pub span_id: Annotated<SpanId>,

/// Trace flag bitfield.
#[metastructure(required = false)]
pub trace_flags: Annotated<f64>,

/// This is the original string representation of the severity as it is known at the source
#[metastructure(required = false, max_chars = 32, pii = "maybe", trim = false)]
pub severity_text: Annotated<String>,

/// Numerical representation of the severity level
#[metastructure(required = false)]
pub severity_number: Annotated<i64>,

/// Log body.
#[metastructure(required = true, pii = "maybe", trim = false)]
pub body: Annotated<String>,

/// Arbitrary attributes on a log.
#[metastructure(pii = "maybe", trim = false)]
pub attributes: Annotated<Object<AttributeValue>>,

/// Additional arbitrary fields for forwards compatibility.
#[metastructure(additional_properties, retain = true, pii = "maybe", trim = false)]
pub other: Object<Value>,
}

#[derive(Debug, Clone, Default, PartialEq, Empty, FromValue, IntoValue, ProcessValue)]
pub struct AttributeValue {
pub string_value: Annotated<Value>,
pub int_value: Annotated<Value>,
pub double_value: Annotated<Value>,
pub bool_value: Annotated<Value>,
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_ourlog_serialization() {
let json = r#"{
"timestamp_nanos": 1544712660300000000,
"observed_timestamp_nanos": 1544712660300000000,
"severity_number": 10,
"severity_text": "Information",
"trace_id": "5b8efff798038103d269b633813fc60c",
"span_id": "eee19b7ec3c1b174",
"body": "Example log record",
"attributes": {
"string.attribute": {
"string_value": "some string"
},
"boolean.attribute": {
"bool_value": true
},
"int.attribute": {
"int_value": 10
},
"double.attribute": {
"double_value": 637.704
}
Member

Is this actually the protocol we want (i.e. explicit types in the JSON key)? Wouldn't this be nicer?

  "attributes": {
    "string.attribute": "some string",
    "boolean.attribute": true,
    "int.attribute": 10,
    "double.attribute": 637.704,
  }

Or is it vital to distinguish between integers and floating points?

Member Author

k-fish, Jan 17, 2025

I'm of the opinion we want to keep this as close to OTel compatible as possible so we don't have any subtle impedance mismatch bugs, unless there is a strong reason otherwise (I haven't considered this deeply though so I can be convinced otherwise).

For reference, these follow AnyValue.
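One way to see what the explicit type keys buy: with a flat encoding, an integer `10` and a float `10.0` can end up as the same JSON text if the sender prints whole floats without a fractional part (JavaScript's `JSON.stringify` does this, and so does Rust's `Display` for `f64`, which this sketch uses). The code below is a simplified stand-in, not the PR's `AttributeValue` struct. `NumericAttribute` and its methods are hypothetical, and non-finite floats are not handled.

```rust
/// Stand-in for the numeric cases of an `AnyValue`-style attribute.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum NumericAttribute {
    Int(i64),
    Double(f64),
}

impl NumericAttribute {
    /// Prints the bare number. `Display` for `f64` prints 10.0 as "10".
    fn number(&self) -> String {
        match self {
            Self::Int(v) => v.to_string(),
            Self::Double(v) => v.to_string(),
        }
    }

    /// Flat form as proposed in the review comment: the value only.
    pub fn to_flat_json(&self) -> String {
        self.number()
    }

    /// Explicit-key form used in the test payload, e.g. `{"int_value":10}`.
    pub fn to_tagged_json(&self) -> String {
        let key = match self {
            Self::Int(_) => "int_value",
            Self::Double(_) => "double_value",
        };
        format!("{{\"{}\":{}}}", key, self.number())
    }
}
```

In the flat form, `Int(10)` and `Double(10.0)` both print as `10`, so a consumer cannot recover the original type. The tagged form keeps them apart at the cost of a more verbose payload.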

}
}"#;

let mut attributes = Object::new();
attributes.insert(
"string.attribute".into(),
Annotated::new(AttributeValue {
string_value: Annotated::new(Value::String("some string".into())),
..Default::default()
}),
);
attributes.insert(
"boolean.attribute".into(),
Annotated::new(AttributeValue {
bool_value: Annotated::new(Value::Bool(true)),
..Default::default()
}),
);
attributes.insert(
"int.attribute".into(),
Annotated::new(AttributeValue {
int_value: Annotated::new(Value::I64(10)),
..Default::default()
}),
);
attributes.insert(
"double.attribute".into(),
Annotated::new(AttributeValue {
double_value: Annotated::new(Value::F64(637.704)),
..Default::default()
}),
);

let log = Annotated::new(OurLog {
timestamp_nanos: Annotated::new(1544712660300000000),
observed_timestamp_nanos: Annotated::new(1544712660300000000),
severity_number: Annotated::new(10),
severity_text: Annotated::new("Information".to_string()),
trace_id: Annotated::new(TraceId("5b8efff798038103d269b633813fc60c".into())),
span_id: Annotated::new(SpanId("eee19b7ec3c1b174".into())),
body: Annotated::new("Example log record".to_string()),
attributes: Annotated::new(attributes),
..Default::default()
});

let expected: serde_json::Value = serde_json::from_str(json).unwrap();
let actual: serde_json::Value =
serde_json::from_str(&log.to_json_pretty().unwrap()).unwrap();
assert_eq!(expected, actual);

let log_from_string = Annotated::<OurLog>::from_json(json).unwrap();
assert_eq!(log, log_from_string);
}
}