Update environment variables
dnnanuti committed Mar 13, 2024
1 parent f6d182b commit 70f90ec
Showing 4 changed files with 71 additions and 72 deletions.
8 changes: 4 additions & 4 deletions CHANGELOG.md
@@ -3,9 +3,9 @@
### New features

### Breaking changes
* Separate completely Rust logs and Python logs. Rust logs are configured through RUST_LOG,
S3_CONNECTOR_ENABLE_CRT_LOGS and S3_CONNECTOR_LOGS_DIR_PATH environment variables.

* Separate completely Rust logs and Python logs. Logs from Rust components, used for debugging purposes
are configured through the following environment variables: S3_TORCH_CONNECTOR_DEBUG_LOGS,
S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS, S3_TORCH_CONNECTOR_LOGS_DIR_PATH.

## v1.2.0 (March 13, 2024)

@@ -82,7 +82,7 @@

### New features
* S3IterableDataset and S3MapDataset, which allow building either an iterable-style or map-style dataset, using your S3
stored data, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is in.
stored data, by specifying an S3 URI (a bucket and optional prefix) and the region the bucket is in.
* Support for multiprocess data loading for the above datasets.
* S3Checkpoint, an interface for saving and loading model checkpoints directly to and from an S3 bucket.

51 changes: 23 additions & 28 deletions doc/DEVELOPMENT.md
@@ -92,34 +92,29 @@ Fill in the path of the Python executable in your virtual environment (`venv/bin
as the program argument.
Then put a breakpoint in the Rust/C code and try running it.

#### Enabling Logging
The Python logger handles the logging messages from the Python implementation.
The logs of our Rust components are handled through a
[tracing_subscriber](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/).

You have the following configuration options to filter Rust log messages:
- Default - If you do not have `RUST_LOG` environment variable set up at all, Rust logging will cover `ERROR` level
of Mountpoint S3 Client Rust component.

- Mountpoint S3 Client logs - Configure [RUST_LOG](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) variable. For example, setting
`RUST_LOG=debug` will enable logging of DEBUG messages from Mountpoint S3 Client.

- CRT logs - Configure `S3_CONNECTOR_ENABLE_CRT_LOGS` variable, similarly to
[RUST_LOG](https://docs.rs/env_logger/latest/env_logger/#enabling-logging). This will enable the logs from CRT
component and override the `RUST_LOG` setup.

**Please note that these logs are very noisy, do NOT enable them unless otherwise instructed.**

Additionally, you can set up `S3_CONNECTOR_LOGS_DIR_PATH` with the path to a local folder where you have WRITE
permissions. When this is set up, the Rust logs will be written at this location,
with `s3torchconnectorclient.log` prefix, rolling on an hourly basis.

Example:
- Enable Mountpoint S3 Client logs with DEBUG level to be written at `/tmp/s3torchconnector-logs`
#### Enabling Debug Logging
The Python logger handles the logging messages from the Python side of our implementation.
For debug purposes, you can also enable the logs of our Rust components.
These are handled by [tracing_subscriber](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/) and can be
configured through the following environment variables:
- S3_TORCH_CONNECTOR_DEBUG_LOGS - Configured similarly to
[RUST_LOG](https://docs.rs/env_logger/latest/env_logger/#enabling-logging) variable and used as an
[EnvFilter](https://docs.rs/tracing-subscriber/latest/tracing_subscriber/filter/struct.EnvFilter.html) for
filtering the logs in our Rust components.
- S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS - Enables finer-granularity logs from the
[AWS Common Runtime (CRT)](https://docs.aws.amazon.com/sdkref/latest/guide/common-runtime.html) when set to 1.
**Please note that these logs are very noisy, do NOT enable them unless otherwise instructed.**
- S3_TORCH_CONNECTOR_LOGS_DIR_PATH - Set up with the path to a local folder where you have WRITE permissions.
When this is configured, the logs from the Rust components will be appended to a file at this location, named like
`s3torchconnectorclient.log.YYYY-MM-DD-HH` and rolling on an hourly basis. The log messages of the latest run are
appended to the end of the most recent log file.

**Example**
- Enable TRACE level logs to be written at `/tmp/s3torchconnector-logs`:
```sh
export RUST_LOG=debug
export S3_CONNECTOR_LOGS_DIR_PATH="/tmp/s3torchconnector-logs"
export S3_TORCH_CONNECTOR_DEBUG_LOGS=trace
export S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS=1
export S3_TORCH_CONNECTOR_LOGS_DIR_PATH="/tmp/s3torchconnector-logs"
python ./my_test.py
```
After running this, you will find the logs under `/tmp/s3torchconnector-logs`. The logging messages will be appended
to the most recent file of format `s3torchconnectorclient.log.YYYY-MM-DD-HH`.
After running this, you will find the logs under `/tmp/s3torchconnector-logs`.
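
The same configuration can also be applied from Python instead of the shell. The sketch below is a minimal illustration, assuming (as the integration test in this commit does) that the variables are set before `s3torchconnector` is imported. The helper names `configure_debug_logs` and `latest_log_file` are hypothetical and not part of the library.

```python
import os
from pathlib import Path
from typing import Optional

# Prefix of the hourly log files, as named in the documentation above.
LOG_FILE_PREFIX = "s3torchconnectorclient.log"


def configure_debug_logs(logs_dir: str, level: str = "debug", crt: bool = False) -> None:
    """Hypothetical helper: set the three variables before importing s3torchconnector."""
    os.environ["S3_TORCH_CONNECTOR_DEBUG_LOGS"] = level
    os.environ["S3_TORCH_CONNECTOR_LOGS_DIR_PATH"] = logs_dir
    if crt:
        # CRT logs are enabled only when the value is exactly "1".
        os.environ["S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS"] = "1"
    else:
        os.environ.pop("S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS", None)


def latest_log_file(logs_dir: str) -> Optional[Path]:
    """Return the most recent hourly log file, or None if there is none.

    File names end in YYYY-MM-DD-HH, so lexicographic order is chronological.
    """
    candidates = sorted(Path(logs_dir).glob(LOG_FILE_PREFIX + ".*"))
    return candidates[-1] if candidates else None
```

Calling `configure_debug_logs("/tmp/s3torchconnector-logs", level="trace", crt=True)` before the import would correspond to the shell example above, and `latest_log_file("/tmp/s3torchconnector-logs")` would then point at the file the latest run appended to.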
48 changes: 25 additions & 23 deletions s3torchconnectorclient/python/tst/integration/test_logging.py
@@ -16,16 +16,16 @@
s3_uri = sys.argv[1]
region = sys.argv[2]
crt_rust_log = sys.argv[3]
default_rust_log = sys.argv[4]
enable_crt_logs = sys.argv[3]
debug_logs_config = sys.argv[4]
logs_dir_path = sys.argv[5]
if crt_rust_log != "":
os.environ["S3_CONNECTOR_ENABLE_CRT_LOGS"] = crt_rust_log
if default_rust_log != "":
os.environ["RUST_LOG"] = default_rust_log
if enable_crt_logs != "":
os.environ["S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS"] = enable_crt_logs
if debug_logs_config != "":
os.environ["S3_TORCH_CONNECTOR_DEBUG_LOGS"] = debug_logs_config
if logs_dir_path != "":
os.environ["S3_CONNECTOR_LOGS_DIR_PATH"] = logs_dir_path
os.environ["S3_TORCH_CONNECTOR_LOGS_DIR_PATH"] = logs_dir_path
from s3torchconnector import S3MapDataset
@@ -44,7 +44,7 @@


@pytest.mark.parametrize(
"crt_rust_log, should_contain, should_not_contain",
"debug_logs_config, should_contain, should_not_contain",
[
(
"info",
@@ -78,12 +78,14 @@
],
)
def test_crt_logging(
crt_rust_log: str,
debug_logs_config: str,
should_contain: List[str],
should_not_contain: List[str],
image_directory: BucketPrefixFixture,
):
out, err = _start_subprocess(image_directory, crt_rust_log=crt_rust_log)
out, err = _start_subprocess(
image_directory, enable_crt_logs="1", debug_logs_config=debug_logs_config
)
assert err == ""
assert all(s in out for s in should_contain)
assert all(s not in out for s in should_not_contain)
@@ -115,27 +117,27 @@ def test_default_logging_env_filters_unset(image_directory: BucketPrefixFixture)


@pytest.mark.parametrize(
"crt_rust_log, default_rust_log, out_should_contain, out_should_not_contain, file_should_contain, file_should_not_contain",
"enable_crt_logs, debug_logs_config, out_should_contain, out_should_not_contain, file_should_contain, file_should_not_contain",
[
(
"1",
"INFO",
"",
["INFO s3torchconnector.s3map_dataset"],
["awscrt"],
["awscrt", "INFO"],
["INFO s3torchconnector.s3map_dataset", "DEBUG", "TRACE"],
),
(
"",
"0",
"DEBUG",
["INFO s3torchconnector.s3map_dataset"],
["awscrt"],
["DEBUG", "mountpoint_s3_client"],
["awscrt", "INFO s3torchconnector.s3map_dataset", "TRACE"],
),
(
"1",
"DEBUG",
"TRACE",
["INFO s3torchconnector.s3map_dataset"],
["awscrt"],
["DEBUG", "mountpoint_s3_client", "awscrt"],
@@ -144,8 +146,8 @@ def test_default_logging_env_filters_unset(image_directory: BucketPrefixFixture)
],
)
def test_logging_to_file(
crt_rust_log: str,
default_rust_log: str,
enable_crt_logs: str,
debug_logs_config: str,
out_should_contain: List[str],
out_should_not_contain: List[str],
file_should_contain: List[str],
@@ -156,8 +158,8 @@ def test_logging_to_file(
print("Created temporary directory", log_dir)
out, err = _start_subprocess(
image_directory,
crt_rust_log=crt_rust_log,
default_rust_log=default_rust_log,
enable_crt_logs=enable_crt_logs,
debug_logs_config=debug_logs_config,
logs_directory=log_dir,
)
# Standard output contains Python output
@@ -177,9 +179,9 @@ def test_logging_to_file(
def _start_subprocess(
image_directory: BucketPrefixFixture,
*,
crt_rust_log: str = "",
default_rust_log: str = "",
logs_directory: str = ""
enable_crt_logs: str = "",
debug_logs_config: str = "",
logs_directory: str = "",
):
process = subprocess.Popen(
[
@@ -188,8 +190,8 @@ def _start_subprocess(
PYTHON_TEST_CODE,
image_directory.s3_uri,
image_directory.region,
crt_rust_log,
default_rust_log,
enable_crt_logs,
debug_logs_config,
logs_directory,
],
stdout=subprocess.PIPE,
36 changes: 19 additions & 17 deletions s3torchconnectorclient/rust/src/logger_setup.rs
@@ -9,21 +9,23 @@ use tracing_subscriber::{filter::EnvFilter};
use tracing_subscriber::util::{SubscriberInitExt};
use crate::exception::python_exception;

pub const ENABLE_CRT_LOGS_ENV_VAR: &str = "S3_CONNECTOR_ENABLE_CRT_LOGS";
pub const LOGS_DIR_PATH_ENV_VAR: &str = "S3_CONNECTOR_LOGS_DIR_PATH";
pub const S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR: &str = "S3_TORCH_CONNECTOR_DEBUG_LOGS";
pub const S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR: &str = "S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS";
pub const S3_TORCH_CONNECTOR_LOGS_DIR_PATH_ENV_VAR: &str = "S3_TORCH_CONNECTOR_LOGS_DIR_PATH";
pub const LOG_FILE_PREFIX: &str = "s3torchconnectorclient.log";

pub fn setup_logging() -> PyResult<()> {
let enable_crt_logs = env::var(ENABLE_CRT_LOGS_ENV_VAR);
let mut filter= EnvFilter::from_default_env();
let enable_crt_logs = env::var(S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR)
.unwrap_or_default() == "1";
let filter = EnvFilter::from_env(S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR);

if enable_crt_logs.is_ok() {
if enable_crt_logs {
RustLogAdapter::try_init().map_err(python_exception)?;
filter = EnvFilter::from_env(ENABLE_CRT_LOGS_ENV_VAR);
}

let crt_logs_path = env::var(LOGS_DIR_PATH_ENV_VAR).ok();
let debug_logs_path = env::var(S3_TORCH_CONNECTOR_LOGS_DIR_PATH_ENV_VAR).ok();

match crt_logs_path {
match debug_logs_path {
Some(logs_path) => {
enable_file_logging(filter, logs_path)?;
}
@@ -36,7 +38,7 @@ pub fn setup_logging() -> PyResult<()> {
}

fn enable_file_logging(filter: EnvFilter, logs_path: String) -> PyResult<()> {
let logfile = tracing_appender::rolling::hourly(logs_path, "s3torchconnectorclient.log");
let logfile = tracing_appender::rolling::hourly(logs_path, LOG_FILE_PREFIX);
let subscriber_builder = tracing_subscriber::fmt()
.with_writer(logfile)
.with_env_filter(filter)
@@ -60,11 +62,11 @@ mod tests {
use rusty_fork::rusty_fork_test;
use std::{env};
use pyo3::PyResult;
use crate::logger_setup::{ENABLE_CRT_LOGS_ENV_VAR, setup_logging};
use crate::logger_setup::{S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR, S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR, setup_logging};

fn check_valid_crt_log_level(log_level: &str) {
pyo3::prepare_freethreaded_python();
env::set_var(ENABLE_CRT_LOGS_ENV_VAR, log_level);
env::set_var(S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR, log_level);
let result: PyResult<()> = setup_logging();
assert!(result.is_ok());
}
@@ -73,15 +75,15 @@ mod tests {
#[test]
fn test_crt_environment_variable_unset() {
pyo3::prepare_freethreaded_python();
env::remove_var(ENABLE_CRT_LOGS_ENV_VAR);
env::remove_var(S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR);
let result: PyResult<()> = setup_logging();
assert!(result.is_ok());
}

#[test]
fn test_rust_log_environment_variable_unset() {
fn test_debug_log_environment_variable_unset() {
pyo3::prepare_freethreaded_python();
env::remove_var("RUST_LOG");
env::remove_var(S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR);
let result: PyResult<()> = setup_logging();
assert!(result.is_ok());
}
@@ -119,16 +121,16 @@ mod tests {
#[test]
fn test_default_logging_level_debug() {
pyo3::prepare_freethreaded_python();
env::set_var("RUST_LOG", "debug");
env::set_var(S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR, "debug");
let result: PyResult<()> = setup_logging();
assert!(result.is_ok());
}

#[test]
fn test_set_both_logging_levels() {
pyo3::prepare_freethreaded_python();
env::set_var("RUST_LOG", "debug");
env::set_var(ENABLE_CRT_LOGS_ENV_VAR, "info");
env::set_var(S3_TORCH_CONNECTOR_DEBUG_LOGS_ENV_VAR, "debug");
env::set_var(S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS_ENV_VAR, "1");
let result: PyResult<()> = setup_logging();
assert!(result.is_ok());
}
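
One behavioural consequence of the `setup_logging` change is worth spelling out: CRT logging was previously enabled whenever the variable was set at all (`enable_crt_logs.is_ok()`), whereas it is now enabled only when the value is exactly `1`. The following is a minimal Python model of the new check, for illustration only. The function name `crt_logs_enabled` is hypothetical; the real check lives in the Rust `setup_logging` function.

```python
import os

ENABLE_CRT_LOGS_ENV_VAR = "S3_TORCH_CONNECTOR_ENABLE_CRT_LOGS"


def crt_logs_enabled(environ=os.environ) -> bool:
    """Model of `env::var(..).unwrap_or_default() == "1"`.

    An unset variable, or any value other than "1", leaves CRT logs disabled.
    """
    return environ.get(ENABLE_CRT_LOGS_ENV_VAR, "") == "1"
```

Under this model, a value such as `info`, which used to turn CRT logs on, now leaves them off; the log level itself comes from `S3_TORCH_CONNECTOR_DEBUG_LOGS`.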