fix(calculate): update dependency ray to v2.38.0 #19696

renovate · 2024-10-16T14:51:11Z

This PR contains the following updates:

| Package | Change |
|---|---|
| ray | `2.35.0` -> `2.38.0` |
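
The jump from `2.35.0` to `2.38.0` passes over intermediate releases, which is why the release notes below cover four versions. A minimal Python sketch of that selection follows. The `parse` helper and the `listed` variable are hypothetical names for illustration, and the dotted-integer parsing assumes plain `X.Y.Z` version strings rather than full PEP 440 handling.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, ...]:
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


current, target = "2.35.0", "2.38.0"

# Versions whose release notes appear in this PR body.
listed = ["2.38.0", "2.37.0", "2.36.1", "2.36.0"]

# Keep releases strictly newer than the current pin, up to and including the target.
covered = sorted(
    (v for v in listed if parse(current) < parse(v) <= parse(target)),
    key=parse,
)
print(covered)
```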

Warning

Some dependencies could not be looked up. Check the Dependency Dashboard for more information.

Release Notes

ray-project/ray (ray)

`v2.38.0`

Compare Source

Ray Libraries

Ray Data

🎉 New Features:

Add Dataset.rename_columns (#47906)
Basic structured logging (#47210)

💫 Enhancements:

Add partitioning parameter to read_parquet (#47553)
Add SERVICE_UNAVAILABLE to list of retried transient errors (#47673)
Re-phrase the streaming executor current usage string (#47515)
Remove ray.kill in ActorPoolMapOperator (#47752)
Simplify and consolidate progress bar outputs (#47692)
Refactor OpRuntimeMetrics to support properties (#47800)
Refactor plan_write_op and Datasinks (#47942)
Link PhysicalOperator to its LogicalOperator (#47986)
Allow specifying both num_cpus and num_gpus for map APIs (#47995)
Allow specifying insertion index when registering custom plan optimization Rules (#48039)
Add a better framework for substituting logging handlers (#48056)

🔨 Fixes:

Fix bug where Ray Data incorrectly emits progress bar warning (#47680)
Yield remaining results from async map_batches (#47696)
Fix event loop mismatch with async map (#47907)
Make sure num_gpus provided to Ray Data is appropriately passed to the ray.remote call (#47768)
Fix unequal partitions when grouping by multiple keys (#47924)
Fix reading multiple parquet files with ragged ndarrays (#47961)
Remove unneeded test case (#48031)
Add better JSON checking in test logging (#48036)
Fix bug with inserting custom optimization rule at index 0 (#48051)
Fix logging output from write_xxx APIs (#48096)

📖 Documentation:

Add docs section for Ray Data progress bars (#47804)
Add reference to parquet predicate pushdown (#47881)
Add tip about how to understand map_batches format (#47394)

Ray Train

🏗 Architecture refactoring:

Remove deprecated mosaic and sklearn trainer code (#47901)

Ray Tune

🔨 Fixes:

Fix WandbLoggerCallback to reuse actors upon restore (#47985)

Ray Serve

🔨 Fixes:

Stop scheduling task early when requests have been canceled (#47847)

RLlib

🎉 New Features:

Enable cloud checkpointing. (#47682)

💫 Enhancements:

PPO on new API stack now shuffles batches properly before each epoch. (#47458)
Other enhancements: #47705, #47501, #47731, #47451, #47830, #47970, #47157

🔨 Fixes:

Fix spot node preemption problem (RLlib now runs stably with EnvRunner workers on spot nodes) (#47940)
Fix action masking example. (#47817)
Various other fixes: #47973, #46721, #47914, #47880, #47304, #47686

🏗 Architecture refactoring:

Switch on new API stack by default for SAC and DQN. (#47217)
Remove Tf support on new API stack for PPO/IMPALA/APPO (only DreamerV3 on new API stack remains with tf now). (#47892)
Discontinue support for "hybrid" API stack (using RLModule + Learner, but still on RolloutWorker and Policy) (#46085)
RLModule (new API stack) refinements: #47884, #47885, #47889, #47908, #47915, #47965, #47775

📖 Documentation:

Add new API stack migration guide. (#47779)
New API stack example script: BC pre training, then PPO finetuning using same RLModule class. (#47838)
New API stack: Autoregressive actions example. (#47829)
Remove old API stack connector docs entirely. (#47778)

Ray Core and Ray Clusters

Ray Core

🎉 New Features:

CompiledGraphs: support multi readers in multi node when DAG is created from an actor (#47601)

💫 Enhancements:

Add a flag to raise exception for out of band serialization of ObjectRef (#47544)
Store each GCS table in its own Redis Hash (#46861)
Decouple create worker vs pop worker request. (#47694)
Add metrics for GCS jobs (#47793)

🔨 Fixes:

Fix broken dashboard cluster page when there are dead nodes (#47701)
Fix the ray_tasks{State="PENDING_ARGS_FETCH"} metric counting (#47770)
Separate the attempt_number from the task_status in memory summary and object list (#47818)
Fix object reconstruction hang on arguments pending creation (#47645)
Fix check failure: sync_reactors_.find(reactor->GetRemoteNodeID()) == sync_reactors_.end() (#47861)
Fix check failure RAY_CHECK(it != current_tasks_.end()); (#47659)

📖 Documentation:

KubeRay docs: Add docs for YuniKorn Gang scheduling (#47850)

Dashboard

💫 Enhancements:

Performance improvements for large scale clusters (#47617)

🔨 Fixes:

Placement group and required resources not showing correctly in dashboard (#47754)

Thanks

Many thanks to all those who contributed to this release!
@GeneDer, @rkooo567, @dayshah, @saihaj, @nikitavemuri, @bill-oconnor-anyscale, @WeichenXu123, @can-anyscale, @jjyao, @edoakes, @kekulai-fredchang, @bveeramani, @alexeykudinkin, @raulchen, @khluu, @sven1977, @ruisearch42, @dentiny, @MengjinYan, @Mark2000, @simonsays1980, @rynewang, @PatricYan, @zcin, @sofianhnaide, @matthewdeng, @dlwh, @scottjlee, @MortalHappiness, @kevin85421, @win5923, @aslonnie, @prithvi081099, @richardsliu, @milesvant, @omatthew98, @Superskyyy, @pcmoritz

`v2.37.0`

Compare Source

Ray Libraries

Ray Data

💫 Enhancements:

Simplify custom metadata provider API (#47575)
Change counts of metrics to rates of metrics (#47236)
Throw exception for non-streaming HF datasets with "override_num_blocks" argument (#47559)
Refactor custom optimizer rules (#47605)

🔨 Fixes:

Remove ineffective retry code in plan_read_op (#47456)
Fix incorrect pending task size if outputs are empty (#47604)

Ray Train

💫 Enhancements:

Update run status and add stack trace to TrainRunInfo (#46875)

Ray Serve

💫 Enhancements:

Allow control of some serve configuration via env vars (#47533)
[serve] Faster detection of dead replicas (#47237)

🔨 Fixes:

[Serve] fix component id logging field (#47609)

RLlib

💫 Enhancements:

New API stack:
- Add restart-failed-env option to EnvRunners. (#47608)
- Offline RL: Store episodes in state form. (#47294)
- Offline RL: Replace GAE in MARWILOfflinePreLearner with GeneralAdvantageEstimation connector in learner pipeline. (#47532)
- Off-policy algos: Add episode sampling to EpisodeReplayBuffer. (#47500)
- RLModule APIs: Add SelfSupervisedLossAPI for RLModules that bring their own loss and InferenceOnlyAPI. (#47581, #47572)

Ray Core

💫 Enhancements:

[aDAG] Allow custom NCCL group for aDAG (#47141)
[aDAG] support buffered input (#47272)
[aDAG] Support multi node multi reader (#47480)
[Core] Make is_gpu, is_actor, root_detached_id fields late bind to workers. (#47212)
[Core] Reconstruct actor to run lineage reconstruction triggered actor task (#47396)
[Core] Optimize GetAllJobInfo API for performance (#47530)

🔨 Fixes:

[aDAG] Fix ranks ordering for custom NCCL group (#47594)

Ray Clusters

📖 Documentation:

[KubeRay] add a guide for deploying vLLM with RayService (#47038)

Thanks

Many thanks to all those who contributed to this release!
@ruisearch42, @andrewsykim, @timkpaine, @rkooo567, @WeichenXu123, @GeneDer, @sword865, @simonsays1980, @angelinalg, @sven1977, @jjyao, @woshiyyya, @aslonnie, @zcin, @omatthew98, @rueian, @khluu, @justinvyu, @bveeramani, @nikitavemuri, @chris-ray-zhang, @liuxsh9, @xingyu-long, @peytondmurray, @rynewang

`v2.36.1`

Compare Source

Ray Core

🔨 Fixes:

Fix broken dashboard cluster page when there are dead nodes (#47701)
Fix broken dashboard worker page (#47714)

`v2.36.0`

Compare Source

Ray Libraries

Ray Data

💫 Enhancements:

Remove limit on number of tasks launched per scheduling step (#47393)
Allow user-defined Exception to be caught. (#47339)

🔨 Fixes:

Display pending actors separately in the progress bar and not count them towards running resources (#46384)
Fix bug where arrow_parquet_args aren't used (#47161)
Skip empty JSON files in read_json() (#47378)
Remove remote call for initializing Datasource in read_datasource() (#47467)
Remove dead from_*_operator modules (#47457)
Release test fixes:
- Add AWS ACCESS_DENIED as retryable exception for multi-node Data+Train benchmarks (#47232)
- Get AWS credentials with boto (#47352)
- Use worker node instead of head node for read_images_comparison_microbenchmark_single_node release test (#47228)

📖 Documentation:

Add docstring to explain Dataset.deserialize_lineage (#47203)
Add a comment explaining the bundling behavior for map_batches with default batch_size (#47433)

Ray Train

💫 Enhancements:

Decouple device-related modules and add Huawei NPU support to Ray Train (#44086)

🔨 Fixes:

Update TORCH_NCCL_ASYNC_ERROR_HANDLING env var (#47292)

📖 Documentation:

Add missing Train public API reference (#47134)

Ray Tune

📖 Documentation:

Add missing Tune public API references (#47138)

Ray Serve

💫 Enhancements:

Mark proxy as unready when its routers are aware of zero replicas (#47002)
Setup default serve logger (#47229)

🔨 Fixes:

Allow get_serve_logs_dir to run outside of Ray's context (#47224)
Use serve logger name for logs in serve (#47205)

📖 Documentation:

[HPU] [Serve] [experimental] Add vllm HPU support in vllm example (#45893)

🏗 Architecture refactoring:

Remove support for nested DeploymentResponses (#47209)

RLlib

🎉 New Features:

New API stack: Add CQL algorithm. (#47000, #47402)
New API stack: Enable GPU and multi-GPU support for DQN/SAC/CQL. (#47179)

💫 Enhancements:

New API stack: Offline RL enhancements: #47195, #47359
Enhance new API stack stability: #46324, #47196, #47245, #47279
Fix large batch size for synchronous algos (e.g. PPO) after EnvRunner failures. (#47356)
Add torch.compile config options to old API stack. (#47340)
Add kwargs to torch.nn.parallel.DistributedDataParallel (#47276)
Enhanced CI stability: #47197, #47249

📖 Documentation:

New API stack example scripts:
- Float16 training example script. (#47362)
- Mixed precision training example script (#47116)
- ModelV2 -> RLModule wrapper for migrating to new API stack. (#47425)
Remove "new API stack experimental" hint from docs. (#47301)

🏗 Architecture refactoring:

Remove 2nd Learner ConnectorV2 pass from PPO (#47401)
Add separate learning rates for policy and alpha to SAC. (#47078)

🔨 Fixes:

Various bug fixes: #47401, #47194, #47259, #47271, #47277, #47382

Ray Core

💫 Enhancements:

[ADAG] Raise proper error message for nccl within the same actor (#47250)
[ADAG] Support multi-read of the same shm channel (#47311)
Log why core worker is not idle during HandleExit (#47300)
Add PREPARED state for placement groups in GCS for better fault tolerance. (#46858)

🔨 Fixes:

Fix ray_unintentional_worker_failures_total to only count unintentional worker failures (#47368)
Fix runtime env race condition when uploading the same package concurrently (#47482)

Dashboard

🔨 Fixes:

Performance optimizations for dashboard backend logic (#47392) (#47367) (#47160) (#47213)
Refactor to simplify dashboard backend logic (#47324)

Docs

💫 Enhancements:

Add sphinx-autobuild and documentation for make local (#47275): Speeds up local docs builds with make local.
Add Algolia search to docs (#46477)
Update PyTorch Mnist Training doc for KubeRay 1.2.0 (#47321)
Life-cycle of documentation policy of Ray APIs

Thanks

Many thanks to all those who contributed to this release!
@GeneDer, @Bye-legumes, @nikitavemuri, @kevin85421, @MortalHappiness, @LeoLiao123, @saihaj, @rmcsqrd, @bveeramani, @zcin, @matthewdeng, @raulchen, @mattip, @jjyao, @ruisearch42, @scottjlee, @can-anyscale, @khluu, @aslonnie, @rynewang, @edoakes, @zhanluxianshen, @venkatram-dev, @c21, @allenyin55, @alexeykudinkin, @snehakottapalli, @BitPhinix, @hongchaodeng, @dengwxn, @liuxsh9, @simonsays1980, @peytondmurray, @KepingYan, @bryant1410, @woshiyyya, @sven1977

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.

If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

renovate · 2024-10-16T14:51:12Z

⚠️ Artifact update problem

Renovate failed to update an artifact related to this branch. You probably do not want to merge this PR as-is.

♻ Renovate will retry this branch, including artifacts, only when one of the following happens:

any of the package files in this branch needs updating, or
the branch becomes conflicted, or
you click the rebase/retry checkbox if found above, or
you rename this PR's title to start with "rebase!" to trigger it manually

The artifact failure details are included below:

File name: cloud-computing/hm-ray/applications/calculate/poetry.lock

Updating dependencies
Resolving dependencies...

Creating virtualenv non-package-mode in /tmp/renovate/repos/github/hongbo-miao/hongbomiao.com/cloud-computing/hm-ray/applications/calculate/.venv

The current project's supported Python range (==3.9.*) is not compatible with some of the required packages Python requirement:
  - ray requires Python >=3.9, so it will not be satisfied for Python >=3.9.dev0,<3.9

Because non-package-mode depends on ray[default] (2.38.0) which requires Python >=3.9, version solving failed.

  • Check your dependencies Python requirement: The Python requirement can be specified via the `python` or `markers` properties
    
    For ray, a possible solution would be to set the `python` property to ">=3.9,<3.10.dev0"

    https://python-poetry.org/docs/dependency-specification/#python-restricted-dependencies,
    https://python-poetry.org/docs/dependency-specification/#using-environment-markers
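
The resolver error above comes down to ordering: a `.dev0` pre-release sorts before its final release, so a range that starts at `3.9.dev0` contains versions that `>=3.9` rejects. The following is a simplified Python model of that comparison, not Poetry's actual resolver. The tuple encoding, the helper names, and the reading of `==3.9.*` as `>=3.9.dev0,<3.10.dev0` are assumptions inferred from the log text.

```python
# Simplified model: a version is (major, minor, stage), where a ".dev0"
# pre-release (stage 0) sorts before the final release (stage 1).
DEV, FINAL = 0, 1


def v(major, minor, stage=FINAL):
    """Build a comparable version tuple (hypothetical helper)."""
    return (major, minor, stage)


def in_project_range(version):
    # Assumed expansion of the project's "==3.9.*": >=3.9.dev0,<3.10.dev0
    return v(3, 9, DEV) <= version < v(3, 10, DEV)


def in_ray_range(version):
    # ray's requirement from the log: Python >=3.9
    return version >= v(3, 9)


candidates = [v(3, 9, DEV), v(3, 9), v(3, 10, DEV)]

# Versions the project claims to support but ray's requirement excludes.
uncovered = [c for c in candidates if in_project_range(c) and not in_ray_range(c)]
print(uncovered)
```

Under this model the only uncovered candidate is the `3.9.dev0` pre-release, which matches the range the log reports as unsatisfied (`>=3.9.dev0,<3.9`).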

sonarqubecloud · 2024-10-26T02:19:26Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

github-actions · 2024-10-28T13:16:38Z

🎉 This PR is included in version 1.122.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

renovate bot temporarily deployed to test October 16, 2024 14:51 Inactive

renovate bot had a problem deploying to test October 16, 2024 15:26 Error

renovate bot force-pushed the renovate/calculate-ray-2.x branch from 3063fd2 to 8aff05b October 16, 2024 15:38

renovate bot had a problem deploying to test October 16, 2024 15:38 Error

hongbo-miao force-pushed the renovate/calculate-ray-2.x branch from 8830fa1 to d759fc3 October 26, 2024 02:17

hongbo-miao temporarily deployed to test October 26, 2024 02:17 — with GitHub Actions Inactive

hongbo-miao temporarily deployed to test October 26, 2024 02:18 — with GitHub Actions Inactive

mergify bot approved these changes Oct 26, 2024

mergify bot merged commit 60a572f into main Oct 26, 2024
135 checks passed

mergify bot deleted the renovate/calculate-ray-2.x branch October 26, 2024 02:20

github-actions bot added the released label Oct 28, 2024
