Daemon errors with (HTTP code 404) -- no such container: sandbox
#261
Comments
[cywang117] This issue has attached support thread https://jel.ly.fish/72633746-3415-449a-9617-e123cba1e954 |
[cywang117] This issue has attached support thread https://jel.ly.fish/e7428359-c335-4d00-81db-dfb4293d1423 |
The fact that stopping the Supervisor, removing the containers, and starting the Supervisor fixes the issue seems to indicate that this is a Supervisor issue and not a balenaEngine issue. I'll move this to the Supervisor repo |
So it seems that just restarting the Supervisor without removing containers does not fix this issue. However, restarting balenaEngine fixes the issue. Now I'm unclear whether this is Supervisor related or balenaEngine related. I'm leaning towards this being related to balenaEngine having bad state for one of the containers on the device, as a Supervisor restart didn't do anything. |
[cywang117] This issue has attached support thread https://jel.ly.fish/661c8c96-8357-4bfc-9380-308a65fff910 |
[danthegoodman1] This issue has attached support thread https://jel.ly.fish/a4f6be4b-50dc-454d-9c5c-dbcf168119db |
@lmbarros @robertgzr Drawing your attention to some edits I made to this GitHub issue:
Are there any other questions that you think would be useful in investigating the causes behind this issue? Could this kind of problem be something that is unavoidable based on current implementation limitations in dependencies (Moby)? |
[pipex] This issue has attached support thread https://jel.ly.fish/dc8d2638-ebb4-4ba8-8ae6-edae48602850 |
[pipex] This issue has attached support thread https://jel.ly.fish/e82fe388-3955-4252-97c4-6c837151cce2 |
[pipex] This issue has attached support thread https://jel.ly.fish/b7fa70df-ad99-4deb-8f6a-2b78d2f47a44 |
Some extra information for this ticket: this has been reported to happen more often with containers that don't get updated as frequently as others. So a container that has been renamed a few times while others have been recreated may sometimes get into this state. For instance, for a particular device, the failing container shows a network prefix of 16,
while other veth networks have a much larger prefix, confirming that this is an old network.
Could this issue be an unintended side effect of some cleanup process? |
[gantonayde] This issue has attached support thread https://jel.ly.fish/1b57a2f7-e2b2-4658-94ef-0a35bef04f4b |
[pipex] This issue has attached support thread https://jel.ly.fish/bf30fa84-cc92-4cf8-aefd-4c2f14c4a944 |
[nitish] This issue has attached support thread https://jel.ly.fish/9f4bc524-e6d5-4480-98a5-4d2cefba84f3 |
Did this error appear after a release update? Yes. It happened on a new device with just the second release I pushed to it, running a minimal server application (200 MB image, two-stage build process). Error is below:
Attaching diagnostics File: a01a83846e174aa51dc2b33fbf0a17e7_diagnostics_2022.06.02_20.56.19+0000.txt Adding outputs of commands
FD: https://www.flowdock.com/app/rulemotion/r-supervisor/threads/FQqETXXQaGFg1oLyWz7ccNbPgAx |
[lmbarros] This has attached https://jel.ly.fish/88b86997-9411-40b9-ae2f-8f3505febb93 |
[pipex] This has attached https://jel.ly.fish/c09369f0-c870-4f93-9133-0ec8b995fda9 |
The `updateMetadata` step renames the container to match the target release when the service doesn't change between releases. We have seen this step fail because of an engine bug that seems to relate to the engine keeping stale references after container restarts. The only way around this issue is to remove the old container and create it again. This implements that workaround during the updateMetadata step to deal with that issue. Change-type: minor Relates-to: balena-os/balena-engine#261
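The fallback described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the Supervisor's actual code: the `Engine` interface, the `updateMetadata` signature, and the assumption that the engine error carries a `statusCode` of 404 are all hypothetical and chosen only to show the rename-then-recreate flow.

```typescript
// Hypothetical, minimal view of the engine operations the step needs.
// This is NOT the Supervisor's or balenaEngine's real API.
interface Engine {
  rename(containerId: string, newName: string): Promise<void>;
  remove(containerId: string): Promise<void>; // assumed to force-remove
  create(name: string, image: string): Promise<string>; // resolves to new id
  start(containerId: string): Promise<void>;
}

// Assumption: engine errors expose the HTTP status as `statusCode`.
function isNotFound(err: unknown): boolean {
  return (
    typeof err === 'object' &&
    err !== null &&
    (err as { statusCode?: number }).statusCode === 404
  );
}

// Try the cheap path (rename) first; if the engine reports a missing
// container, fall back to removing and recreating the service container.
async function updateMetadata(
  engine: Engine,
  containerId: string,
  targetName: string,
  image: string,
): Promise<string> {
  try {
    await engine.rename(containerId, targetName);
    return containerId;
  } catch (err) {
    if (!isNotFound(err)) {
      throw err; // unrelated failures are not masked by the workaround
    }
    await engine.remove(containerId);
    const newId = await engine.create(targetName, image);
    await engine.start(newId);
    return newId;
  }
}
```

Only the 404 case triggers the recreate path, so other rename failures still surface as errors. The trade-off is that the service restarts when the fallback runs, which is still less disruptive than restarting balenaEngine for every container on the device.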
NOTE: For users and support agents arriving here in the future: since it's not clear how we can reproduce this issue, please find out more information about various conditions on the device. Some good starting questions and things to check:
It would also help to ask the user whether they are willing to leave the device in this invalid state so that engineers can investigate.
Description
balenaEngine daemon errors with (HTTP code 404) -- no such container: sandbox. However, there is no sandbox container on the device. The device Supervisor reports this error in the journal logs as:
Device state apply error Error: Failed to apply state transition steps. (HTTP code 404) no such container - sandbox 915c9f1f78712e9db8bb1edf3d94fd669a917c608270f4c95e3a8c72de142b15 not found Steps:["updateMetadata"]
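For support agents scanning journal output, a small helper along these lines could pick out this specific failure. It is only a sketch: the function name and the returned shape are made up for illustration, and the pattern is derived solely from the single log line quoted above, so other Supervisor versions may format the message differently.

```typescript
// Hypothetical result shape for a matched log line.
interface MissingContainerError {
  containerRef: string; // name reported by the engine, e.g. "sandbox"
  containerId: string | undefined; // id that follows the name, if present
  steps: string[]; // state transition steps listed after "Steps:"
}

// Returns details when the line matches the error quoted in this issue,
// or null for any other line.
function parseMissingContainerError(line: string): MissingContainerError | null {
  const match =
    /\(HTTP code 404\) no such container - (\S+)(?:\s+([0-9a-f]{12,64}) not found)?/.exec(line);
  if (match === null) {
    return null;
  }

  let steps: string[] = [];
  const stepsMatch = /Steps:\s*(\[[^\]]*\])/.exec(line);
  if (stepsMatch !== null) {
    try {
      const parsed: unknown = JSON.parse(stepsMatch[1] ?? '[]');
      if (Array.isArray(parsed)) {
        steps = parsed.map((step) => String(step));
      }
    } catch {
      // Leave steps empty if the list is not valid JSON.
    }
  }

  return {
    containerRef: match[1] ?? '',
    containerId: match[2],
    steps,
  };
}
```

Feeding the helper each line from the Supervisor journal and collecting the non-null results would show which steps were failing when the error was raised.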
Per https://github.com/balena-io/balena-io/issues/1684, this might be due to a bad internal state with one of the containers on the device. The issue is fixed by restarting balenaEngine with
systemctl restart balena
OR
systemctl stop balena-supervisor && balena stop $(balena ps -a -q) && balena rm $(balena ps -a -q) && systemctl start balena-supervisor
However, this is not ideal, as the containers experience a few minutes of downtime.
It's unclear how to reproduce this issue.
Additional information you deem important (e.g. issue happens only occasionally):
Issue happens when a new update is downloaded by the device. Has sometimes appeared in combination with #1579, making cause unclear.
Additional environment details (device type, OS, etc.):
Device Type: Raspberry Pi 4 64bit, 2GB RAM
OS: balenaOS 2.80.3+rev1.prod