[Feature] Allow talosctl wipe to work with --insecure #10011

Open
lenaxia opened this issue Dec 20, 2024 · 0 comments

Feature Request

There is currently no way to fully wipe a node and return it to a completely fresh install. When a node is in maintenance mode, it cannot be wiped because the wipe command does not accept --insecure:

$ talosctl -n 192.168.0.123 -e 192.168.0.123 wipe disk sda
rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: tls: failed to verify certificate: x509: certificate signed by unknown authority"
$ talosctl -n 192.168.0.123 -e 192.168.0.123 wipe disk sda --insecure
unknown flag: --insecure

I argue that a full wipe does not constitute a violation of the immutability principle.

Alternatively, a boot menu option that allows booting from media (PXE or USB) and ignoring the on-disk install would also be acceptable, since the on-disk install could then be replaced. This would be in addition to the option of resetting the on-disk install to maintenance mode.

Description

Right now there is no way to wipe a node back to a fully fresh Talos install without booting a live disk of a different distro and manually wiping the disk. If a node gets into a state where a machine config cannot be completely applied, the node becomes completely unusable, with no way to recover.

Example 1

  1. Apply the following configuration using factory.talos.dev/installer/c19375fb8749831132a7d364bf66693aa33bf7ffd7244e4d63258617edd593d0:v1.8.4
customization:
    systemExtensions:
        officialExtensions:
            - siderolabs/i915-ucode
            - siderolabs/intel-ucode
            - siderolabs/nvidia-container-toolkit-production
            - siderolabs/nvidia-open-gpu-kernel-modules-production
            - siderolabs/thunderbolt
            - siderolabs/util-linux-tools
            - siderolabs/zfs
  2. Reboot the node and reset to maintenance mode.

  3. Apply a configuration with a base image and no customizations, using factory.talos.dev/installer/376567988ad370138ad8b2698212367b8edcb69b5fd68c80be1f2ec7d603b4ba:v1.8.4

customization: {}

If you have no GPU installed, ext-nvidia-persistenced will fail to start and the node will not complete installation (see image below).
image

Neither resetting nor installing other images wipes the system extensions. This node is now completely unusable and cannot be recovered without resorting to manual intervention (e.g. PXE booting an Ubuntu/Arch live disk to wipe /dev/sda).
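
For reference, the manual intervention might look roughly like the sketch below when run as root from a non-Talos live environment. This is not a Talos feature: the wipe_disk helper and its dry-run argument are hypothetical names made up for illustration, and it assumes that wipefs (util-linux) and sgdisk (gdisk) are available on the live image and that the target really is /dev/sda.

```shell
#!/bin/sh
# Hypothetical helper illustrating the manual workaround; not part of talosctl.
# Usage: wipe_disk <device> [dry_run]
#   dry_run defaults to 1, which only prints the commands.
#   Pass 0 to actually erase the device (destructive, needs root).
wipe_disk() {
    disk="$1"
    dry_run="${2:-1}"
    # wipefs removes filesystem/partition-table signatures;
    # sgdisk --zap-all destroys the GPT and MBR data structures.
    for cmd in "wipefs --all $disk" "sgdisk --zap-all $disk"; do
        if [ "$dry_run" = "1" ]; then
            echo "would run: $cmd"
        else
            $cmd
        fi
    done
}
```

Calling `wipe_disk /dev/sda` only prints the two commands; `wipe_disk /dev/sda 0` runs them. After that the node should boot from install media as if the disk were empty, which is the state this request would like talosctl to be able to reach on its own.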
