Add troubleshooting section on regenerating CAs
This describes the "offline" version of the process. It could also be
done in multiple passes with less downtime, but at the cost of much more
complexity and work. Let's start with the supposedly "simplest" alternative.

Signed-off-by: Tom Wieczorek <[email protected]>
twz123 committed Nov 29, 2024
1 parent f8f89a7 commit 8924155
Showing 5 changed files with 112 additions and 2 deletions.
4 changes: 4 additions & 0 deletions docs/custom-ca.md
@@ -38,3 +38,7 @@ Here's an example of a command for pre-generating a token for a controller.
```shell
k0s token pre-shared --role controller --cert /var/lib/k0s/pki/ca.crt --url https://<controller-ip>:9443/
```

## See also

- [Certificate Authorities](troubleshooting/certificate-authorities.md)
4 changes: 2 additions & 2 deletions docs/k0s-multi-node.md
@@ -64,7 +64,7 @@ To get a token, run the following command on one of the existing controller node
sudo k0s token create --role=worker
```

-The resulting output is a long [token](#about-tokens) string, which you can use to add a worker to the cluster.
+The resulting output is a long [token](#about-join-tokens) string, which you can use to add a worker to the cluster.

For enhanced security, run the following command to set an expiration time for the token:

@@ -84,7 +84,7 @@ sudo k0s install worker --token-file /path/to/token/file
sudo k0s start
```

-#### About tokens
+#### About join tokens

The join tokens are base64-encoded [kubeconfigs](https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) for several reasons:

9 changes: 9 additions & 0 deletions docs/troubleshooting/FAQ.md
@@ -31,3 +31,12 @@ As a default, the control plane does not run kubelet at all, and will not accept
## Is k0sproject really open source?

Yes, k0sproject is 100% open source. The source code is under Apache 2 and the documentation is under the Creative Commons License. Mirantis, Inc. is the main contributor and sponsor for this OSS project: building all the binaries from upstream, performing necessary security scans and calculating checksums so that it's easy and safe to use. The use of these ready-made binaries is subject to the Mirantis EULA, and the binaries include only open source software.

## A kubeconfig created via [`k0s kubeconfig`](../cli/k0s_kubeconfig.md) has been leaked, what can I do?

Kubernetes does not support certificate revocation (see [k/k/18982]). This means
that you cannot disable the leaked credentials. The only way to effectively
revoke them is to [replace the Kubernetes CA] for your cluster.

[k/k/18982]: https://github.com/kubernetes/kubernetes/issues/18982
[replace the Kubernetes CA]: certificate-authorities.md#replacing-the-kubernetes-ca-and-sa-key-pair
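
Before replacing the CA, it can help to know which identity the leaked credentials carry and when they would expire on their own. The following is a minimal sketch, not a k0s feature: the function name `kubeconfig_client_cert_info` is made up for illustration, and it assumes that the kubeconfig embeds its client certificate as a base64-encoded `client-certificate-data` entry and that `openssl` is installed.

```shell
#!/bin/sh
# Print the subject and expiry date of the first client certificate
# embedded in a kubeconfig file.
# Assumption: the certificate is stored inline as base64-encoded
# "client-certificate-data". The function name is hypothetical.
kubeconfig_client_cert_info() {
  sed -n 's/^[[:space:]]*client-certificate-data:[[:space:]]*//p' "$1" |
    head -n 1 |
    base64 -d |
    openssl x509 -noout -subject -enddate
}

# Example:
#   kubeconfig_client_cert_info /path/to/leaked-kubeconfig
```

The subject shows the user and groups the certificate authenticates as, which indicates what access the leaked file grants until the CA is replaced.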
96 changes: 96 additions & 0 deletions docs/troubleshooting/certificate-authorities.md
@@ -0,0 +1,96 @@
# Certificate Authorities (CAs)

## Overview of CAs managed by k0s

k0s maintains two Certificate Authorities and one public/private key pair:

* The **Kubernetes CA** is used to secure the Kubernetes cluster and manage
client and server certificates for API communication.
* The **etcd CA** is used only when managed etcd is enabled, for securing etcd
communications.
* The **Kubernetes Service Account (SA) key pair** is used for signing
Kubernetes [service account tokens].

These CAs are automatically created during cluster initialization and have a
default expiration period of 10 years. They are distributed once to all k0s
controllers as part of k0s's [join process]. Replacing them is a manual process,
as k0s currently lacks automation for CA renewal.

[service account tokens]: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/
[join process]: ../k0s-multi-node.md#5-add-controllers-to-the-cluster
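
To see how much of the validity period a CA has left, its certificate can be inspected with `openssl`. This is a minimal sketch under a few assumptions: the helper names `ca_enddate` and `ca_valid_for_days` are made up for illustration and are not k0s commands, `openssl` is installed on the controller, and the default data directory `/var/lib/k0s` is in use.

```shell
#!/bin/sh
# Print the expiry date ("notAfter") of a PEM-encoded certificate.
# Hypothetical helper, not part of k0s.
ca_enddate() {
  openssl x509 -in "$1" -noout -enddate
}

# Succeed if the certificate is still valid for at least the given
# number of days, fail otherwise. Hypothetical helper, not part of k0s.
ca_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Example (as root on a controller, default data directory assumed):
#   ca_enddate /var/lib/k0s/pki/ca.crt
#   ca_valid_for_days /var/lib/k0s/pki/ca.crt 90 || echo "CA expires soon"
```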

## Replacing the Kubernetes CA and SA key pair

The following steps describe how to manually replace the Kubernetes CA and
SA key pair by taking the cluster down, regenerating them, redistributing them
to all nodes, and then bringing the cluster back online:

1. Take a [backup]! Things might go wrong at any level.

2. Stop k0s on all worker and controller nodes. All the instructions below
assume that all k0s nodes are using the default data directory
`/var/lib/k0s`. Please adjust accordingly if you're using a different data
directory path.

3. Delete the Kubernetes CA and SA key pair files from all the controller
data directories:

* `/var/lib/k0s/pki/ca.crt`
* `/var/lib/k0s/pki/ca.key`
* `/var/lib/k0s/pki/sa.pub`
* `/var/lib/k0s/pki/sa.key`

Delete the kubelet's kubeconfig file and the kubelet's PKI directory from all
worker data directories. Note that this includes controllers that have been
started with the `--enable-worker` flag:

* `/var/lib/k0s/kubelet.conf`
* `/var/lib/k0s/kubelet/pki`

4. Choose one controller as the "first" one. Restart k0s on the first
controller. If this controller is running with the `--enable-worker` flag,
you should **reboot the machine** instead. This ensures that all
processes and pods are cleanly restarted. After the restart, k0s will
have generated a new Kubernetes CA and SA key pair.

5. Distribute the new CA and SA key pair to the other controllers: Copy over the
following files from the first controller to each of the remaining
controllers:

* `/var/lib/k0s/pki/ca.crt`
* `/var/lib/k0s/pki/ca.key`
* `/var/lib/k0s/pki/sa.pub`
* `/var/lib/k0s/pki/sa.key`

After copying the files, the new CA and SA key pair are in place. Restart k0s
on the other controllers. For controllers running with the `--enable-worker`
flag, **reboot the machines** instead.

6. Rejoin all workers. The easiest way to do this is to use a
`kubelet-bootstrap.conf` file. You can [generate](../cli/k0s_token_create.md)
such a file on a controller like this (see the section on [join tokens] for
details):

```sh
touch /tmp/rejoin-token &&
chmod 0600 /tmp/rejoin-token &&
k0s token create --expiry 1h |
base64 -d |
gunzip >/tmp/rejoin-token
```

Copy that file to each worker node and place it at
`/var/lib/k0s/kubelet-bootstrap.conf`. Then reboot the machine.

7. When all workers are back online, the `kubelet-bootstrap.conf` files can be
safely removed from the workers. You can also invalidate the token so you
don't have to wait for it to expire: Use [`k0s token list --role
worker`](../cli/k0s_token_list.md) to list all tokens and [`k0s token
invalidate <token-id>`](../cli/k0s_token_invalidate.md) to invalidate them immediately.
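
The file deletions in step 3 could be scripted roughly as follows. This is only a sketch: the function names are illustrative and not part of k0s, and the data directory defaults to `/var/lib/k0s` unless another path is passed as the first argument. Run it only after k0s has been stopped and a backup has been taken.

```shell
#!/bin/sh
# Remove the Kubernetes CA and SA key pair from a controller's data
# directory. Hypothetical helper, not part of k0s.
remove_ca_and_sa_files() {
  data_dir=${1:-/var/lib/k0s}
  rm -f -- \
    "$data_dir/pki/ca.crt" \
    "$data_dir/pki/ca.key" \
    "$data_dir/pki/sa.pub" \
    "$data_dir/pki/sa.key"
}

# Remove the kubelet's kubeconfig and PKI directory from a worker's data
# directory. This also applies to controllers started with --enable-worker.
# Hypothetical helper, not part of k0s.
remove_kubelet_credentials() {
  data_dir=${1:-/var/lib/k0s}
  rm -f -- "$data_dir/kubelet.conf"
  rm -rf -- "$data_dir/kubelet/pki"
}

# Example (as root):
#   remove_ca_and_sa_files        # on every controller
#   remove_kubelet_credentials    # on every worker
```

Only the listed files are touched, so other certificates in the `pki` directory stay in place.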

[backup]: ../backup.md
[join tokens]: ../k0s-multi-node.md#about-join-tokens
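
After copying the files in step 5, it may be worth confirming that every controller ended up with identical copies before restarting k0s. The sketch below prints checksums that can be compared across controllers. The helper name `ca_checksums` is hypothetical, and the sketch assumes that `sha256sum` (GNU coreutils) is available and that the data directory is `/var/lib/k0s` unless given as the first argument.

```shell
#!/bin/sh
# Print SHA-256 checksums of the CA and SA key pair files in a k0s data
# directory. Hypothetical helper, not part of k0s.
ca_checksums() {
  data_dir=${1:-/var/lib/k0s}
  # Run in a subshell so the caller's working directory is unchanged.
  (cd "$data_dir/pki" && sha256sum ca.crt ca.key sa.pub sa.key)
}

# Example (as root): run on each controller and compare the output.
#   ca_checksums
```

Because the checksums are computed relative to the `pki` directory, the output is identical on all controllers when the files match, even if the data directories differ.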

## See also

* [Install using custom CAs](../custom-ca.md)
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -72,6 +72,7 @@ nav:
- Logs: troubleshooting/logs.md
- Common Pitfalls: troubleshooting/troubleshooting.md
- Support Insights: troubleshooting/support-dump.md
- Certificate Authorities (CAs): troubleshooting/certificate-authorities.md
- Reference:
- Architecture: architecture/index.md
- Command Line: cli/README.md