[bitnami/etcd] Stop relying on files for state #75906
base: main
Conversation
- Remove prestop logic (no longer removing the member when the container stops)
- Remove members not included in ETCD_INITIAL_CLUSTER during startup
- Stop storing the member ID in a separate file; the member ID is checked from the etcd data dir instead
- Stop reading member removal state off of disk; probe the cluster instead
- Remove the old member (with the same name) if it exists before adding a new member
- If the data dir is not empty, check whether the member still belongs to the cluster. If not, remove the data dir, remove the member with the same name, and add a new member
- Remove env var ETCD_DISABLE_STORE_MEMBER_ID
- Remove env var ETCD_DISABLE_PRESTOP

Signed-off-by: Khoi Pham <[email protected]>
I'm planning to open a complementary PR in the charts repo. I will try to add more tests there.
Hi @pckhoi
Thanks so much for this amazing contribution! It'd definitely help make the Bitnami etcd chart more stable.
I think the main concern/challenge with your changes is providing a solution for users who scale down the cluster via `kubectl scale sts/etcd --replicas X` (or via a HorizontalPodAutoscaler that may also scale down the cluster outside Helm's control via hooks). Correct me if I'm wrong, but this use case won't be covered, right?
@juan131 you're correct that the autoscaling use case isn't covered. People use etcd for its consistency rather than for handling large, fluctuating traffic, so I think autoscaling to handle large traffic is a niche use case. As for manual scaling, running the scale operation through Helm (so the pre-upgrade hook fires) should cover it.
Thanks for confirming, @pckhoi! In that case, I'd add a warning to the "Upgrading" section alerting users about what these changes imply (I mean, warning users to scale the cluster exclusively through Helm). We could even add it to the chart NOTES.
Sure, I will do that.
@juan131 I have updated https://github.com/bitnami/charts/tree/main/bitnami/etcd#upgrading. As for https://github.com/bitnami/charts/blob/main/bitnami/etcd/templates/NOTES.txt, I don't see anything that needs to be updated.
Thanks! I have addressed all the suggestions.
@pckhoi I think this PR looks great now! Could you please check my comments in the associated chart PR? Thanks in advance.
fi
info "Current cluster members are: $(echo "$current" | awk -F: '{print $1}' | tr -s '\n' ',' | sed 's/,$//g')"
expected="$(echo $ETCD_INITIAL_CLUSTER | tr -s ',' '\n' | awk -F= '{print $1}')"
info "Expected cluster members are: $(echo "$expected" | tr -s '\n' ',' | sed 's/,$//g')"
While testing, I noticed that while increasing the number of replicas, the logs look like this:
INFO ==> Current cluster members are: etcd-0
INFO ==> Expected cluster members are: etcd-0,etcd-1,etcd-2
I'd include some message when we detect that the expected cluster members are greater than or equal to the current ones. Something like:
info "No obsolete members to remove. Pre-upgrade checks completed!"
Similar to this, while scaling down the logs look like this:
INFO ==> Current cluster members are: etcd-0,etcd-1,etcd-2
INFO ==> Expected cluster members are: etcd-0,etcd-1
I'd also add some extra messages so the logs are something like:
INFO ==> Current cluster members are: etcd-0,etcd-1,etcd-2
INFO ==> Expected cluster members are: etcd-0,etcd-1
INFO ==> Removing obsolete member etcd-2
Member 9fa135f920e34edd removed from cluster e111ddc420dc1d22
and finish with a final info "Pre-upgrade checks completed!" once the removals are done.
if ! current="$(etcdctl member list ${extra_flags[@]} --write-out simple | awk -F ", " '{print $3 ":" $1}')"; then
    debug "Error listing members, is this a new cluster?"
    exit 0
fi
Should we add some "retry" mechanism so we give up on listing members only after 3 consecutive failures or something like that? (Check the `retry_while` examples in the `libetcd.sh` library.) This could help make the script more reliable.
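For illustration, a rough sketch of what that could look like with `retry_while` from Bitnami's helper libraries (it takes a command string, a retry count, and a sleep time); the wrapper function and the 3/5 values are assumptions:

```bash
# Sketch only: give up on listing members after 3 attempts, 5s apart,
# using the retry_while helper from Bitnami's script libraries.
list_members() {
    current="$(etcdctl member list ${extra_flags[@]} --write-out simple | awk -F ", " '{print $3 ":" $1}')"
}
if ! retry_while "list_members" 3 5; then
    debug "Error listing members after 3 attempts, is this a new cluster?"
    exit 0
fi
```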
Description of the change
The current etcd container and chart have a few major problems:

- If the cluster is scaled down other than via a Helm `replicas` update, then the next time the pod starts, it will not be able to start from the existing data dir, which means it must throw away the data dir and start from scratch.
- It runs `etcdctl member update` for unclear reasons when the data dir is not empty and there is a member ID.
- It relies on `ETCD_INITIAL_CLUSTER_STATE` to know whether the cluster is new, which could be inaccurate.

This PR adds the following changes:
- Add `preupgrade.sh`, which should be run in a Helm pre-upgrade hook. When the cluster is scaled down, it detects and removes obsolete members with `etcdctl member remove`.
- Remove `prestop.sh`.
- Stop storing the member ID in the `member_id` file. Instead, the remote member ID is read from the cluster with `etcdctl member list`, and the local member ID is checked for conflicts during startup (see the sketch after this list).
- Stop reading member removal state from `member_removal.log`. Check with `etcdctl member list` instead.
- Remove env var `ETCD_DISABLE_STORE_MEMBER_ID`.
- Remove env var `ETCD_DISABLE_PRESTOP`.
- `ETCD_INITIAL_CLUSTER_STATE` becomes read-only.
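To make the startup behavior concrete, here is a rough sketch of the data-dir check described above; the helper names (`member_in_cluster`, `remove_member_by_name`, `add_member`) are hypothetical stand-ins, not the actual `libetcd.sh` functions:

```bash
# Sketch only: the startup decision described above. Helper names are
# hypothetical stand-ins for the real libetcd.sh functions.
if [[ -n "$(ls -A "$ETCD_DATA_DIR" 2>/dev/null)" ]]; then
    if member_in_cluster "$ETCD_NAME"; then
        # Data dir is populated and we are still a member: start normally.
        info "Member ${ETCD_NAME} still belongs to the cluster; reusing data dir"
    else
        # Stale data dir: wipe it, drop any old member with our name, rejoin.
        warn "Member ${ETCD_NAME} no longer belongs to the cluster; rejoining from scratch"
        rm -rf "${ETCD_DATA_DIR:?}"/*
        remove_member_by_name "$ETCD_NAME" || true
        add_member "$ETCD_NAME"
    fi
fi
```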
Benefits
- Member removal now happens in a Helm pre-upgrade hook, so the `etcdctl member remove` command tends to be executed against a healthy cluster.

Possible drawbacks
Applicable issues
Additional information
Related changes in the Helm chart: bitnami/charts#31161 and bitnami/charts#31164