[rke2-windows] Windows node in NotReady state after it joins the cluster
#7793
Comments
You haven't included any output showing the node status. Can you provide the node's YAML and/or describe output? Kubelet, containerd, and CNI logs may also be useful.
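For anyone collecting the requested node status, here is a minimal Python sketch that summarizes the Ready condition from a node object exported as JSON (for example with `kubectl get node <name> -o json`). The `status.conditions` layout is the standard Kubernetes node schema; the node name, reason, and message in `SAMPLE_NODE` are made-up placeholders, not output from this issue.

```python
import json

# Hypothetical sample data for illustration only; not taken from the reporter's cluster.
SAMPLE_NODE = json.loads("""
{
  "metadata": {"name": "win-agent-1"},
  "status": {
    "conditions": [
      {"type": "MemoryPressure", "status": "False"},
      {"type": "Ready", "status": "False",
       "reason": "KubeletNotReady", "message": "example message"}
    ]
  }
}
""")


def ready_condition(node):
    """Return the node's 'Ready' condition dict, or None if it is absent."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond
    return None


def summarize(node):
    """Build a one-line summary of the node's readiness."""
    name = node.get("metadata", {}).get("name", "<unknown>")
    cond = ready_condition(node)
    if cond is None:
        return f"{name}: no Ready condition reported"
    # Only status "True" counts as Ready; "False" and "Unknown" are both NotReady.
    state = "Ready" if cond.get("status") == "True" else "NotReady"
    return f"{name}: {state} ({cond.get('reason', '')}: {cond.get('message', '')})"


if __name__ == "__main__":
    print(summarize(SAMPLE_NODE))
```

The reason and message fields of the Ready condition are usually the quickest pointer to whether the kubelet is waiting on the container runtime or on the CNI.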
Cluster status
Deployments: The Windows deployment is pending, likely because the node is in the NotReady state.
Here are the logs:
CNI pod log
nodes.txt
Is this happening with both CNIs that we support on Windows, or only flannel? Can you also grab the containerd config.toml? I suspect something is going on with the CNI bin dir setting in the updated template.
In #7771 (linked to this one), the Windows node in NotReady state was observed with the Calico CNI and RKE2 v1.31.6-rc1+rke2r1.
Can you get the calico and flannel logs? They are in
It's happening for both CNIs. CC @brandond
Here is the config.toml:
Latest rc
Previous release
Here are the logs from the Windows node. CC @manuelbuil
calico-node.log
I see the new rc does not have the below in the
Could that be the issue, @brandond?
Apparently Linux RKE2 nodes use the default CNI bin path (it is unset in the config), but on Windows it is set to
Just to add some information: the logs look correct, so the network infrastructure should have been created properly. It is likely that the node can't find the CNI binary, as you are already discovering.
Thanks @brandond for identifying the root cause of the issue. Would you be able to share an estimated timeline for when the fix might be available?
Before final release.
Environmental Info:
RKE2 Version:
v1.29.14-rc1 and all the latest RCs (v1.30.10, v1.31.6, v1.32.2)
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Describe the bug:
After the Windows agent joins the cluster, it remains in the NotReady state. Observed on v1.29 after this commit c3050110de27bb3463ece3117ce6fa5509d89b73 and on the latest RCs. Worked fine up until this commit a25f441. Most likely started happening after the k3s pull through.
Steps To Reproduce:
Expected behavior:
Ready state
Actual behavior:
NotReady state
Additional context / logs:
Nothing much in the server logs. Observed the below when the rke2 service is run in debug mode: