Workspaces cannot be created after OpenShift v3.7.9 update #1666
Comments
+1. Blocks the user from doing anything in the workspace. |
@rhopp anything suspicious in the OpenShift events? |
@ibuziuk Nothing. No events at all. |
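To double-check that the API really returns nothing, the events for the workspace namespace can also be queried programmatically. A minimal sketch with the fabric8 kubernetes-client; the namespace name is a placeholder and `client.events()` is assumed to be available in the client version in use:

```java
import io.fabric8.kubernetes.api.model.Event;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ListNamespaceEvents {
    public static void main(String[] args) {
        // Connects using the current kubeconfig / service-account context.
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // "rhopp-che" is a placeholder for the workspace namespace being debugged.
            for (Event e : client.events().inNamespace("rhopp-che").list().getItems()) {
                System.out.printf("%s %s/%s: %s%n",
                        e.getType(),
                        e.getInvolvedObject().getKind(),
                        e.getInvolvedObject().getName(),
                        e.getMessage());
            }
        }
    }
}
```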
The very same problem is reproducible on prod-preview. |
Maybe this can help somehow: workspaces on the Che 6 server start successfully. You can try it here: http://che-che6-server.dev.rdu2c.fabric8.io |
The difference I see in the deployment yaml:
- apiVersion: extensions/v1beta1
+ apiVersion: apps/v1beta1
  kind: Deployment |
@l0rd Deployments are new in OpenShift 3.7. However, not all nodes are upgraded to 3.7 at the moment, so I'm not sure how this is handled by the controllers. Since this issue was opened, a few nodes have been upgraded to 3.7 and the upgrade process is still ongoing. I was able to observe that the deployment object is created properly, but pods are not created. No events in OpenShift. Notice that replicas is set to 0. |
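The symptom described above can be confirmed by reading the Deployment back from the API and comparing its desired and observed replica counts. A minimal sketch with the fabric8 kubernetes-client; the namespace and deployment name are placeholders, and a client version recent enough to expose Deployments under the apps API group is assumed (older clients used `client.extensions().deployments()`):

```java
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class CheckWorkspaceDeployment {
    public static void main(String[] args) {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Namespace and deployment name are placeholders for a Che workspace deployment.
            Deployment d = client.apps().deployments()
                    .inNamespace("rhopp-che")
                    .withName("workspace-deployment")
                    .get();
            if (d == null) {
                System.out.println("deployment not found");
                return;
            }
            // Observed symptom: the Deployment object exists, but nothing is ever scheduled.
            System.out.println("spec.replicas    = " + d.getSpec().getReplicas());
            System.out.println("status.replicas  = " + d.getStatus().getReplicas());
            System.out.println("status.available = " + d.getStatus().getAvailableReplicas());
        }
    }
}
```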
@ibuziuk free-int was also running v3.7.9 (the latest GA) |
@ibuziuk FWIW free-int was (before 3.8 upgrade): OpenShift Master: 3.7.9 (online version 3.6.0.83) |
@jfchevrette @mmclanerh could we update
Current version of online looks different - #1666 (comment) |
I am being told that a very similar issue with deployments was observed by the OpenShift team in their Jenkins talking to OpenShift 3.7. The issue was resolved by upgrading the fabric8 kubernetes client in their Jenkins plugin. Is this something we can try? |
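If upgrading the client is on the table, it may be worth first confirming which fabric8 kubernetes-client version is actually on the classpath. A minimal sketch, assuming the client's `Version` helper is present in the bundled release:

```java
import io.fabric8.kubernetes.client.Version;

public class PrintClientVersion {
    public static void main(String[] args) {
        // Reports the fabric8 kubernetes-client version found on the classpath
        // (assumption: the Version helper exists in the bundled client release).
        System.out.println("fabric8 kubernetes-client: " + Version.clientVersion());
    }
}
```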
Upstream bug https://bugzilla.redhat.com/show_bug.cgi?id=1526165 |
This issue was 'resolved' by restarting the controllers on the master nodes. Upstream is still investigating the root cause. |
@mmclanerh awesome news! Since it is not fixed on prod-preview, I would still keep it open and add [prod-preview] to the description + change severity |
Controllers restarted in prod-preview. This should alleviate the pod replicas issue. |
My account on prod-preview is currently broken, so I have no way to verify. |
OpenShift Online cluster upgrades are ongoing now; we need to wait until they finish and retry. |
Seeing this error again today: Could not start workspace a6lex. Reason: Start of environment 'default' failed. Error: null |
Upstream bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1526165 In order to reproduce the problem on the prod / prod-preview clusters, the sample [1] provided by @jfchevrette was used. After applying the yaml on the prod / prod-preview clusters, the deployment is never scaled up. BTW, the problem is not reproducible on the free-int cluster which was previously used for prod-preview.
[1] https://gist.github.com/jfchevrette/2833c0fb2f685f4eaf221f681dfc755b |
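A sketch of how that reproduction could be automated with the fabric8 kubernetes-client: load the manifest, create it, and poll the Deployment status. The file name, namespace handling, and polling interval are assumptions, not part of the original report:

```java
import java.io.FileInputStream;
import java.util.List;
import java.util.concurrent.TimeUnit;

import io.fabric8.kubernetes.api.model.HasMetadata;
import io.fabric8.kubernetes.api.model.apps.Deployment;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;

public class ReproduceStuckDeployment {
    public static void main(String[] args) throws Exception {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // "repro-deployment.yaml" stands in for the manifest from the gist above.
            List<HasMetadata> created;
            try (FileInputStream yaml = new FileInputStream("repro-deployment.yaml")) {
                created = client.load(yaml).createOrReplace();
            }
            String name = created.get(0).getMetadata().getName();
            String namespace = created.get(0).getMetadata().getNamespace();

            // On an affected cluster, readyReplicas never reaches spec.replicas.
            for (int i = 0; i < 30; i++) {
                Deployment d = client.apps().deployments()
                        .inNamespace(namespace).withName(name).get();
                Integer ready = d.getStatus() == null ? null : d.getStatus().getReadyReplicas();
                System.out.println("readyReplicas = " + ready);
                TimeUnit.SECONDS.sleep(10);
            }
        }
    }
}
```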
Production is now working for pod spin-ups. |
Since workspace creation is working fine on prod after the controllers restart, changing to SEV2. |
If everything is satisfactory, I believe this issue can be closed out. |
Aaaand it's back. |
The OpenShift controllers have been restarted to work around this issue. Upstream has a fix, openshift/origin#17855, which is making its way to our cluster soon. I unfortunately don't have an ETA yet. |
The issue is back again on prod; changing label to SEV1. |
Controllers restarted and deployments work. For future reference, the service is atomic-openshift-master-controllers.service. The hotfix is applied to free-stg, but does not yet look to be on starter clusters - my impression is that this should hit starter-us-east-2 within a few working days. |
Closing. The fix for the prod cluster is applied / the bugzilla [1] for starter-us-east-2 has "verified" status. |
This is affecting multi-tenant and single-tenant Che.

When creating a workspace, the deployment gets created, but that's all...
Log from the che-master pod: