
Manual deployment with ImageStream fails (sometimes) #16728

Closed
rhuss opened this issue Oct 6, 2017 · 10 comments
Labels
component/apps, kind/bug, lifecycle/rotten, priority/P2

Comments

rhuss (Contributor) commented Oct 6, 2017

A manual deployment with oc rollout latest ... fails on a DC which has not been initialized yet (e.g. one whose container spec points to an empty image: ' ' and which has image change triggers).

This happens in the following situations (a condensed repro sketch follows):

  • Whether auto is set to false or true makes no difference; it always happens when the manual deployment comes first.
  • It always happens when the manual deployment kicks in before the trigger deployment.
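For quick reference, the failing sequence boils down to this (a sketch; the manifest filename is illustrative, the full DC is shown under Steps To Reproduce):

$ oc create -f dc-with-empty-image.yaml   # DC with image: ' ' and an ImageChange trigger
$ oc rollout latest fck                   # manual rollout before any trigger has fired
# -> DeploymentCreationFailed: ReplicationController "fck-1" is invalid:
#    spec.template.spec.containers[0].image: Required value
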
Version
$ oc version
oc v1.5.0+031cbe4
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://192.168.64.11:8443
openshift v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7

$ minishift version
minishift v1.6.0+7a71565
Steps To Reproduce
  • Create a DC with an image trigger, setting auto to false:
oc describe dc fck
Name:		fck
Namespace:	myproject
Created:	2 minutes ago
Labels:		syndesis.io/revision-id=2
Annotations:	USERNAME=developer
Latest Version:	Not deployed
Selector:	integration=fck
Replicas:	1
Triggers:	Config, Image(fck@latest, auto=false)
Strategy:	Rolling
Template:
  Labels:	integration=fck
  Containers:
   fck:
    Image:
    Port:	8778/TCP
    Volume Mounts:
      /deployments/config from secret-volume (rw)
    Environment Variables:	<none>
  Volumes:
   secret-volume:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	fck

Latest Deployment:	<none>

No events.

Triggers look like:

  triggers:
  - type: ConfigChange
  - imageChangeParams:
      containerNames:
      - fck
      from:
        kind: ImageStreamTag
        name: fck:latest
        namespace: myproject
    type: ImageChange

The ImageStream is:

apiVersion: v1
kind: ImageStream
metadata:
  creationTimestamp: 2017-10-06T17:49:05Z
  generation: 1
  name: fck
  namespace: myproject
  resourceVersion: "6222"
  selfLink: /oapi/v1/namespaces/myproject/imagestreams/fck
  uid: a598734d-aabe-11e7-80dd-9a3aa9bfc000
spec: {}
status:
  dockerImageRepository: 172.30.1.1:5000/myproject/fck
  tags:
  - items:
    - created: 2017-10-06T17:50:56Z
      dockerImageReference: 172.30.1.1:5000/myproject/fck@sha256:205eae2dd5dd69d361f8a1e4fe03366670cb9f4ffffddb8b3d9c25e386a72b35
      generation: 1
      image: sha256:205eae2dd5dd69d361f8a1e4fe03366670cb9f4ffffddb8b3d9c25e386a72b35
    tag: latest
  • Then run oc rollout latest fck; the events show:
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason				Message
  ---------	--------	-----	----				-------------	--------	------				-------
  3m		3m		14	{deploymentconfig-controller }			Warning		DeploymentCreationFailed	Couldn't deploy version 1: ReplicationController "fck-1" is invalid: spec.template.spec.containers[0].image: Required value
  2m		2m		11	{deploymentconfig-controller }			Warning		DeploymentCreationFailed	Couldn't deploy version 2: ReplicationController "fck-2" is invalid: spec.template.spec.containers[0].image: Required value
Current Result

The strange thing is that oc rollout sometimes works and sometimes does not. E.g. when I run oc rollout right after a build, it seems to work. If I first use another tool to do the deployment (like the fabric8 OpenShift Java client) and that fails, then oc rollout fails afterwards, too (with the error above). But the DC looks just the same (and has not changed).

Expected Result
  • A manual deployment should do the same as a trigger deployment: copy over the ImageStreamImage reference of the ImageStreamTag specified in the trigger if the image name is not yet initialised (a workaround sketch follows).
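For illustration, a manual workaround along those lines might look like this (a sketch, not verified; it resolves the tag client-side the way I would expect the trigger machinery to do it server-side):

$ IMAGE=$(oc get istag fck:latest -n myproject -o jsonpath='{.image.dockerImageReference}')
$ oc set image dc/fck fck="$IMAGE" -n myproject   # copy the resolved reference into the container spec
$ oc rollout latest fck -n myproject
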
rhuss (Contributor, Author) commented Oct 6, 2017

I did some further investigation. Here is a diff of two DCs, fca and hsv; oc rollout works for hsv but not for fca (with the symptoms described above):

diff -u /tmp/fca_after.yml /tmp/hsv_before.yml
--- /tmp/fca_after.yml	2017-10-06 20:35:31.000000000 +0200
+++ /tmp/hsv_before.yml	2017-10-06 20:18:28.000000000 +0200
@@ -3,19 +3,19 @@
 metadata:
   annotations:
     USERNAME: developer
-  creationTimestamp: 2017-10-06T18:33:40Z
-  generation: 2
+  creationTimestamp: 2017-10-06T18:17:41Z
+  generation: 1
   labels:
     syndesis.io/revision-id: "2"
-  name: fca
+  name: hsv
   namespace: myproject
-  resourceVersion: "7324"
-  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/fca
-  uid: dff2ee69-aac4-11e7-80dd-9a3aa9bfc000
+  resourceVersion: "6928"
+  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/hsv
+  uid: a4408707-aac2-11e7-80dd-9a3aa9bfc000
 spec:
   replicas: 1
   selector:
-    integration: fca
+    integration: hsv
   strategy:
     activeDeadlineSeconds: 21600
     resources: {}
@@ -30,12 +30,12 @@
     metadata:
       creationTimestamp: null
       labels:
-        integration: fca
+        integration: hsv
     spec:
       containers:
       - image: ' '
         imagePullPolicy: Always
-        name: fca
+        name: hsv
         ports:
         - containerPort: 8778
           name: jolokia
@@ -55,35 +55,28 @@
       - name: secret-volume
         secret:
           defaultMode: 420
-          secretName: fca
+          secretName: hsv
   test: false
   triggers:
   - type: ConfigChange
   - imageChangeParams:
       containerNames:
-      - fca
+      - hsv
       from:
         kind: ImageStreamTag
-        name: fca:latest
+        name: hsv:latest
         namespace: myproject
     type: ImageChange
 status:
   availableReplicas: 0
   conditions:
-  - lastTransitionTime: 2017-10-06T18:33:40Z
-    lastUpdateTime: 2017-10-06T18:33:40Z
+  - lastTransitionTime: 2017-10-06T18:17:41Z
+    lastUpdateTime: 2017-10-06T18:17:41Z
     message: Deployment config does not have minimum availability.
     status: "False"
     type: Available
-  - lastTransitionTime: 2017-10-06T18:35:15Z
-    lastUpdateTime: 2017-10-06T18:35:15Z
-    message: 'ReplicationController "fca-1" is invalid: spec.template.spec.containers[0].image:
-      Required value'
-    reason: ReplicationControllerCreateError
-    status: "False"
-    type: Progressing
-  latestVersion: 1
-  observedGeneration: 2
+  latestVersion: 0
+  observedGeneration: 1
   replicas: 0
   unavailableReplicas: 0
   updatedReplicas: 0

The only substantial difference is that one is at latestVersion: 0 and the other at latestVersion: 1 (because there has already been a deployment attempt via the client library). Could it be that oc rollout behaves differently based on this field?
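A quick way to compare just that field on both DCs (sketch; the output matches the diff above):

$ oc get dc fca hsv -o jsonpath='{range .items[*]}{.metadata.name}: {.status.latestVersion}{"\n"}{end}'
fca: 1
hsv: 0
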

For completeness' sake, here are both DCs in full:

fca (bad)
apiVersion: v1
kind: DeploymentConfig
metadata:
  annotations:
    USERNAME: developer
  creationTimestamp: 2017-10-06T18:33:40Z
  generation: 2
  labels:
    syndesis.io/revision-id: "2"
  name: fca
  namespace: myproject
  resourceVersion: "7324"
  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/fca
  uid: dff2ee69-aac4-11e7-80dd-9a3aa9bfc000
spec:
  replicas: 1
  selector:
    integration: fca
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      creationTimestamp: null
      labels:
        integration: fca
    spec:
      containers:
      - image: ' '
        imagePullPolicy: Always
        name: fca
        ports:
        - containerPort: 8778
          name: jolokia
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /deployments/config
          name: secret-volume
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: secret-volume
        secret:
          defaultMode: 420
          secretName: fca
  test: false
  triggers:
  - type: ConfigChange
  - imageChangeParams:
      containerNames:
      - fca
      from:
        kind: ImageStreamTag
        name: fca:latest
        namespace: myproject
    type: ImageChange
status:
  availableReplicas: 0
  conditions:
  - lastTransitionTime: 2017-10-06T18:33:40Z
    lastUpdateTime: 2017-10-06T18:33:40Z
    message: Deployment config does not have minimum availability.
    status: "False"
    type: Available
  - lastTransitionTime: 2017-10-06T18:35:15Z
    lastUpdateTime: 2017-10-06T18:35:15Z
    message: 'ReplicationController "fca-1" is invalid: spec.template.spec.containers[0].image:
      Required value'
    reason: ReplicationControllerCreateError
    status: "False"
    type: Progressing
  latestVersion: 1
  observedGeneration: 2
  replicas: 0
  unavailableReplicas: 0
  updatedReplicas: 0
hsv (good)
apiVersion: v1
kind: DeploymentConfig
metadata:
  annotations:
    USERNAME: developer
  creationTimestamp: 2017-10-06T18:17:41Z
  generation: 1
  labels:
    syndesis.io/revision-id: "2"
  name: hsv
  namespace: myproject
  resourceVersion: "6928"
  selfLink: /oapi/v1/namespaces/myproject/deploymentconfigs/hsv
  uid: a4408707-aac2-11e7-80dd-9a3aa9bfc000
spec:
  replicas: 1
  selector:
    integration: hsv
  strategy:
    activeDeadlineSeconds: 21600
    resources: {}
    rollingParams:
      intervalSeconds: 1
      maxSurge: 25%
      maxUnavailable: 25%
      timeoutSeconds: 600
      updatePeriodSeconds: 1
    type: Rolling
  template:
    metadata:
      creationTimestamp: null
      labels:
        integration: hsv
    spec:
      containers:
      - image: ' '
        imagePullPolicy: Always
        name: hsv
        ports:
        - containerPort: 8778
          name: jolokia
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /deployments/config
          name: secret-volume
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - name: secret-volume
        secret:
          defaultMode: 420
          secretName: hsv
  test: false
  triggers:
  - type: ConfigChange
  - imageChangeParams:
      containerNames:
      - hsv
      from:
        kind: ImageStreamTag
        name: hsv:latest
        namespace: myproject
    type: ImageChange
status:
  availableReplicas: 0
  conditions:
  - lastTransitionTime: 2017-10-06T18:17:41Z
    lastUpdateTime: 2017-10-06T18:17:41Z
    message: Deployment config does not have minimum availability.
    status: "False"
    type: Available
  latestVersion: 0
  observedGeneration: 1
  replicas: 0
  unavailableReplicas: 0
  updatedReplicas: 0

rhuss (Contributor, Author) commented Oct 6, 2017

It would also help me to know what magic oc rollout does in addition to adding 1 to latestVersion. Does the client really dig into the ImageStream and update the image from the client side?
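In case it helps anyone checking this: running the command with a high client log level should dump the raw REST calls and show whether the image is resolved client-side before the DC update, e.g.

$ oc rollout latest fck --loglevel=8   # prints the requests and response bodies oc sends
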

rhuss (Contributor, Author) commented Oct 6, 2017

FYI, I think I can now mimic the behaviour of oc rollout on the client side, too, by just updating the image myself with the dockerImageReference from the ImageStreamTag: https://github.com/rhuss/ipaas-rest/blob/45eb7e9dfc39998b00fa828d2c4fabea45ffc8cb/openshift/src/main/java/io/syndesis/openshift/OpenShiftServiceImpl.java#L65-L88
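To verify that kind of client-side update took effect, something like this should print the resolved reference instead of the empty ' ' placeholder:

$ oc get dc fck -o jsonpath='{.spec.template.spec.containers[0].image}'
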

But still, I think that oc rollout should work consistently regardless of whether the latestVersion number is 0 or not. Also, increasing the latestVersion number, which triggers a deployment, should check for ImageChange triggers and do the same magic of updating the image: field.

tnozicka (Contributor) commented
Quickly looking at this, the error Couldn't deploy version 1: ReplicationController "fck-1" is invalid: spec.template.spec.containers[0].image: Required value should be fixed by #17539.

rhuss (Contributor, Author) commented Jan 29, 2018
Cool! @tnozicka, quick question: which version of Origin / OCP will have this fix?

tnozicka (Contributor) commented Feb 7, 2018

It will be part of 3.9.x

tnozicka (Contributor) commented Feb 8, 2018

Backport to 3.7: #18524

openshift-bot (Contributor) commented
Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label on May 9, 2018
openshift-bot (Contributor) commented
Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 8, 2018
openshift-bot (Contributor) commented
Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close
