
origin:v3.7.2 origin-master-controllers HPA works for deployments but not deploymentconfigs #19045

Closed
mshutt opened this issue Mar 21, 2018 · 12 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/pod

Comments

@mshutt

mshutt commented Mar 21, 2018

HPA no longer functions correctly for DeploymentConfigs, still works for Deployments

Version

oc version

oc v3.7.2+26304a3-2
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server [redacted]
openshift v3.7.2+26304a3-2
kubernetes v1.7.6+a08f5eeb62

Steps To Reproduce
  1. Create an HPA against a DeploymentConfig:
# oc autoscale dc/friendlyhello --min 1 --max 10 --cpu-percent=5
  2. Monitor the HPA status with oc describe hpa/friendlyhello
Current Result
# oc describe hpa/friendlyhello
Name:							friendlyhello
Namespace:						test-dockerfile-build
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Tue, 20 Mar 2018 23:42:28 +0000
Reference:						DeploymentConfig/friendlyhello
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	<unknown> / 5%
Min replicas:						2
Max replicas:						10
Conditions:
  Type		Status	Reason		Message
  ----		------	------		-------
  AbleToScale	False	FailedGetScale	the HPA controller was unable to get the target's current scale: no kind "Scale" is registered for version "extensions/v1beta1"
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  18m		18m		2	horizontal-pod-autoscaler		Warning		FailedGetScale	no kind "Scale" is registered for version "extensions/v1beta1"
  16m		1m		31	horizontal-pod-autoscaler		Warning		FailedGetScale	no kind "Scale" is registered for version "extensions/v1beta1"
Expected Result
# oc describe hpa/friendlyhello
Name:							friendlyhello
Namespace:						test-dockerfile-build
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Fri, 26 Jan 2018 22:30:28 +0000
Reference:						DeploymentConfig/friendlyhello
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	0% (0) / 5%
Min replicas:						1
Max replicas:						10
Conditions:
  Type			Status	Reason			Message
  ----			------	------			-------
  AbleToScale		True	ReadyForNewScale	the last scale time was sufficiently old as to warrant a new scale
  ScalingActive		True	ValidMetricFound	the HPA was able to succesfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited	True	TooFewReplicas		the desired replica count was zero
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  5m		5m		1	horizontal-pod-autoscaler		Normal		SuccessfulRescale	New size: 4; reason: cpu resource utilization (percentage of request) above target
Additional Information

This works correctly for Deployments

Here is the output of LOGLEVEL=6 from origin-master-controllers while it is running the HPA:

Mar 20 23:42:28 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:28.299323       1 graph_builder.go:475] GraphBuilder process object: autoscaling/v1/HorizontalPodAutoscaler, namespace test-dockerfile-build, name friendlyhello, uid 597eecbb-2c98-11e8-8ef4-40a8f02674ac, event type add
<SNIP>
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:58.302131       1 round_trippers.go:405] GET https://r1009.assets.rivet.example.net/apis/apps.openshift.io/v1/namespaces/test-dockerfile-build/deploymentconfigs/friendlyhello/scale 200 OK in 2 milliseconds
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:58.306473       1 round_trippers.go:405] POST https://r1009.assets.rivet.example.net/api/v1/namespaces/test-dockerfile-build/events 201 Created in 3 milliseconds
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:58.306631       1 round_trippers.go:405] PUT https://r1009.assets.rivet.example.net/apis/autoscaling/v1/namespaces/test-dockerfile-build/horizontalpodautoscalers/friendlyhello/status 200 OK in 4 milliseconds
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:58.306638       1 graph_builder.go:475] GraphBuilder process object: autoscaling/v1/HorizontalPodAutoscaler, namespace test-dockerfile-build, name friendlyhello, uid 597eecbb-2c98-11e8-8ef4-40a8f02674ac, event type update
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: I0320 23:42:58.306728       1 horizontal.go:633] Successfully updated status for friendlyhello
Mar 20 23:42:58 r1009.assets.rivet.example.net origin-master-controllers[5957]: E0320 23:42:58.306840       1 horizontal.go:206] failed to query scale subresource for DeploymentConfig/test-dockerfile-build/friendlyhello: no kind "Scale" is registered for version "extensions/v1beta1"
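
The error in that last log line is a client-side decoding failure: the GET of the deploymentconfigs/.../scale subresource succeeds (200 OK two lines earlier), but the controller cannot decode the returned Scale object because its apiVersion is not registered with the client. A minimal illustrative sketch of that mechanism (plain Python, not the actual Go controller code; the registry contents are an assumption for the sketch):

```python
# Illustrative only: a tiny decoder registry mirroring how a Kubernetes
# client maps (apiVersion, kind) pairs to known types. The set of
# registered versions here is assumed, not taken from the 3.7.2 source.
registry = {
    ("apps/v1beta1", "Scale"): dict,    # Deployment scale payloads decode fine
    ("autoscaling/v1", "Scale"): dict,
}

def decode(payload):
    key = (payload["apiVersion"], payload["kind"])
    if key not in registry:
        # Same shape of error as the controller log above
        raise ValueError(
            f'no kind "{payload["kind"]}" is registered for version "{payload["apiVersion"]}"'
        )
    return registry[key](payload)

decode({"apiVersion": "apps/v1beta1", "kind": "Scale"})  # succeeds
try:
    decode({"apiVersion": "extensions/v1beta1", "kind": "Scale"})
except ValueError as err:
    print(err)
```

The HTTP request succeeding while decoding fails is why the round_trippers lines show 200 OK immediately before the FailedGetScale error.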

Here is a raw query of the scale subresource for the deployment:

# oc get --raw https://localhost/apis/apps/v1beta1/namespaces/test-dockerfile-build/deployments/busybox/scale | jq .
{
  "kind": "Scale",
  "apiVersion": "apps/v1beta1",
<SNIP>

And here is a raw query of the scale subresource for the deploymentconfig:

# oc get --raw https://r1009.assets.rivet.example.net/apis/apps.openshift.io/v1/namespaces/test-dockerfile-build/deploymentconfigs/friendlyhello/scale | jq .
{
  "kind": "Scale",
  "apiVersion": "extensions/v1beta1",

This seems somehow inversely related to:

https://bugzilla.redhat.com/show_bug.cgi?id=1549873
#17517

@liggitt - Any thoughts? I am not a go developer (yet), but I'll help any way that I can!

/label bug
/label question

@mshutt
Author

mshutt commented Mar 26, 2018

Heya!

Not sure if this is related, but:

# oc version
oc v3.7.2+5eda3fa-5
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://[redacted]:443
openshift v3.7.2+5eda3fa-5
kubernetes v1.7.6+a08f5eeb62
# oc get --raw https://localhost/apis/extensions/v1beta1 | jq . | egrep scale
      "name": "deployments/scale",
      "name": "replicasets/scale",
      "name": "replicationcontrollers/scale",
# oc get --raw https://localhost/apis/apps/v1beta1 | jq . | egrep scale
      "name": "deployments/scale",
# oc get --raw https://localhost/apis/apps.openshift.io/v1 | jq . | egrep scale
      "name": "deploymentconfigs/scale",
# oc get --raw https://localhost/apis/apps.openshift.io/v1/namespaces/test-dockerfile-build/deploymentconfigs/friendlyhello/scale | jq .  | egrep apiVersion
  "apiVersion": "extensions/v1beta1",

Thoughts?

@mshutt
Author

mshutt commented Mar 26, 2018

I've also tried changing the scaleTargetRef apiVersion in the HPA spec to extensions/v1beta1, then to apps/v1beta1, and finally to apps.openshift.io/v1 as it was in 3.7.0; as you can guess, every variant produced the same error.

@mshutt
Author

mshutt commented Mar 26, 2018

This is what oc autoscale creates:

    scaleTargetRef:
      apiVersion: v1
      kind: DeploymentConfig
      name: friendlyhello
    targetCPUUtilizationPercentage: 5
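
For context, a minimal sketch of the full HPA manifest that the snippet above sits in (standard autoscaling/v1 structure; the field values restate the thread, and minReplicas reflects the oc autoscale command rather than the later describe output):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: friendlyhello
  namespace: test-dockerfile-build
spec:
  scaleTargetRef:
    apiVersion: v1        # as written by oc autoscale; other values were tried to no avail
    kind: DeploymentConfig
    name: friendlyhello
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 5
```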

@jwforres
Member

@openshift/sig-pod

@jwforres jwforres added the kind/bug Categorizes issue or PR as related to a bug. label Mar 27, 2018
@jwforres
Member

I know we already resolved a number of bugs related to this; it might be fixed in master already.

@DirectXMan12
Contributor

Yes, please double-check in master, it should be fixed there.

@davidaah

davidaah commented Mar 27, 2018

@DirectXMan12 @jwforres could you clarify which fix in master you are referring to? Unfortunately, the similar change applied to master (#17587) was made some time ago, so much of the code has since been significantly refactored (especially in apiserver/apiserver.go).

As best I can tell, the change from 3.7.0 (where HPA worked) relates to a change in what the deploymentconfigs/scale subresource returns, introduced in 3.7.1/3.7.2 (#17517), but clarification from @liggitt would help if possible.

@sjenning sjenning removed their assignment Apr 12, 2018
@sjenning
Contributor

@mshutt could you verify this is fixed in 3.9?

@mshutt
Author

mshutt commented Apr 13, 2018

@sjenning We'll do the 3.9 upgrade in our lab as soon as possible, hopefully early next week. We're fully containerized, and I saw that another user tripped over the etcd upgrade issue with Origin vs. the paid bits (which I'd previously reported working around by setting openshift_etcd_upgrade: false and then re-running byo/config.yml after the upgrade plays, openshift/openshift-ansible#6931).

@liggitt
Contributor

liggitt commented Apr 19, 2018

This was related to the HPA using a faulty client. It is resolved in 3.9. A fix for 3.7 is in #19437, but I don't know whether more 3.7 releases are planned.

@mshutt
Author

mshutt commented May 3, 2018

@sjenning @liggitt 3.9 upgrade indeed has fixed this. Thank you all for your tireless efforts!

@liggitt
Contributor

liggitt commented May 24, 2018

Fixed in the release-3.7 branch in #19437.
