[Feature:Prometheus][Conformance] Prometheus when installed to the cluster should start and expose a secured proxy and unsecured metrics [Suite:openshift/conformance/parallel] 1m51s #17901

Closed
tnozicka opened this issue Dec 20, 2017 · 14 comments
Labels: component/metrics, kind/test-flake, lifecycle/stale, priority/P1

Comments

@tnozicka
Contributor

https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17827/test_pull_request_origin_extended_conformance_gce/13358/

/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/prometheus/prometheus.go:43
Did not find tsdb_samples_appended_total, tsdb_head_samples_appended_total, or prometheus_tsdb_head_samples_appended_total in:
map[string]*io_prometheus_client.MetricFamily(nil),
Expected
    <bool>: false
to be true
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/test/extended/prometheus/prometheus.go:83
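
For readers unfamiliar with the assertion, here is a minimal sketch of the kind of check that fails above: collect the metric names from a scrape and pass if any one of the three candidate names is present. This is not the code from prometheus.go. The helpers `metricNames` and `hasAny` are hypothetical, the sample input is made up, and a simplified line scanner over the text exposition format stands in for the parsed `map[string]*io_prometheus_client.MetricFamily` that the real test inspects.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// metricNames collects the metric names found in Prometheus text exposition
// output. Hypothetical helper: a simplified line scanner, not a full parser.
func metricNames(exposition string) map[string]bool {
	names := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and # HELP / # TYPE comments
		}
		// The name ends at the first '{' (label set) or whitespace (value).
		end := strings.IndexAny(line, "{ \t")
		if end < 0 {
			end = len(line)
		}
		names[line[:end]] = true
	}
	return names
}

// hasAny reports whether at least one candidate name was scraped.
func hasAny(names map[string]bool, candidates ...string) bool {
	for _, c := range candidates {
		if names[c] {
			return true
		}
	}
	return false
}

func main() {
	candidates := []string{
		"tsdb_samples_appended_total",
		"tsdb_head_samples_appended_total",
		"prometheus_tsdb_head_samples_appended_total",
	}

	// Made-up sample scrape, for illustration only.
	healthy := "# TYPE prometheus_tsdb_head_samples_appended_total counter\n" +
		"prometheus_tsdb_head_samples_appended_total 1234\n"

	fmt.Println(hasAny(metricNames(healthy), candidates...)) // true
	fmt.Println(hasAny(metricNames(""), candidates...))      // false
}
```

An empty scrape yields an empty set, so the check fails for every candidate at once. That is consistent with the nil map in the message above: the metrics endpoint returned nothing, rather than one metric having been renamed.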
tnozicka added the kind/test-flake and priority/P1 labels Dec 20, 2017
@mfojtik
Contributor

mfojtik commented Jan 2, 2018

@bparees the second failure in the linked test report seems to be build related

@bparees
Contributor

bparees commented Jan 2, 2018

> @bparees the second failure in the linked test report seems to be build related

  1. Separate flake issues should be opened for separate failures. Otherwise we're just polluting the discussion.

  2. If you're referring to this: StdErr: "Error from server (Conflict): Operation cannot be fulfilled on clusterrolebindings.authorization.openshift.io \"system:build-strategy-custom\": the object has been modified; please apply your changes to the latest version and try again", there's already an issue for it: [Feature:Builds] forcePull should affect pulling builder images ForcePull test case execution docker #17596

@simonpasquier
Contributor

I had a look at the logs and it seems that the Prometheus pod never reaches the ready state (the curl exit code is 7, which means "Failed to connect to host"). So I'd say the error is not in Prometheus itself but rather that Kubernetes is unable to get the pod running (it is scheduled but stays Pending)?

From https://openshift-gce-devel.appspot.com/build/origin-ci-test/pr-logs/pull/17726/test_pull_request_origin_extended_conformance_gce/13541/

Jan  2 12:00:10.427: INFO: unable to get unsecured metrics: host command failed
[...]
Jan  2 12:01:39.828: INFO: unable to get unsecured metrics: host command failed
[...]
Jan  2 12:01:39.949: INFO: prometheus-0   ci-prtest-5a37c28-13541-ig-n-8l85  Pending           [{Initialized True 0001-01-01 00:00:00 +0000 UTC 2018-01-02 11:59:46 +0000 UTC  } {Ready False 0001-01-01 00:00:00 +0000 UTC 2018-01-02 11:59:46 +0000 UTC ContainersNotReady containers with unready status: [prom-proxy prometheus alerts-proxy alert-buffer alertmanager]} {PodScheduled True 0001-01-01 00:00:00 +0000 UTC 2018-01-02 11:59:46 +0000 UTC  }]
[...]
Jan  2 12:01:40.176: INFO: prometheus-0 started at 2018-01-02 11:59:46 +0000 UTC (0+5 container statuses recorded)
Jan  2 12:01:40.176: INFO: 	Container alert-buffer ready: false, restart count 0
Jan  2 12:01:40.176: INFO: 	Container alertmanager ready: false, restart count 0
Jan  2 12:01:40.176: INFO: 	Container alerts-proxy ready: false, restart count 0
Jan  2 12:01:40.176: INFO: 	Container prom-proxy ready: false, restart count 0
Jan  2 12:01:40.176: INFO: 	Container prometheus ready: false, restart count 0

@theute

theute commented Feb 1, 2018

@oourfali it seems to be an installer issue rather than an issue in Prometheus itself?

@zgalor
Contributor

zgalor commented Feb 1, 2018

@tnozicka was Prometheus installed using openshift-ansible?
Which inventory file was used?

@tnozicka
Contributor Author

tnozicka commented Feb 1, 2018

It is installed as part of the test:

ns, host, bearerToken, statsPort = bringUpPrometheusFromTemplate(oc)

func bringUpPrometheusFromTemplate(oc *exutil.CLI) (ns, host, bearerToken string, statsPort int) {

theute removed their assignment Feb 8, 2018
@jcantrill
Contributor

/assign @zgalor
/unassign @jcantrill

@zgalor
Contributor

zgalor commented Feb 12, 2018

@smarterclayton is this also caused by #18317, like #17529?

@smarterclayton
Contributor

smarterclayton commented Feb 12, 2018 via email

@zgalor
Contributor

zgalor commented Mar 5, 2018

@tnozicka the bug related to this, https://bugzilla.redhat.com/show_bug.cgi?id=1539987, has been fixed and verified. Can this issue be resolved?

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label Jun 3, 2018
tnozicka closed this as completed Jun 6, 2018