Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce custom Grafana for openshift prometheus. #17037

Merged
merged 1 commit into from
Dec 7, 2017
Merged

Introduce custom Grafana for openshift prometheus. #17037

merged 1 commit into from
Dec 7, 2017

Conversation

mrsiano
Copy link

@mrsiano mrsiano commented Oct 25, 2017

A template creates a custom Grafana instance preconfigured to gather Prometheus openshift metrics.
It uses OAuth token to login openshift Prometheus.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 25, 2017
@danmcp danmcp removed their assignment Oct 25, 2017
@mrsiano
Copy link
Author

mrsiano commented Oct 25, 2017

@smarterclayton
Copy link
Contributor

/assign

2. ```oc new-app -f grafana-ocp.yaml```
3. ```oc expose svc grafana-ocp```
4. grab the grafana url ``` oc get route |awk 'NR==2 {print $2}' ```
5. grab the ocp token, from openshift master run: ```oc sa get-token management-admin -n management-infra```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command doesn't make sense. What is management-infra?

Did you create a user called management-infra and a namespace called management-infra? What about the grafana namespace?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremyeder

IIUC, the grafana server does not use access authentication.

This command is a way to get the admin token for the prometheus server, using the prom-proxy [1] authentication proxy.

note:
a. running a playbook using openshift-ansible will create this sa and namespace
b. when installing prometheus using the prometheus.yaml example (line 31 in reademe) it will use this token to authenticate.

[1] https://github.com/openshift/origin/blob/master/examples/prometheus/prometheus.yaml#L113

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s.

alternatively you can use kube-system namespace and the prometheus sa

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but this (using kube-system namespace) will not work with promethus install using openshift-ansible [1]

[1] https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_prometheus/defaults/main.yaml#L5

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will change that

- name: data
mountPath: "/root/go/src/github.com/grafana/grafana/data"
command:
- "./bin/grafana-server"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add a route to this template? Here is an example:
https://github.com/openshift/origin/blob/master/examples/prometheus/prometheus.yaml#L52

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note:

when implementing the route, note that here the grafana server run without tls, so outside port is 80 not 443 and you do not need the tls part

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s.

It will be nicer to run grafana with tls ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

@mrsiano mrsiano Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremyeder @yaacov
It dose not work, I tried to resolve that already thats why I use oc expose command, it needs further investigation, since configuring grafana to run with https[1] needs cert file, the key is not enough.

[1]
https://github.com/grafana/grafana/blob/master/conf/defaults.ini#L29

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's OK, let's handle TLS in a subsequent PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's OK, let's handle TLS in a subsequent PR.

Sounds right by me 👍

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1
So ill revert the https for now and will handle it later on by different patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeremyeder route is added 🍰 , PTAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. I am able to create dashboards now :-)

@openshift-ci-robot openshift-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 26, 2017
Note: make sure to have openshift prometheus deployed.
(https://github.com/openshift/origin/tree/master/examples/prometheus)

1. ```oc create project grafana```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is oc create namespace

oc create project does not work

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be:
oc new-project grafana

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes 👍 create namespace is nicer :-)
oc create namespace grafana

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix that. 10X


1. ```oc create project grafana```
2. ```oc new-app -f grafana-ocp.yaml```
3. ```oc expose svc grafana-ocp```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once the route will be added to the yml file, we will not need to expose the service any more

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mrsiano @jeremyeder

Lets submit issue

I understood that in this PR we will add route for http ( port 80, without tls )
and in a later PR update the grafana config file and update route for https ( port 443, with tls )

did I understand correctly ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

p.s

  1. sttring grafana for https is a setting in the config file [1] ( we have the key/cert from openshift )

[1] https://github.com/grafana/grafana/blob/master/conf/defaults.ini#L29

@mrsiano
Copy link
Author

mrsiano commented Oct 27, 2017 via email

@mrsiano
Copy link
Author

mrsiano commented Oct 28, 2017

/test extended_networking_minimal

@mrsiano
Copy link
Author

mrsiano commented Oct 28, 2017

/test extended_conformance_gce

@yaacov
Copy link
Member

yaacov commented Oct 29, 2017

LGTM 👍

@openshift-ci-robot
Copy link

@mrsiano: you cannot LGTM your own PR.

In response to this:

/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Author

@mrsiano mrsiano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jeremyeder please review

2. ```oc new-app -f grafana-ocp.yaml```
3. grab the grafana url ``` oc get route |awk 'NR==2 {print $2}' ```
4. grab the ocp token, from openshift master run: ```oc sa get-token prometheus -n kube-system```
5. browse to grafana via browser and add new prometheus DS to grafana.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • change DS to datasource (could be confused with DaemonSet).
  • mention that the data source URL is the prometheus route, and put an example URL in here

https://prometheus-kube-system.apps.example.com

- name: data
mountPath: "/root/go/src/github.com/grafana/grafana/data"
command:
- "./bin/grafana-server"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. I am able to create dashboards now :-)

@jeremyeder
Copy link
Contributor

Few small nits about the README but otherwise this LGTM for phase 1.

@@ -0,0 +1,35 @@
FROM centos:7
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this needs to be in origin? We're unlikely to build it, so it would be better as a standalone repo.

Copy link
Author

@mrsiano mrsiano Oct 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, ok, I've just move it, I did not see any reason to have it here either.
@jeremyeder what do you think ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically the Dockerfile? Fine to move. @mrsiano you can put it in https://github.com/openshift/svt/tree/master/dockerfiles

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. ```oc create namespace grafana```
2. ```oc new-app -f grafana-ocp.yaml```
3. grab the grafana url ``` oc get route |awk 'NR==2 {print $2}' ```
4. grab the ocp token, from openshift master run: ```oc sa get-token prometheus -n kube-system```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smarterclayton hi, do you have preference on how to get a token for the Prometheus auth proxy ?

Here we assume the namespace is kube-system, but if we install Prometheus using openshift-ansible [1] the namespace will be prometheus.

[1] https://github.com/openshift/openshift-ansible/blob/master/roles/openshift_prometheus/defaults/main.yaml#L5

cc @zgalor @cben @joelddiaz

Copy link
Author

@mrsiano mrsiano Oct 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yaacov
I've started to work on another patch, to automate the deployment process.

I use this command,
oc get route --all-namespaces |grep -i prometheus |awk '{print $3}'

let's merge this patch for initial stage, then lets work on other things like https, automation and so.
@smarterclayton

@mrsiano
Copy link
Author

mrsiano commented Oct 30, 2017 via email

@openshift-ci-robot openshift-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 30, 2017
@mrsiano
Copy link
Author

mrsiano commented Oct 30, 2017

@jeremyeder @smarterclayton
Iv'e update this patch to include only the ocp template for grafana, please review

Copy link
Contributor

@jeremyeder jeremyeder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@smarterclayton
Copy link
Contributor

Allowed as a tech preview example. I would expect in 3.8/3.9 for you to continue to refine this and deal with all questions related. We are not productizing this.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 7, 2017
@openshift-merge-robot openshift-merge-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 7, 2017
4. grab the ocp token, from openshift master run: ```oc sa get-token prometheus -n kube-system```
5. browse to grafana datasource's and add new prometheus datasource.
6. grab the prometheus url via ```oc get route -n kube-system prometheus |awk 'NR==2 {print $2}'``` and paste the prometheus url e.g https://prometheus-kube-system.apps.example.com
6. paste the token string at the token field.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're two items "6"

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, maybe the prometheus config in Grafana could be automated with a side-car container? Grafana has a REST API that we can use to add datasource. But this is only possible is prom URL and token can be found programmatically.

Just for information I did some POC for automatic grafana setup with prometheus in openshift, I don't know if it does have some interest for you: https://github.com/jotak/discomon
(The OS template: https://github.com/jotak/discomon/blob/master/prometheus-grafana-discovery.yml ). For POC I didn't care about securing access)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jotak yes we will drive grafana via api. #17114 is the start.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@jeremyeder
Copy link
Contributor

Per #17037 (comment) I thought this should have merged. The outstanding requested changes by @smarterclayton were made (we moved the dockerfile out to our own repo here https://github.com/openshift/svt/tree/master/dockerfiles/grafana-ocp)

Is there some incorrect flag/label to explain why it's stuck?

@smarterclayton smarterclayton added the kind/bug Categorizes issue or PR as related to a bug. label Nov 10, 2017
@dobbymoodge
Copy link
Contributor

/retest

@jeremyeder
Copy link
Contributor

Just looked at the detail links next to each of the failures. Can't see how they're elated to this PR.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@mrsiano
Copy link
Author

mrsiano commented Nov 11, 2017

/test extended_conformance_install_update

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

A template creates a custom Grafana instance preconfigured to gather Prometheus openshift metrics.
It is uses OAuth token to login openshift Prometheus.
@openshift-merge-robot openshift-merge-robot removed the lgtm Indicates that a PR is ready to be merged. label Nov 12, 2017
@mrsiano
Copy link
Author

mrsiano commented Nov 12, 2017

i always seeing this error message [ERROR] Can not determine install and upgrade version for the atomic-openshift-utils package by extended_conformance_install_update.

any idea ?!
@smarterclayton could add back again the lgtm flag, I've just update the readme typo.

@mrsiano
Copy link
Author

mrsiano commented Nov 14, 2017

/retest

@openshift-ci-robot
Copy link

openshift-ci-robot commented Nov 14, 2017

@mrsiano: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/openshift-jenkins/extended_conformance_install_update ca061f2 link /test extended_conformance_install_update

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@0xmichalis
Copy link
Contributor

/retest

@0xmichalis
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 7, 2017
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kargakis, mrsiano, smarterclayton

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@openshift-merge-robot
Copy link
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit 08a7f8d into openshift:master Dec 7, 2017
openshift-merge-robot added a commit that referenced this pull request Jan 28, 2018
Automatic merge from submit-queue (batch tested with PRs 17114, 18274).

Grafana template automation.

eventually this patch should be compatible and jenkins friendly.


#17037
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/bug Categorizes issue or PR as related to a bug. lgtm Indicates that a PR is ready to be merged. retest-not-required size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.