Adding synchronization and other features to extended test cluster loader. #17894

Merged

@jmencak jmencak commented Dec 20, 2017

Fixes:

  • number of templates no longer ignored
  • tuningsets no longer ignored for templates
  • maxRetries reflected the number of all tries, not retries

New features:

  • label-based post-deployment pod synchronization at the pod, template,
    or global level
  • creation of config maps and secrets from files
  • support for getting parameter values from the environment

@jmencak jmencak requested a review from sjug December 20, 2017 08:30
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 20, 2017
@jmencak
Contributor Author

jmencak commented Dec 20, 2017

@sjug creating this in parallel with #17072 for review purposes, while you're installing go ;-)

@danwinship danwinship removed their request for review December 20, 2017 14:20
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 21, 2017
@jmencak jmencak force-pushed the cluster-loader-sync branch from ade2888 to 3adc981 Compare January 2, 2018 08:25
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 2, 2018
@jmencak
Contributor Author

jmencak commented Jan 2, 2018

Flakes on:

  • "ci/openshift-jenkins/cmd"
    test/cmd/observe.sh:34: executing 'oc observe services --once --all-namespaces -a "bad{ .metadata.annotations.unset }key" --strict-templates' expecting failure and text 'annotations is not found': the output content test failed
  • ci/openshift-jenkins/extended_conformance_install
    rm: cannot remove ‘/tmp/etcd/member’: Permission denied

@jmencak jmencak force-pushed the cluster-loader-sync branch 2 times, most recently from 29914ac to ca85550 Compare January 2, 2018 11:16
@jmencak
Contributor Author

jmencak commented Jan 2, 2018

Flakes on:

  • ci/openshift-jenkins/extended_conformance_gce
    rm: cannot remove ‘/tmp/etcd’: Operation not permitted
  • ci/openshift-jenkins/integration
    error retrieving resource lock kube-system/kube-controller-manager: Get https://127.0.0.1:21143/api/v1/namespaces/kube-system/configmaps/kube-controller-manager: dial tcp 127.0.0.1:21143: getsockopt: connection refused

@jmencak
Contributor Author

jmencak commented Jan 3, 2018

@sjug PTAL

Contributor

@sjug sjug left a comment

A few changes, looking good overall.

Number int `mapstructure:"num"`
Basename string
Tuning string
Configmaps map[string]interface{}
Contributor

nit, missing a space?

Contributor Author

Not sure where; the file is formatted by gofmt, and I also ran go tool vet.

Contributor

The columns should align, not sure why you're not getting proper output.

Contributor Author

@jmencak jmencak Jan 5, 2018

Which columns do not align? I do not see any misalignment. Again, the code is formatted: "gofmt -s -d context.go" produces no diff. Please be specific.

Contributor

The one I've tagged, line 26?
Configmaps is aligned, but map[string]interface{} is not.

Contributor

It's just on github... ffs

Contributor Author

Great, was getting desperate...

@@ -33,9 +36,22 @@ type ClusterLoaderObjectType struct {
Image string
Basename string
File string
Sync SyncObjectType
Contributor

Why is SyncObjectType nested in two structs?

Contributor Author

In order to perform synchronization at both the template/pod level and the cluster-loader-wide level.

Contributor

We need to discuss this further.

Contributor

Not currently used but may be helpful in the future.

// SyncObjectType is nested object type for cluster loader synchronisation functionality
type SyncObjectType struct {
Server struct {
Enable bool
Contributor

nit, should probably be Enabled

if v != 0 && v != "" {
args = append(args, "-p")
args = append(args, fmt.Sprintf("%s=%v", k, v))
val := v
Contributor

I think I understand what you're trying to do here, but it's not necessary as k & v are already new variables.

// Parameter not defined, see if it is defined in the environment.
var found bool
val, found = os.LookupEnv(fmt.Sprintf("%s", k))
if found == false {
Contributor

found is boolean so we don't need to explicitly compare it.

if !found {
    ...

// Create secrets
if p.Secrets != nil {
// Secrets defined
for k, v := range p.Secrets {
Contributor

This block looks really similar to the configmap version L87-91 😄
Can we make it more generic too?

return args, err
}

func getSecretArgs(k string, v interface{}) (args []string, err error) {
Contributor

nit, very similar to getConfigMapArgs.

@@ -41,12 +48,58 @@ func ParsePods(jsonFile string) (configStruct kapiv1.Pod) {
return
}

// Wait for pods to go into Running or "not Running" state
func SyncPods(c kclientset.Interface, ns string, selectors map[string]string, timeout time.Duration, negate bool) (err error) {
Contributor

@sjug sjug Jan 3, 2018

This has already been implemented a bunch of times in the e2e framework (example). Do we want another version here to maintain?

Contributor Author

So how do you implement waiting for a pod to go into anything but "Running" state using the code you pointed out?

Contributor

Usually we would expect a specific state.

Contributor Author

@jmencak jmencak Jan 5, 2018

I do not see why. Waiting for "not Running" rather than for a specific "Complete/Error/..." state is a valid reason for synchronization.

edit: On second thought, waiting for Complete + setting a timeout for complete might actually be a better idea. Will rewrite then.

}

return err
Contributor

Shouldn't this be return nil? Otherwise, which error is this?

}
}
}

if sync.Running {
for _, ns := range namespaces {
err := SyncRunningPods(c, ns, sync.Selectors, time.Duration(sync.Timeout)*time.Second)
Contributor

time.Duration(sync.Timeout)*time.Second

This kind of pattern has happened a lot in here, I suggest using time.ParseDuration rather than hardcoding. This is something I've done in the perf-tests cluster loader version.

@jmencak jmencak force-pushed the cluster-loader-sync branch from ca85550 to 71f7984 Compare January 4, 2018 12:55
@jmencak
Contributor Author

jmencak commented Jan 4, 2018

@sjug please check that all your concerns have been addressed.

-// Server is the webservice that will syncronize the start and stop of Pods
-func Server(c *PodCount) error {
+// Server is the webservice that will synchronize the start and stop of Pods
+func Server(c *PodCount, Port int, awaitShutdown bool) error {
Contributor

Port should not be capitalized.

http.HandleFunc("/start", handleStart(startHandler, c))
http.HandleFunc("/stop", handleStop(stopHandler, c))
if Port <= 0 {
Contributor

@sjug sjug Jan 5, 2018

How could Port be < 0?

Contributor Author

Perhaps because it is a signed int and user defined it as negative? Most code I've seen in origin uses int for ports.

Contributor

I'd rather it just throw an error in that edge case, but this will be fine, maybe add a log message?
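A sketch that combines both options from this thread: reject a negative (or out-of-range) port with an error, and log when falling back to a default for an unset port. The helper `listenAddr` and the `defaultPort` value are assumptions, not values from the PR.

```go
package main

import (
	"fmt"
	"log"
)

// defaultPort is a hypothetical default; the PR's actual value is not shown.
const defaultPort = 9090

// listenAddr is a hypothetical helper. It validates a user-supplied port and
// returns the address string to listen on.
func listenAddr(port int) (string, error) {
	if port < 0 || port > 65535 {
		return "", fmt.Errorf("invalid port %d: must be between 0 and 65535", port)
	}
	if port == 0 {
		log.Printf("no port configured, using default %d", defaultPort)
		port = defaultPort
	}
	return fmt.Sprintf(":%d", port), nil
}

func main() {
	for _, p := range []int{8080, 0, -1} {
		addr, err := listenAddr(p)
		fmt.Println(addr, err)
	}
}
```

Because the port is a signed int, a negative value can come from user configuration, so the check is cheap insurance either way.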

maxRetries = 4

// Poll every two seconds
Poll = 2 * time.Second
Contributor

Not sure we need to export this?

@jmencak jmencak force-pushed the cluster-loader-sync branch 5 times, most recently from de27297 to e4d5665 Compare January 5, 2018 12:21
@jmencak jmencak force-pushed the cluster-loader-sync branch from e4d5665 to dd2366b Compare January 5, 2018 13:43
@jmencak
Contributor Author

jmencak commented Jan 5, 2018

@sjug AFAIK all requested changes have been addressed.

@sjug
Contributor

sjug commented Jan 5, 2018

/LGTM @jmencak

@jmencak
Contributor Author

jmencak commented Jan 6, 2018

/assign @stevekuznetsov

@knobunc
Contributor

knobunc commented Jan 8, 2018

/lgtm
@rajatchopra PTAL

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 8, 2018
@knobunc
Contributor

knobunc commented Jan 8, 2018

/approve

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jmencak, knobunc, sjug

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@jmencak
Contributor Author

jmencak commented Jan 9, 2018

Flake on ci/openshift-jenkins/extended_conformance_gce "dial tcp 10.142.0.5:10250: getsockopt: connection refused"

@jmencak
Contributor Author

jmencak commented Jan 9, 2018

/test extended_conformance_gce

@openshift-merge-robot
Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

@openshift-merge-robot
Contributor

Automatic merge from submit-queue.

@openshift-merge-robot openshift-merge-robot merged commit 887de1b into openshift:master Jan 9, 2018
ahardin-rh pushed a commit to ahardin-rh/openshift-docs that referenced this pull request Jan 24, 2018
Documenting synchronization primitives and the possibility
to create ConfigMaps and Secrets from files as implemented by
openshift/origin#17894

(cherry picked from commit c85b8a9) xref:openshift#7297
@jmencak jmencak deleted the cluster-loader-sync branch July 11, 2019 09:21