Register SDN informers synchronously #15354

Merged: 2 commits into openshift:release-3.6 on Jul 21, 2017

liggitt
Contributor

liggitt commented Jul 19, 2017

Pick of #15353 and more contained version of #15364

The SDN controller was registering shared informer event handlers in a goroutine, so registration raced with informer start. If registration lost the race, the SDN event handlers would never receive namespace events.
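The ordering problem described above can be illustrated with a small, self-contained sketch. This is a toy model only: `toyInformer`, `startSDN`, and `startSDNLate` are hypothetical names, not the client-go or origin APIs, and the toy informer is assumed to ignore handlers added after `Start` so that it reproduces the symptom described in this PR. The losing side of the race is modelled deterministically by calling `Start` first.

```go
package main

import (
	"fmt"
	"sync"
)

// toyInformer is a hypothetical stand-in for a shared informer. It is NOT the
// client-go API: it snapshots its handler list when Start is called, so a
// handler added afterwards never sees events (the symptom described above).
type toyInformer struct {
	mu       sync.Mutex
	handlers []func(namespace string)
	started  bool
}

// AddEventHandler registers fn. It reports whether registration happened
// before Start, i.e. whether fn will actually receive events.
func (i *toyInformer) AddEventHandler(fn func(namespace string)) bool {
	i.mu.Lock()
	defer i.mu.Unlock()
	if i.started {
		return false
	}
	i.handlers = append(i.handlers, fn)
	return true
}

// Start freezes the handler list and delivers one event per namespace.
func (i *toyInformer) Start(namespaces []string) {
	i.mu.Lock()
	i.started = true
	handlers := append([]func(string){}, i.handlers...)
	i.mu.Unlock()
	for _, ns := range namespaces {
		for _, h := range handlers {
			h(ns)
		}
	}
}

// startSDN registers the handler synchronously, then starts the informer,
// and returns the namespaces the handler observed.
func startSDN(namespaces []string) []string {
	inf := &toyInformer{}
	var seen []string
	inf.AddEventHandler(func(ns string) { seen = append(seen, ns) })
	inf.Start(namespaces)
	return seen
}

// startSDNLate models the losing side of the race deterministically:
// the informer starts before the handler is registered.
func startSDNLate(namespaces []string) []string {
	inf := &toyInformer{}
	var seen []string
	inf.Start(namespaces)
	inf.AddEventHandler(func(ns string) { seen = append(seen, ns) })
	return seen
}

func main() {
	ns := []string{"default", "kube-system"}
	fmt.Println("registered before start:", startSDN(ns))
	fmt.Println("registered after start: ", startSDNLate(ns))
}
```

In this model, only the handler registered before `Start` observes the namespaces, which is why doing the registration synchronously removes the race.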

@openshift-bot
Contributor

openshift-bot commented Jul 19, 2017

continuous-integration/openshift-jenkins/merge Waiting: You are in the build queue at position: 14

@openshift-bot
Contributor

Evaluated for origin merge up to 07c92dd

@openshift-bot
Contributor

[Test]ing while waiting on the merge queue

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

continuous-integration/openshift-jenkins/test FAILURE (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3303/) (Base Commit: 12575c5) (PR Branch Commit: 07c92dd)

[test]

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

I'm seeing a consistent failure in the ansible install at the "Reconcile Cluster Roles and Cluster Role Bindings and Security Context Constraints." step:

Output to stderr:
The connection to the server ip-172-18-10-18.ec2.internal:8443 was refused - did you specify the right host or port?
[the line above repeats many times]
Error from server (Forbidden): User "system:admin" cannot get clusterroles at the cluster scope

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

The all-in-one master is dying with: failed to start SDN plugin controller: User "system:serviceaccount:openshift-infra:sdn-controller" cannot get clusternetworks at the cluster scope

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

Looks like all the controller roles moved out of the block that auto-reconciles them on server start. That means the Ansible reconciliation races the controllers, which kill the process if they keep hitting permission errors for long enough.

openshift deleted a comment from deads2k on Jul 20, 2017
liggitt added this to the 3.6.0 milestone on Jul 20, 2017
@openshift-bot
Contributor

Evaluated for origin test up to 04fdc79

@openshift-bot
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pull_request_origin/3338/) (Base Commit: 4c2392b) (PR Branch Commit: 04fdc79)

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

@deads2k PTAL at the second commit for 3.6

@deads2k
Contributor

deads2k commented Jul 20, 2017

lgtm

@liggitt
Contributor Author

liggitt commented Jul 20, 2017

[merge]

@liggitt
Contributor Author

liggitt commented Jul 21, 2017

Tests are green on HEAD^, and this contains the fix for a flake that other merge jobs are hitting, so merging.

liggitt merged commit 787f4e2 into openshift:release-3.6 on Jul 21, 2017