Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPFailover was broken for alpha.1 #9102

Merged
merged 2 commits into from
Jun 3, 2016

Conversation

smarterclayton
Copy link
Contributor

/var/lib/origin is now a docker volume, which results in the contents
being lost. Moved it to /var/lib/ipfailover, and added a slightly better
error message when no arguments are provided.

Fixes #9077

@ramr will repush this image as v1.3.0-alpha.1 afterwards

@smarterclayton
Copy link
Contributor Author

@sdodson @knobunc PTAL, this is p0 on alpha.1

@smarterclayton
Copy link
Contributor Author

[test]

@knobunc
Copy link
Contributor

knobunc commented Jun 1, 2016

Makes sense, and the change LGTM.

@sdodson
Copy link
Member

sdodson commented Jun 1, 2016

LGTM, this problem was introduced after 3.2 was branched FWIW.

/var/lib/origin is now a docker volume, which results in the contents
being lost. Moved it to /var/lib/ipfailover, and added a slightly better
error message when no arguments are provided.
@smarterclayton
Copy link
Contributor Author

Adding an e2e test to verify that ipfailover actually starts (which we never had before). We need an extended test for it at some point.

@knobunc
Copy link
Contributor

knobunc commented Jun 1, 2016

@smarterclayton Added the need for that test to our "moar tests" card. https://trello.com/c/GGqetI5f/65-epic-test-networking-integration-tests

@smarterclayton
Copy link
Contributor Author

[test]

@@ -36,7 +36,7 @@ function cleanup()
dump_container_logs

echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"
oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG-} > ${LOG_DIR}/export_all.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We prefer to use ${var:-default} because it's rare that someone wants the set-but-null behavior from ${var-default} and keeping that nuance in mind is hard.

Also, why do we expect this export to work if we don't have $ADMIN_KUBECONFIG? AFAICT, it is always set once a cluster has been successfully stood up. If one hasn't, would we expect oc export to work? If one has, would we expect the output of oc export to be useful without the admin credentials?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be called before ADMIN_KUBECONFIG is set up if the failure happens
early enough.

On Thu, Jun 2, 2016 at 9:02 AM, Steve Kuznetsov [email protected]
wrote:

In hack/test-end-to-end-docker.sh
#9102 (comment):

@@ -36,7 +36,7 @@ function cleanup()
dump_container_logs

echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"
  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG-} > ${LOG_DIR}/export_all.json

We prefer to use ${var:-default} because it's rare that someone wants the
set-but-null behavior from ${var-default} and keeping that nuance in mind
is hard.

Also, why do we expect this export to work if we don't have
$ADMIN_KUBECONFIG? AFAICT, it is always set once a cluster has been
successfully stood up. If one hasn't, would we expect oc export to work?
If one has, would we expect the output of oc export to be useful without
the admin credentials?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/9102/files/4297fea49f7f647fb18c8a8129f766080532216a#r65534428,
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p-HUTQrwzJsc6iz5tytGpsgIt0elks5qHtRmgaJpZM4IrHju
.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I understand this, so I'm asking in what case would that be valid? In that case would we a) expect the cluster to be up already and b) if so, do we expect to get any useful information from the cluster without admin role?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case we don't want export to fail the output, we just don't want
to log.

On Thu, Jun 2, 2016 at 9:40 AM, Clayton Coleman [email protected] wrote:

This can be called before ADMIN_KUBECONFIG is set up if the failure
happens early enough.

On Thu, Jun 2, 2016 at 9:02 AM, Steve Kuznetsov [email protected]
wrote:

In hack/test-end-to-end-docker.sh
#9102 (comment):

@@ -36,7 +36,7 @@ function cleanup()
dump_container_logs

echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"

  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG-} > ${LOG_DIR}/export_all.json

We prefer to use ${var:-default} because it's rare that someone wants
the set-but-null behavior from ${var-default} and keeping that nuance in
mind is hard.

Also, why do we expect this export to work if we don't have
$ADMIN_KUBECONFIG? AFAICT, it is always set once a cluster has been
successfully stood up. If one hasn't, would we expect oc export to work?
If one has, would we expect the output of oc export to be useful without
the admin credentials?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/9102/files/4297fea49f7f647fb18c8a8129f766080532216a#r65534428,
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p-HUTQrwzJsc6iz5tytGpsgIt0elks5qHtRmgaJpZM4IrHju
.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cleanup() has to assume all calls fail, and guard appropriately.

On Thu, Jun 2, 2016 at 9:42 AM, Clayton Coleman [email protected] wrote:

In which case we don't want export to fail the output, we just don't want
to log.

On Thu, Jun 2, 2016 at 9:40 AM, Clayton Coleman [email protected]
wrote:

This can be called before ADMIN_KUBECONFIG is set up if the failure
happens early enough.

On Thu, Jun 2, 2016 at 9:02 AM, Steve Kuznetsov <[email protected]

wrote:

In hack/test-end-to-end-docker.sh
#9102 (comment):

@@ -36,7 +36,7 @@ function cleanup()
dump_container_logs

echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"

  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
  • oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG-} > ${LOG_DIR}/export_all.json

We prefer to use ${var:-default} because it's rare that someone wants
the set-but-null behavior from ${var-default} and keeping that nuance
in mind is hard.

Also, why do we expect this export to work if we don't have
$ADMIN_KUBECONFIG? AFAICT, it is always set once a cluster has been
successfully stood up. If one hasn't, would we expect oc export to
work? If one has, would we expect the output of oc export to be useful
without the admin credentials?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/openshift/origin/pull/9102/files/4297fea49f7f647fb18c8a8129f766080532216a#r65534428,
or mute the thread
https://github.com/notifications/unsubscribe/ABG_p-HUTQrwzJsc6iz5tytGpsgIt0elks5qHtRmgaJpZM4IrHju
.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. I was hoping that the detection there was going to be more involved than set +e and ignoring the failure, but alright. Comment about syntax remains.

@@ -36,7 +36,7 @@ function cleanup()
dump_container_logs

echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"
oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG:-} > ${LOG_DIR}/export_all.json
Copy link
Contributor

@stevekuznetsov stevekuznetsov Jun 2, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still maintain that forcing this to fail the line but not the script by injecting the null flag value is pretty far from ideal. Just do

# we only dump resources from etcd if cluster is up; we can use presence of $ADMIN_KUBECONFIG as a proxy for
# detecting if the cluster got brought up correctly
if [[ -n "${ADMIN_KUBECONFIG:-}" ]]; then
    echo "[INFO] Dumping all resources to ${LOG_DIR}/export_all.json"
    oc export all --all-namespaces --raw -o json --config=${ADMIN_KUBECONFIG} > ${LOG_DIR}/export_all.json
fi

We should also make this change to all of the other cleanup functions we have floating around...

@smarterclayton smarterclayton force-pushed the fix_ipfailover branch 4 times, most recently from 5553767 to f7f8646 Compare June 2, 2016 20:44
@smarterclayton
Copy link
Contributor Author

Test is passing, [merge]

@openshift-bot
Copy link
Contributor

openshift-bot commented Jun 3, 2016

continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/4426/) (Image: devenv-rhel7_4311)

@openshift-bot
Copy link
Contributor

Evaluated for origin merge up to f7f8646

@smarterclayton
Copy link
Contributor Author

[test]

@openshift-bot
Copy link
Contributor

Evaluated for origin test up to f7f8646

@openshift-bot
Copy link
Contributor

continuous-integration/openshift-jenkins/test SUCCESS (https://ci.openshift.redhat.com/jenkins/job/test_pr_origin/4426/)

@openshift-bot openshift-bot merged commit 0ce9b1b into openshift:master Jun 3, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants