re-enable router integration tests #17755

Closed
deads2k opened this issue Dec 13, 2017 · 17 comments
Labels
area/techdebt component/routing lifecycle/rotten priority/P0

Comments

@deads2k
Contributor

deads2k commented Dec 13, 2017

They required a golang 1.9 node and were failing without it, so they were disabled in the rebase here: https://github.com/openshift/origin/blob/master/Makefile#L188 .

Specifically, we no longer run COVERAGE_SPEC=' ' DETECT_RACES='false' TIMEOUT='10m' hack/test-go.sh ./test/end-to-end

@openshift/sig-networking

@stevekuznetsov
Contributor

Can we determine today:

  • is it possible to run these tests without direct access to the Docker daemon?
  • is it possible to run these tests using DIND?
  • can we allocate the bandwidth to get these tests into a container right now?

This is going to become more and more of a pain point the longer we wait. We've been asking for containerized tests here and for the registry (/cc @bparees @mfojtik @dmage @miminar) for a while.

@dmage
Contributor

dmage commented Dec 14, 2017

@stevekuznetsov you are talking about test/end-to-end/core.sh, right? Can we split it into smaller pieces and run each piece with its own master API and DinD? Though, to run DinD, we still need direct access to the Docker daemon.

@stevekuznetsov
Contributor

No, I mean the integration suite that was written for the registry and uses the Docker socket: https://github.com/openshift/image-registry/blob/master/pkg/testframework/master.go#L60-L67

These router tests and those registry tests need to be written to work like e2e -- assume an OpenShift cluster is running, deploy your application under test into the cluster using the Go client, test it, etc.
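
To make the e2e-style shape concrete, here is a minimal sketch in Go. It assumes a hypothetical `Cluster` interface standing in for a client to an already-running cluster; `Cluster`, `fakeCluster`, and `RunE2E` are invented for illustration and are not part of client-go or the origin test framework. The in-memory fake only exists so the sketch runs without a cluster.

```go
package main

import (
	"errors"
	"fmt"
)

// Cluster is a HYPOTHETICAL stand-in for a client to an already-running
// OpenShift cluster. It is not a real client-go or origin API.
type Cluster interface {
	Deploy(namespace, name string) error
	Ready(namespace, name string) bool
	Delete(namespace, name string) error
}

// fakeCluster is an in-memory implementation so the sketch runs anywhere.
type fakeCluster struct {
	apps map[string]bool
}

func newFakeCluster() *fakeCluster { return &fakeCluster{apps: map[string]bool{}} }

func key(ns, name string) string { return ns + "/" + name }

func (c *fakeCluster) Deploy(ns, name string) error {
	if c.apps[key(ns, name)] {
		return errors.New("already deployed: " + key(ns, name))
	}
	c.apps[key(ns, name)] = true
	return nil
}

func (c *fakeCluster) Ready(ns, name string) bool { return c.apps[key(ns, name)] }

func (c *fakeCluster) Delete(ns, name string) error {
	delete(c.apps, key(ns, name))
	return nil
}

// RunE2E deploys the component under test into an existing cluster,
// checks it, and always removes it again. It never starts a cluster itself.
func RunE2E(c Cluster, ns, name string) (err error) {
	if err = c.Deploy(ns, name); err != nil {
		return err
	}
	defer func() {
		// Report a cleanup failure only if the test itself succeeded.
		if derr := c.Delete(ns, name); derr != nil && err == nil {
			err = derr
		}
	}()
	if !c.Ready(ns, name) {
		return fmt.Errorf("%s/%s never became ready", ns, name)
	}
	return nil
}

func main() {
	c := newFakeCluster()
	fmt.Println("error:", RunE2E(c, "e2e-router", "router"))
	fmt.Println("left behind:", len(c.apps))
}
```

The point of the shape is that the suite receives a cluster rather than creating one, so it does not need the Docker socket on the test host.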

The mess of test/end-to-end should most likely be rethought and potentially not live in Bash, but for the most part it already just uses the normal oc client to do its job.

@dmage
Contributor

dmage commented Dec 14, 2017

@stevekuznetsov for a while? Those tests were written a month ago, and they are written so that they can be executed in parallel. We have global OpenShift resources called images, and we care a lot about the openshift.io/image.managed annotation; to prevent interference from other tests, some tests need a separate instance of OpenShift. We also have the stop-the-world pruning routine: its test must inspect all images and image streams in all namespaces, so it requires a separate instance of OpenShift as well.

So, as long as we have to deal with global objects in our tests, we cannot use a single OpenShift instance.

@stevekuznetsov
Contributor

> Those tests were written a month ago

I may be misremembering, but when we tried to put integration tests into a container in January or so, there were router and registry tests that needed Docker. If I'm not remembering that correctly, my apologies.

> So, as long as we have to deal with global objects in our tests, we cannot use a single OpenShift instance.

How difficult would it be to make your tests non-disruptive and clean up after themselves? I believe the contract with e.g. test-cmd suites or the normal Ginkgo e2es is similar -- keep track of your global state mutation and clean it up. I wouldn't necessarily ask for you to run your tests in parallel, but your test suite not being in the business of starting OpenShift would be nice.
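
As a rough illustration of the "keep track of your global state mutation and clean it up" contract, here is a minimal Go sketch. The `Cleanup` type and `SetAnnotation` helper are hypothetical, a plain map stands in for annotations on a shared object, and the `example.test/owner` key is invented. It shows only the bookkeeping, not whether the registry tests could actually adopt it.

```go
package main

import "fmt"

// Cleanup records undo functions for mutations of shared state.
// HYPOTHETICAL helper for illustration; not part of any OpenShift test library.
type Cleanup struct {
	undo []func()
}

// Track registers fn to run when the test finishes.
func (c *Cleanup) Track(fn func()) { c.undo = append(c.undo, fn) }

// Run executes the undo functions in reverse order of registration,
// so later mutations are rolled back before earlier ones.
func (c *Cleanup) Run() {
	for i := len(c.undo) - 1; i >= 0; i-- {
		c.undo[i]()
	}
	c.undo = nil
}

// SetAnnotation mutates a shared annotation map and records how to restore it.
func SetAnnotation(c *Cleanup, annotations map[string]string, k, v string) {
	old, had := annotations[k]
	annotations[k] = v
	c.Track(func() {
		if had {
			annotations[k] = old
		} else {
			delete(annotations, k)
		}
	})
}

func main() {
	// The map stands in for annotations on a global object.
	shared := map[string]string{"openshift.io/image.managed": "true"}

	var c Cleanup
	SetAnnotation(&c, shared, "openshift.io/image.managed", "false")
	SetAnnotation(&c, shared, "example.test/owner", "router-suite") // invented key

	fmt.Println("entries during test:", len(shared))
	c.Run()
	fmt.Println("after cleanup:", shared)
}
```

Undoing in reverse order matters when one mutation depends on an earlier one, which is the same reason `defer` in Go runs last-in, first-out.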

@bparees
Contributor

bparees commented Dec 14, 2017

> How difficult would it be to make your tests non-disruptive and clean up after themselves?

Basically impossible.

They could run serially to possibly deal with the pruning challenges (except that @smarterclayton yells at us every time we create a serial test), but they still need unique configuration for the registry, so at a minimum I think they'd need to delete the deployed registry and recreate it. There may be master configuration required as well, at which point there's no solution but to start a new cluster for the test.

@stevekuznetsov
Contributor

Are they mostly testing through the Origin API or the registry API? Can we deploy multiple registries, or are they all under the assumption they're a singleton in the cluster?

@dmage
Contributor

dmage commented Dec 14, 2017

They test through both APIs; Origin is the configuration store for the registry. Deploying multiple registries is how you scale: they should all serve the same content (if they are started with the proper storage backend). You cannot have two integrated registries that serve different images.

@deads2k deads2k mentioned this issue Dec 14, 2017
25 tasks
@stevekuznetsov
Contributor

Are the tests serial right now? I understand your point about serial tests, @bparees, but if it doesn't change the status quo and saves us from having to deploy a separate cluster per test to parallelize it ...

@dmage
Contributor

dmage commented Dec 15, 2017

The integration tests in the image-registry repository are parallel.

@bparees
Contributor

bparees commented Dec 15, 2017

But they're able to run in parallel because they each have their own registry and cluster (effectively) right? If they ran the way Steve wants, they'd need to be serial to avoid interfering with each other. (among other changes that would be required to manage the unique configuration each test requires)

@dmage
Contributor

dmage commented Dec 15, 2017

Right.
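
The trade-off agreed on here can be sketched in Go. This is an illustration under stated assumptions, not the image-registry test framework: `Instance` is a hypothetical stand-in for one registry plus its cluster, and the configuration names are invented. With one instance per test, nothing is shared and the tests can run concurrently; with a single shared instance, every test has to reconfigure it first, so a lock effectively makes the suite serial.

```go
package main

import (
	"fmt"
	"sync"
)

// Instance is a HYPOTHETICAL stand-in for one registry plus its cluster.
type Instance struct {
	config string
}

// runIsolated gives every test its own Instance, so all tests may run at once.
func runIsolated(configs []string) []string {
	results := make([]string, len(configs))
	var wg sync.WaitGroup
	for i, cfg := range configs {
		wg.Add(1)
		go func(i int, cfg string) {
			defer wg.Done()
			inst := &Instance{config: cfg} // private to this test
			results[i] = inst.config
		}(i, cfg)
	}
	wg.Wait()
	return results
}

// runShared reuses one Instance. Each test must reconfigure it before use,
// so the mutex serializes the tests even though they start as goroutines.
func runShared(shared *Instance, configs []string) []string {
	results := make([]string, len(configs))
	var mu sync.Mutex
	var wg sync.WaitGroup
	for i, cfg := range configs {
		wg.Add(1)
		go func(i int, cfg string) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			shared.config = cfg // stands in for "delete and recreate the registry"
			results[i] = shared.config
		}(i, cfg)
	}
	wg.Wait()
	return results
}

func main() {
	// Invented configuration names, for illustration only.
	cfgs := []string{"quota-on", "quota-off", "pullthrough"}
	fmt.Println(runIsolated(cfgs))
	fmt.Println(runShared(&Instance{}, cfgs))
}
```

Both variants produce the same results; the difference is that the shared variant holds the lock for the full duration of each test, which is where the wall-clock cost of serial execution comes from.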

@smarterclayton
Contributor

Did these get reenabled? @deads2k @knobunc ?

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Apr 24, 2018
@stevekuznetsov
Contributor

@deads2k @knobunc

@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added lifecycle/rotten and removed lifecycle/stale labels May 25, 2018
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close


8 participants