🌱 PartitionSet e2e #2642
Conversation
/hold
Force-pushed from 55d536b to ef9b368
Force-pushed from c587d0c to 4d12faa
/retest
@@ -35,6 +36,7 @@ func generatePartition(name string, matchExpressions []metav1.LabelSelectorRequi
	for _, label := range labels {
		pname = pname + "-" + strings.ToLower(matchLabels[label])
	}
	pname = pname[:min(validation.DNS1123SubdomainMaxLength-1, len(pname))]
so, there is a limit :)
please add a unit test
I don't think it is useful but I have added it.
What about collisions? This is why we use the hash in many other places.
GenerateName is used to avoid collisions. I was asked to have the name carry some way of identifying a partition; that information was previously stored in labels.
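A minimal sketch of the scheme described above, under hypothetical names (partitionObjectMeta is not from this PR): label values make the name identifiable, truncation keeps it within the DNS-1123 subdomain limit, and GenerateName delegates collision avoidance to the API server's random suffix.

	import (
		"strings"

		metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
		"k8s.io/apimachinery/pkg/util/validation"
	)

	// partitionObjectMeta is a hypothetical illustration, not the PR's code.
	// min is the small helper added later in this PR (a builtin since Go 1.21).
	func partitionObjectMeta(name string, labels []string, matchLabels map[string]string) metav1.ObjectMeta {
		pname := name
		for _, label := range labels {
			pname = pname + "-" + strings.ToLower(matchLabels[label])
		}
		// Keep headroom below the 253-character subdomain limit.
		pname = pname[:min(validation.DNS1123SubdomainMaxLength-1, len(pname))]
		// The API server appends a random suffix to GenerateName on creation,
		// so two Partitions built from the same label values do not collide.
		return metav1.ObjectMeta{GenerateName: pname + "-"}
	}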
@@ -89,10 +89,11 @@ func (c *controller) reconcile(ctx context.Context, partitionSet *topologyv1alph
	}

	var matchLabelsMap map[string]map[string]string
	dimensions := removeDuplicates(partitionSet.Spec.Dimensions)
should we put it into validation/CEL? That way we wouldn't get any duplicates, right?
@sttts wdyt?
There is no easy way of doing that with CEL. CEL is cautious with costly operations like list iteration, although it would be O(n) here.
The current implementation is tolerant to duplicates and still provides a predictable outcome that does not surprise the user.
limitations under the License.
*/

package partition
should we rename test/e2e/reconciler/partition/partitionset_test.go to test/e2e/reconciler/partitionset/partitionset_test.go?
I can do that
		}
		return false, spew.Sdump(partitionSet.Status.Conditions)
	}, wait.ForeverTestTimeout, 100*time.Millisecond, "expected valid partitionSet")
should we check that no Partition was created?
added
IMHO we should focus a bit more on the happy cases in e2e; exhaustive negative checking can be fragile, as we've learned, and requires private fixtures...
	partitions, err = partitionClient.Cluster(partitionClusterPath).List(ctx, metav1.ListOptions{})
	require.NoError(t, err, "error retrieving partitions")
	if len(partitions.Items) == 1 {
		return true, ""
should we examine Partition.Spec.Selector?
added
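For illustration, a hedged sketch of such a check; selectorMatches and the expected labels are hypothetical, the point being to assert on Spec.Selector rather than only on the partition count:

	import (
		"fmt"
		"reflect"

		metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	)

	// selectorMatches is a hypothetical helper: it compares a Partition's label
	// selector against the match labels the test expects and returns the
	// (ok, reason) pair that framework.Eventually consumes.
	func selectorMatches(selector *metav1.LabelSelector, expected map[string]string) (bool, string) {
		if selector == nil || !reflect.DeepEqual(selector.MatchLabels, expected) {
			return false, fmt.Sprintf("expected match labels %v, got selector %v", expected, selector)
		}
		return true, ""
	}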
	partitions, err = partitionClient.Cluster(partitionClusterPath).List(ctx, metav1.ListOptions{})
	require.NoError(t, err, "error retrieving partitions")
	if len(partitions.Items) == 2 {
		return true, ""
should we check Partition.Spec.Selector?
added
We need to validate that we get the right ones here.
		return false, fmt.Sprintf("expected 1 partition, but got %d", len(partitions.Items))
	}, wait.ForeverTestTimeout, 100*time.Millisecond, "expected 1 partition")

	// The following tests are focused on the admission
I'd move it to a new separate test that simply checks admission
I thought of that but did not do it due to the cost of running a private cluster.
	}
	_, err = partitionSetClient.Cluster(partitionClusterPath).Create(ctx, errorPartitionSet, metav1.CreateOptions{})
	require.Error(t, err, "error creating partitionSet expected")
should we check that a PartitionSet wasn't created?
No, this is the standard CEL validation mechanism: if the expression evaluates to false, an error is returned and the resource is not created.
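For illustration only, a sketch of how such a rule is typically declared (the rule below is made up; the actual expressions in this PR differ). A kubebuilder marker attaches the CEL expression to the CRD schema, and a false result rejects the request at admission with no side effect:

	// Hypothetical fragment of a PartitionSetSpec type; not the PR's rules.
	type PartitionSetSpec struct {
		// A rule evaluating to false makes the API server reject the
		// create/update, so the resource is never persisted.
		// +kubebuilder:validation:XValidation:rule="self.all(d, d.matches('^[a-z][a-z0-9-]*$'))",message="dimensions must be lower-case alphanumeric"
		Dimensions []string `json:"dimensions,omitempty"`
	}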
	_, err = partitionSetClient.Cluster(partitionClusterPath).Create(ctx, errorPartitionSet, metav1.CreateOptions{})
	require.Error(t, err, "error creating partitionSet expected")

	t.Logf("Character not allowed at first place in matchExpressions values")
Not sure if we need to check all incorrect values; I'd rely on k8s's validation.
From our perspective the most important thing is to make sure an invalid request will be rejected and won't have any side effect (e.g., deleting a partition).
There is no Kubernetes validation taking place for the admission. This is handled by the CEL expressions I crafted.
	_, err = partitionSetClient.Cluster(partitionClusterPath).Create(ctx, errorPartitionSet, metav1.CreateOptions{})
	require.Error(t, err, "error creating partitionSet expected")

	t.Logf("Partition name cut when the label values sum up")
I'm okay with checking it at the unit-test level.
You can't do that short of copying the CEL expressions directly into the unit test, which does not bring value in my opinion.
IMHO we should write these tests on the shared fixture. Generate your labels and shard names with random suffixes to allow these tests to co-exist even with other instances of the same test in parallel, and mark the shards you're creating as not ready for scheduling so as to not impact any other part of the system.
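A small sketch of that isolation, assuming apimachinery's rand helper (the names below are made up):

	import "k8s.io/apimachinery/pkg/util/rand"

	// Random suffixes let parallel instances of this test co-exist on the
	// shared fixture without fighting over shard names or label values.
	shardName := "partitionset-test-shard-" + rand.String(8)
	regionValue := "region-" + rand.String(8)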
@@ -215,3 +216,16 @@ func partition(shards []*corev1alpha1.Shard, dimensions []string, shardSelectorL
	}
	return matchLabelsMap
}

// removeDuplicates makes sure that a value is not found more than once in the slice.
func removeDuplicates(slice []string) []string {
sets.NewString(slice...).List()
changed
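Presumably the change is along these lines (a sketch assuming apimachinery's string sets); note that List() returns the values sorted, which also keeps the outcome order-independent:

	import "k8s.io/apimachinery/pkg/util/sets"

	// removeDuplicates makes sure that a value is not found more than once in
	// the slice; sets.NewString drops duplicates and List() returns the
	// remaining values in sorted order.
	func removeDuplicates(slice []string) []string {
		return sets.NewString(slice...).List()
	}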
@@ -48,3 +50,10 @@ func generatePartition(name string, matchExpressions []metav1.LabelSelectorRequi
		},
	}
}

func min(a, b int) int {
IMHO we don't need to accumulate this sort of logic. Just inline it.
Inlining is done by the compiler. I find this way more readable, and it seems that it is not an uncommon practice.
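For reference, the helper in question is the plain two-way integer minimum; since Go 1.21 a built-in min makes it redundant on newer toolchains:

	// min returns the smaller of two ints.
	func min(a, b int) int {
		if a < b {
			return a
		}
		return b
	}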
	t.Parallel()
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	server := framework.PrivateKcpServer(t)
Why?
	defer cancel()
	server := framework.PrivateKcpServer(t)

	// Create organization and workspace
Do we need an org anymore? Why?
We don't. It does not hurt either.
	shardClient := kcpClusterClient.CoreV1alpha1().Shards()
	_, err = shardClient.Cluster(core.RootCluster.Path()).Create(ctx, shard1a, metav1.CreateOptions{})
	require.NoError(t, err, "error creating shard")
	framework.Eventually(t, func() (bool, string) {
framework.EventuallyCondition
It did not exist when this test was written, and it does not allow checking the values of multiple conditions.
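A hedged sketch of the pattern used instead, polling once while inspecting several condition values; partitionSetName and the condition types are placeholders, and the conditions helper is assumed to be kcp's conditions util:

	framework.Eventually(t, func() (bool, string) {
		partitionSet, err := partitionSetClient.Cluster(partitionClusterPath).Get(ctx, partitionSetName, metav1.GetOptions{})
		require.NoError(t, err, "error retrieving partitionSet")
		// Check several conditions in one poll, which a one-condition
		// helper such as framework.EventuallyCondition cannot express.
		if conditions.IsTrue(partitionSet, "PartitionsReady") && conditions.IsTrue(partitionSet, "PartitionSetValid") {
			return true, ""
		}
		return false, spew.Sdump(partitionSet.Status.Conditions)
	}, wait.ForeverTestTimeout, 100*time.Millisecond, "expected partitionSet conditions to become true")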
		return false, fmt.Sprintf("expected 1 partition, but got %d", partitionSet.Status.Count)
	}, wait.ForeverTestTimeout, 100*time.Millisecond, "expected the partition count to be 1")
	partitionClient := kcpClusterClient.TopologyV1alpha1().Partitions()
	var partitions *topologyv1alpha1.PartitionList
Please continue to t.Logf() the steps.
	}
	shard2, err = shardClient.Cluster(core.RootCluster.Path()).Create(ctx, shard2, metav1.CreateOptions{})
	require.NoError(t, err, "error creating shard")
	framework.Eventually(t, func() (bool, string) {
framework.EventuallyCondition
I had proposed something similar in a past PR. I see you have relaunched a discussion in Slack on that. The change in approach is out of scope of this PR.
Force-pushed from 4d12faa to b2c6fee
Sorry, I don't think we need to check in more tests using private fixtures. We're already at critical mass with e2e flakiness, and adding much more load there is not reasonable. How hard is it to implement some "not-ready-for-scheduling" field on the shard? We're in alpha versions; we can iterate.
/hold
Force-pushed from b2c6fee to 194c1b7
Great! Give me a shout when you agree on something.
Sounds like a field.
FYI #2782 should unblock this PR.
Signed-off-by: Frederic Giloux <[email protected]>
Force-pushed from 57f2bbc to 41c2d83
This does not seem to be sufficient; I am pushing a commit to fix it. I am not sure that there are no other issues.
Force-pushed from 41c2d83 to 0e338c8
… by other e2e test Signed-off-by: Frederic Giloux <[email protected]>
Force-pushed from c5ad38b to da8db7a
Signed-off-by: Frederic Giloux <[email protected]>
Force-pushed from da8db7a to d6bdad4
/unhold
@p0lyn0mial @stevekuznetsov I have refactored the tests to use a shared server. I have also added a commit to patch some sensitivity to non-schedulable shards within the APIBinding tests.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: sttts.
Summary
This PR adds end-to-end tests for PartitionSet reconciliation.
Related issue(s)
Contributes to #2334