-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement a way to time out tokens based on (in)activity #17640
Conversation
/hold |
990e955
to
7026f96
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is the diffs I used:
http://pastebin.test.redhat.com/538893 (for everything but the main timeout validator file)
https://www.diffchecker.com/ln16YiA7 (for the validator file)
AccessTokenTimeoutSeconds: &invalid, | ||
GrantMethod: oauthapi.GrantHandlerAuto, | ||
}) | ||
if err == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for IsInvalid
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -17,6 +17,7 @@ import ( | |||
) | |||
|
|||
const MinTokenLength = 32 | |||
const MinFlushTimeout = 150 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am actually going to change flush timeout handling a little, and will put here docs on what it means
@@ -81,6 +82,11 @@ func ValidateAccessTokenUpdate(newToken, oldToken *oauthapi.OAuthAccessToken) fi | |||
allErrs := validation.ValidateObjectMetaUpdate(&newToken.ObjectMeta, &oldToken.ObjectMeta, field.NewPath("metadata")) | |||
copied := *oldToken | |||
copied.ObjectMeta = newToken.ObjectMeta | |||
// allow only TimeoutsIn to be changed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of my validation additions are missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I think I have it now
timeoutValidator := authnregistry.NewTimeoutValidator(accessTokenGetter, oauthClientLister, *config.OAuthConfig.TokenConfig.AccessTokenTimeoutSeconds) | ||
validators = append(validators, timeoutValidator) | ||
postStartHooks["openshift.io-TokenTimeoutUpdater"] = func(context genericapiserver.PostStartHookContext) error { | ||
go timeoutValidator.Start(context.StopCh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really like that we have to export the type to call this, but we seem to do this in other places as well so I suppose its fine (granted controllers seem to have a Run
method instead of a Start
).
I think we want a defer utilruntime.HandleCrash()
in the hook though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, changed Start() -> Run() and added a HandleCrash call in it
} | ||
} | ||
|
||
glog.Infof("Flushed %d tokens out of %d in bucket", flushedTokens, totalTokens) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
V(5)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
// make sure to kill the timer when we exit | ||
defer closeTimer() | ||
|
||
updateTimerAndFlush := func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can reset the timer to save resources and this way the timer is always stooped even when we receive from the channel (I could not tell from the docs if that was required).
resetTimerAndFlush := func() {
closeTimer()
nextTimer.Reset(a.flushTimeout)
nextTimeout = time.Now().Add(a.flushTimeout)
a.flush(nextTimeout.Add(a.safetyMargin))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, done
delta := a.clientTimeout(td.token.ClientName) | ||
newTimeout := int32(0) | ||
if delta > 0 { | ||
newTimeout = int32((td.seen.Sub(td.token.CreationTimestamp.Time) + delta) / time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some comments explaining this math which I think are useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
if oauthClient.AccessTokenTimeoutSeconds == nil { | ||
return a.defaultTimeout | ||
} | ||
a.updateTimeouts(oauthClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I had some extra glogs in this func and some other places in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not import all of them as some things simply changed, any specific thing you want to generate logs out of ? You can describe the thought process about what you think should be logged too.
switch { | ||
case err == nil: | ||
flushedTokens++ | ||
case apierrors.IsConflict(err): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine the two error cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
glog.V(5).Infof("Token timeout for user=%q client=%q scopes=%v was not updated: %v", | ||
td.token.UserName, td.token.ClientName, td.token.Scopes, err) | ||
} else { | ||
flushedTokens++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also count total failures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any token not flushed is a failure, do we need to be very explicit about it ?
Ok I should have addressed all comments but the clock one. |
Finally #17662 got in and I was able to properly generate the internal API changes, code has been updated |
"github.com/openshift/origin/pkg/user/apis/user" | ||
) | ||
|
||
var ErrTimedout = errors.New("token timed out") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I believe this can be made private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -102,6 +103,16 @@ func ValidateOAuthConfig(config *api.OAuthConfig, fldPath *field.Path) Validatio | |||
))) | |||
} | |||
|
|||
if config.TokenConfig.AccessTokenTimeoutSeconds != nil { | |||
if *config.TokenConfig.AccessTokenTimeoutSeconds < oauthvalidation.ReferenceTimeoutInterval { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*config.TokenConfig.AccessTokenTimeoutSeconds == 0 is valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
// to timeout earlier than they should. 0 is the exception, as this value | ||
// indicates that the token does not expire, and it is an allowed new | ||
// value. | ||
if newToken.TimeoutsIn > 0 && newToken.TimeoutsIn < oldToken.TimeoutsIn { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some related validation in pkg/oauth/registry/oauthaccesstoken/strategy.go
that went with this as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I saw that, but given we can update the field with any (valid) value, I do not see much value in adding this kind of validation just at token creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I suppose that is fine.
type tokenData struct { | ||
token *oauth.OAuthAccessToken | ||
seen time.Time | ||
} | ||
|
||
func (a *tokenData) timeout() time.Time { | ||
func (a tokenData) timeout() time.Time { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to use *tokenData to avoid copying the struct unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was doing that before, but when I added RankedSet the compiler complained, changing it to not be apointer made it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right because you will have to update everything to use *tokenData
otherwise it will not implement the interface.
@@ -102,6 +103,17 @@ func ValidateOAuthConfig(config *api.OAuthConfig, fldPath *field.Path) Validatio | |||
))) | |||
} | |||
|
|||
if config.TokenConfig.AccessTokenTimeoutSeconds != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if timeout := config.TokenConfig.AccessTokenTimeoutSeconds; timeout != nil && *timeout != 0 {
is a bit easier to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
fmt.Sprintf("cannot beless than the current value=%d", oldToken.TimeoutsIn))) | ||
} | ||
// negative values are not allowed either | ||
if newToken.TimeoutsIn < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want this on create as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed
@@ -139,6 +165,15 @@ func ValidateClient(client *oauthapi.OAuthClient) field.ErrorList { | |||
allErrs = append(allErrs, ValidateScopeRestriction(restriction, field.NewPath("scopeRestrictions").Index(i))...) | |||
} | |||
|
|||
if client.AccessTokenTimeoutSeconds != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Want negative check here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And same checks on update too I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah about negative, update automatically checks for anything doen at create
@simo5 please add unit tests for all the validation bits, should be easy to extend/copy existing tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think we want to log the error in these places since it is likely to leak the token value.
case err == nil: | ||
flushedTokens++ | ||
case apierrors.IsConflict(err) || apierrors.IsServerTimeout(err): | ||
glog.V(5).Infof("Token update deferred for token=%q, retriable error: %v", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Err actually I see you are logging the token name, do not do that xD
td.token.Name, err) | ||
retryList = append(retryList, *td) | ||
default: | ||
glog.V(5).Infof("Token timeout for user=%q client=%q scopes=%v was not updated: %v", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no token name here
for _, td := range retryList { | ||
err := a.update(td) | ||
if err != nil { | ||
glog.V(5).Infof("Token timeout for user=%q client=%q scopes=%v was not updated: %v", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
neither here, is the err what you want me drop ? (doing that anyway)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly nits
|
||
var errTimedout = errors.New("token timed out") | ||
|
||
// Implements rankedset.Item |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: type assertion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return nil | ||
} | ||
|
||
td := tokenData{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't you just use a pointer the whole time? The channel having tokenData and the set having *tokenData is weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whatever :)
if *timeout != 0 && *timeout < oauthvalidation.ReferenceTimeoutInterval { | ||
validationResults.AddErrors(field.Invalid( | ||
fldPath.Child("tokenConfig", "accessTokenTimeoutSeconds"), | ||
*config.TokenConfig.AccessTokenInactivityTimeoutSeconds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use *timeout on this line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
pkg/oauth/apiserver/auth.go
Outdated
@@ -99,7 +99,11 @@ func (c *OAuthServerConfig) WithOAuth(handler http.Handler, requestContextMapper | |||
glog.Fatal(err) | |||
} | |||
|
|||
storage := registrystorage.New(c.ExtraOAuthConfig.OAuthAccessTokenClient, c.ExtraOAuthConfig.OAuthAuthorizeTokenClient, combinedOAuthClientGetter, registry.NewUserConversion()) | |||
tokentimeout := int32(0) | |||
if c.ExtraOAuthConfig.Options.TokenConfig.AccessTokenInactivityTimeoutSeconds != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: if timeout := c.ExtraOAuthConfig.Options.TokenConfig.AccessTokenInactivityTimeoutSeconds; timeout != nil
would make it so you do not have to repeat that whole thing on the next line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -20,6 +20,9 @@ import ( | |||
) | |||
|
|||
var timeout = flag.Duration("sub.timeout", 5*time.Minute, "Specify the timeout for each sub test") | |||
var timeoutException = map[string]*time.Duration{ | |||
"TestOAuthTimeout": flag.Duration("oauth.timeout", 15*time.Minute, "Timeout for the OAuth test"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just make this a map[string]string, don't need the flag / pointer stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
easier this way for me
// Use the server and CA info | ||
anonConfig := restclient.AnonymousClientConfig(clientConfig) | ||
|
||
min := int32(150) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh isn't this too small per validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it passes tests ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hint the min timeout is reference timeout / 3 ie 100 seconds is the current absolute minimum
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
glog comments.
} | ||
} | ||
|
||
glog.V(5).Infof("Successfully flushed %d tokens out of %d", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want a.data.Len() as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, I rally only want to show what we tried to flush, and what succeeded, data.Len() is all the stuff in the queue
err := a.update(*td) | ||
switch { | ||
case err == nil: | ||
flushedTokens++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A glog here may be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why ?
glog.V(5).Infof("Token timeout for user=%q client=%q scopes=%v was not updated: %v", | ||
td.token.UserName, td.token.ClientName, td.token.Scopes, err) | ||
} else { | ||
flushedTokens++ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A glog here may be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no
return a.defaultTimeout | ||
} | ||
if oauthClient.AccessTokenInactivityTimeoutSeconds == nil { | ||
return a.defaultTimeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A glog here may be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why ?
if oauthClient.AccessTokenInactivityTimeoutSeconds == nil { | ||
return a.defaultTimeout | ||
} | ||
return timeoutAsDuration(*oauthClient.AccessTokenInactivityTimeoutSeconds) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A glog here may be useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh ?
@@ -79,8 +94,24 @@ func ValidateAccessToken(accessToken *oauthapi.OAuthAccessToken) field.ErrorList | |||
|
|||
func ValidateAccessTokenUpdate(newToken, oldToken *oauthapi.OAuthAccessToken) field.ErrorList { | |||
allErrs := validation.ValidateObjectMetaUpdate(&newToken.ObjectMeta, &oldToken.ObjectMeta, field.NewPath("metadata")) | |||
// since TimeoutsIn can be concurrently updated by multipe masters we do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should error if oldToken.TimeoutsIn == 0 && newToken.TimeoutsIn != 0
to prevent a token that was never supposed to timeout from being updated to timeout. Probably needs a comment describing how this race could happen between multiple masters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
6cd0dc2
to
c8badfe
Compare
flake: #17974 |
flake: #17731 |
flake: #16484 |
/retest |
/lgtm Manually approving the apps changes since they are generated code. David did not seem too concerned. |
[APPROVALNOTIFIER] This PR is APPROVED Approval requirements bypassed by manually added approval. This pull-request has been approved by: enj, simo5 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
/retest |
4 similar comments
/retest |
/retest |
/retest |
/retest |
flake: #17985 |
/retest |
1 similar comment
/retest |
flake: #16994 |
/retest |
flake: #17697 |
/retest |
1 similar comment
/retest |
/retest Please review the full test history for this PR and help us cut down flakes. |
/retest |
flake: #15630 |
@simo5: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest |
/test all [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue. |
When OAuthClient is configure with accessTokenTimeoutSeconds then tokens obtained with the specific client get a TimeoutsIn field that marks when the token is to be considered timed out.
The timeout is in seconds since token CreationTimestamp
As the token is used it is pushed into a bucket which is regularly flushed (for now).
This replaces #17161 which github inexplicably and unilaterally decided to close after a wrong push (can't be reopened)