Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(misconf): Add scanner option to avoid Rego initialization #8376

Open
simar7 opened this issue Feb 8, 2025 · 8 comments
Open

feat(misconf): Add scanner option to avoid Rego initialization #8376

simar7 opened this issue Feb 8, 2025 · 8 comments
Assignees
Labels
scan/misconfiguration Issues relating to misconfiguration scanning

Comments

@simar7
Copy link
Member

simar7 commented Feb 8, 2025

Currently initialization of a scanner looks as follows:

        so := p.scannerOptions(checksDir, dataPaths, dataFS, hasChecks)
	scanner := kubernetes.NewScanner(so...)
	scanResult, err := scanner.ScanFS(ctx, memfs, inputFolder)
	if err != nil {
		return nil, err
	}

Today we don't provide a way for the user of the scanner to avoid a re-initialization of all the checks. While this isn't necessarily a problem per-se, if used incorrectly, can lead to scanners initializing the checks on each call. Initialization of checks is expensive and does not need to be performed each time but instead only once.

An example of an incorrect usage is here https://github.com/aquasecurity/trivy-operator/blob/f6d43a84892a0ccc56119a8eedd39ec0cfd9539d/pkg/policy/policy.go#L226-L231

Having a scanner option that can skip the initialization of checks can greatly speed up such call sites where the usage of the scanner is incorrectly configured, without having to change the call site architecture. The iac pkg can implement the logic to avoid another initialization if the option is passed and is set to true.

@simar7 simar7 added the scan/misconfiguration Issues relating to misconfiguration scanning label Feb 8, 2025
@simar7 simar7 added this to the v0.60.0 milestone Feb 8, 2025
@simar7
Copy link
Member Author

simar7 commented Feb 8, 2025

A related issue to this aquasecurity/trivy-operator#2427 where wrongly initializing the scanner can lead to increased memory pressure.

@itaysk
Copy link
Contributor

itaysk commented Feb 9, 2025

If I understand correctly (I probably didn't), the scanner in question is the iac/rego scanner and the "initialization" is loading the rego rules? if so, it appears to be a sync.Once hence wouldn't re-initialize even if caller tried to?

var LoadAndRegister = sync.OnceFunc(func() {

@nikpivkin
Copy link
Contributor

nikpivkin commented Feb 10, 2025

Actually, the performance issue is due to compiling all the checks (built-in and custom) when scanning each K8s resource. Each scan loads external policies based on resource kind, preventing them from being compiled once before scanning the entire cluster. In addition, OPA does not support incremental updating and deletion of policies.

One solution is to change the internal implementation of the Rego scanner to support multiple compilers. This would allow us to pre-load and compile all of Trivy's built-in checks (over 400), and then dynamically add and remove additional checks while scanning a particular resource.

I haven't looked deeply into the trivy-operator code, but there may be alternative ways to solve the problem there.

Example of use:

// Load built-in checks before scanning the cluster
rs := rego.NewScanner(opts)
rs.LoadPolicies(fsys)

...
// Eval

// Load checks based on kind resource
resourceKind := resource.GetObjectKind().GroupVersionKind().Kind
policies, _ := p.loadPolicies(resourceKind)

// Create a file system in memory and load external checks
memfs := memoryfs.New()
reatePolicyInputFS(memfs, policiesFolder, policies, regoExt)

// Load external checks and create a runner
runnerIdx, _ := rs.LoadPolicies(fsys)
// Remove the runner with external checks after scanning the resource
defer rs.RemoveRunner(runnerIdx)

// Pass Rego scanner to kubernetes scanner
so = appned(so, options.WithRegoScanner(rs))
scanner := kubernetes.NewScanner(so...)
scanResult, err := scanner.ScanFS(ctx, memfs, inputFolder)

@simar7
Copy link
Member Author

simar7 commented Feb 10, 2025

A related issue to this aquasecurity/trivy-operator#2427 where wrongly initializing the scanner can lead to increased memory pressure.

As I mentioned here, this is a problem in trivy-operator due to bad design of initializing the scanner. There's a few ways to solve it, one being changing the design of the operator (initializing once per runtime vs what it does today, which is initializing once per workload). Another option is to do what's proposed here, without changing the operator design. The latter option is takes less work as we don't have to rearchitect the design of the operator and can also be used any other users of the iac pkg.

If I understand correctly (I probably didn't), the scanner in question is the iac/rego scanner and the "initialization" is loading the rego rules? if so, it appears to be a sync.Once hence wouldn't re-initialize even if caller tried to?

The operator initializes a new scanner per workload and doesn't share them. This is problematic as we have to initialize a new scanner per workload, which is very expensive. What you are referencing is what happens if the same scanner was tried to initialize more than once. In this case, sync.Once does what it's supposed to and avoids any additional initializations.

@itaysk
Copy link
Contributor

itaysk commented Feb 10, 2025

(initializing once per runtime vs what it does today, which is initializing once per workload). Another option is to do what's proposed here, without changing the operator design.

your suggestion is for the client (operator) to initialize the scanner without reloading checks right? In this case, when/how the checks are getting loaded? I presume the client would need to make the scanner load checks beforhand? If that's so is it not the same as "initializing once per runtime"?

OPA does not support incremental updating and deletion of policies

I think it's fine to restart the scanner whenever there's a checks bundle update

@itaysk
Copy link
Contributor

itaysk commented Feb 10, 2025

If that's so is it not the same as "initializing once per runtime"?

we had a call, to summarize, your suggestion is to keep calling the newscanner function on every workload, but internally it will reuses the same scanner instance vs move newscanner call to once per runtime, and instead use eval on every workload.

I think the latter (newscanner once per runtime) is better, as it doesn't change the newsacnner function behavior in a way that doesn't exactly align with it's name, or with other sacanners behavior. I don't know about the cost of this change but if it's alot of work, and you prefer the former easier option, I can propose a slight variation: Instead of changing the scanner implementaion, move the change to the operator side - replace the call to newscanner with client-side wrapper, which itself calls newscanner if needed while maintaining the scanner singleton in the operator. this way the fix is contained to the consumer.

@nikpivkin
Copy link
Contributor

@simar7 trivy-operator loads checks based on resource kind: https://github.com/aquasecurity/trivy-operator/blob/f6d43a84892a0ccc56119a8eedd39ec0cfd9539d/pkg/policy/policy.go#L197-L198. Is it possible to extract this from the Eval function and load the checks only once?

@simar7
Copy link
Member Author

simar7 commented Feb 11, 2025

@simar7 trivy-operator loads checks based on resource kind: https://github.com/aquasecurity/trivy-operator/blob/f6d43a84892a0ccc56119a8eedd39ec0cfd9539d/pkg/policy/policy.go#L197-L198. Is it possible to extract this from the Eval function and load the checks only once?

Yes that's what we discussed. I'm already working on it aquasecurity/trivy-operator@main...perf/init-scanner-once

@simar7 simar7 removed this from the v0.60.0 milestone Feb 18, 2025
@simar7 simar7 removed this from Trivy Roadmap Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scan/misconfiguration Issues relating to misconfiguration scanning
Projects
None yet
Development

No branches or pull requests

3 participants