ACM-15078: Fixing agents reuse problem in z/VM after reclaim #817

Merged
merged 3 commits into from
Oct 31, 2024

Conversation

veera-damisetti
Contributor

  • Updated bootLoaderConfigTemplateS390x to add ai.ip_cfg_override, which takes care of the IP and nameserver parameters. This parameter is mandatory in z/VM because the MAC address is not persistent.
  • Updated the parameters passed to zipl to include parameters from /proc/cmdline, which helps in reusing the agents after reclaim.

…meserver persistent for agents

Signed-off-by: DAMISETTI-VEERABHADRARAO <[email protected]>
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 28, 2024
@openshift-ci-robot

openshift-ci-robot commented Oct 28, 2024

@veera-damisetti: This pull request references ACM-15078 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.18.0" version, but no target version was set.


@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 28, 2024

openshift-ci bot commented Oct 28, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 28, 2024
@veera-damisetti veera-damisetti marked this pull request as ready for review October 28, 2024 04:54
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 28, 2024
@openshift-ci openshift-ci bot requested review from avishayt and jhernand October 28, 2024 04:54

codecov bot commented Oct 28, 2024

Codecov Report

Attention: Patch coverage is 33.33333% with 24 lines in your changes missing coverage. Please review.

Project coverage is 59.50%. Comparing base (1c209d5) to head (478b87e).
Report is 1 commits behind head on master.

Files with missing lines Patch % Lines
src/commands/actions/reboot_for_reclaim.go 33.33% 24 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##           master     #817      +/-   ##
==========================================
- Coverage   59.70%   59.50%   -0.21%     
==========================================
  Files          74       74              
  Lines        3809     3842      +33     
==========================================
+ Hits         2274     2286      +12     
- Misses       1367     1388      +21     
  Partials      168      168              
Files with missing lines Coverage Δ
...rc/commands/actions/download_boot_artifacts_cmd.go 19.35% <ø> (ø)
src/commands/actions/reboot_for_reclaim.go 30.00% <33.33%> (+7.77%) ⬆️

Contributor

@CrystalChun CrystalChun left a comment

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 28, 2024
@CrystalChun
Contributor

Would be great to see how and why these additional parameters are needed specifically for Z in order for the hosts to be reusable! Maybe a doc or blog post would be the route to show everyone?

@veera-damisetti
Contributor Author

veera-damisetti commented Oct 28, 2024

Sure @CrystalChun, thanks for the review.

To reuse the hosts, they must be booted with the desired network and storage configuration.

In the case of Z (z/VM):

  • The IP and nameserver must be added to the cmdline parameters. NMState cannot be used because the MAC address is not persistent in z/VM.
  • The network configuration must be included in the cmdline parameters using rd.znet and ip, which let us select the network cards to use for connectivity.
  • Similarly, the storage configuration must be given with rd.zfcp, rd.dasd and zfcp.allow_lun_scan, depending on the type of disk we want to configure for z/VM.

HCP IBM Z doc for z/VM, for cmdline parameter reference: https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.11/html/clusters/cluster_mce_overview#hosted-bare-metal-adding-agents-ibmz-zvm-lpar

RH doc explaining each parameter in more detail: https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/installation_guide/chap-installer-booting-ipl-s390#chap-installer-booting-ipl-s390
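
As an illustration of the parameters listed in the comment above, the following is a minimal Go sketch that assembles a kernel command-line fragment from them. The type `zvmBootParams`, its method `cmdline`, and every address, device number and interface name are hypothetical placeholders chosen for the example; none of them come from this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// zvmBootParams is a hypothetical container for the settings that the
// comment above says must travel on the kernel command line for z/VM.
type zvmBootParams struct {
	IP           string   // value for ip=
	Nameserver   string   // value for nameserver=
	Znet         string   // value for rd.znet=
	Dasd         []string // values for rd.dasd= (DASD disks)
	Zfcp         []string // values for rd.zfcp= (FCP-attached disks)
	AllowLunScan string   // value for zfcp.allow_lun_scan=
}

// cmdline renders only the parameters that are set, in a fixed order.
func (p zvmBootParams) cmdline() string {
	var parts []string
	add := func(key, value string) {
		if value != "" {
			parts = append(parts, key+"="+value)
		}
	}
	add("ip", p.IP)
	add("nameserver", p.Nameserver)
	add("rd.znet", p.Znet)
	for _, d := range p.Dasd {
		add("rd.dasd", d)
	}
	for _, z := range p.Zfcp {
		add("rd.zfcp", z)
	}
	add("zfcp.allow_lun_scan", p.AllowLunScan)
	return strings.Join(parts, " ")
}

func main() {
	// All values below are placeholders for illustration only.
	p := zvmBootParams{
		IP:         "192.0.2.10::192.0.2.1:255.255.255.0:agent0:encbdf0:none",
		Nameserver: "192.0.2.53",
		Znet:       "qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1",
		Dasd:       []string{"0.0.0100"},
	}
	fmt.Println(p.cmdline())
}
```

A host using FCP-attached storage instead of DASD would presumably fill `Zfcp` and `AllowLunScan` and leave `Dasd` empty; the linked documents describe the exact value formats.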

@veera-damisetti
Contributor Author

/assign @eifrach

Comment on lines 64 to 74
for _, param := range paramsToExtract {
	regex := regexp.MustCompile(fmt.Sprintf(`\b%s=([^\s]+)`, param))
	match := regex.FindStringSubmatch(stdout)
	if len(match) > 1 {
		cmdline_params[param] = match[1]
	}
}

for key, value := range cmdline_params {
	requiredCmdline += fmt.Sprintf("%s=%s ", key, value)
}
Contributor

I think we should create a function for this and a unit test.

Once we start using regex, it can break with future changes. A separate function would also make this more readable.
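
One possible shape for such a function is sketched below. The name `extractCmdlineParams` and its signature are hypothetical; the function actually merged in this PR is not shown in the conversation. The sketch also deviates from the reviewed snippet in three ways, all of which are assumptions rather than the author's choices: it escapes the parameter name with `regexp.QuoteMeta` so the dot in names like rd.znet is matched literally, it requires the name to start the string or follow whitespace instead of relying on `\b`, and it emits parameters in slice order instead of iterating over a map, whose order Go does not guarantee.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// extractCmdlineParams (hypothetical helper) returns the "key=value" pairs
// from cmdline whose keys appear in paramsToExtract, joined by spaces.
// Parameters that are absent from cmdline are skipped.
func extractCmdlineParams(cmdline string, paramsToExtract []string) string {
	var kept []string
	for _, param := range paramsToExtract {
		// QuoteMeta keeps characters such as '.' literal in the pattern.
		re := regexp.MustCompile(`(?:^|\s)` + regexp.QuoteMeta(param) + `=(\S+)`)
		if m := re.FindStringSubmatch(cmdline); m != nil {
			kept = append(kept, param+"="+m[1])
		}
	}
	return strings.Join(kept, " ")
}

func main() {
	// Placeholder contents standing in for /proc/cmdline.
	cmdline := "rd.neednet=1 ip=192.0.2.10::192.0.2.1:255.255.255.0:agent0:encbdf0:none " +
		"nameserver=192.0.2.53 rd.dasd=0.0.0100 random.trust_cpu=on"
	fmt.Println(extractCmdlineParams(cmdline, []string{"ip", "nameserver", "rd.znet", "rd.dasd"}))
}
```

Because the helper is a pure function of its two arguments, a table-driven unit test can cover present, absent and partially matching parameters without touching the host.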

Contributor Author

Sure @eifrach
Do you want me to move the entire s390x logic into a separate function with a unit test, or just the regexp-based parameter fetching?

Contributor

Yes, just the regex / parameter fetching, because we want to make sure that later changes won't break it.

Contributor Author

Sure

Comment on lines 65 to 66
regex := regexp.MustCompile(fmt.Sprintf(`\b%s=([^\s]+)`, param))
match := regex.FindStringSubmatch(stdout)
Contributor

What happens if we don't have a match?

Contributor Author

@veera-damisetti veera-damisetti Oct 29, 2024

If there is no match for a parameter in stdout, the current code does not assign it, so cmdline_params[param] remains unset.
This is the expected behaviour when the entries in paramsToExtract are not present in /proc/cmdline.
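
The no-match behaviour described above can be checked in isolation with a small sketch. The wrapper function `lookup` and the sample input are hypothetical; the pattern and the `len(match) > 1` guard are copied from the reviewed snippet. In Go, `FindStringSubmatch` returns nil when the pattern does not match, and the length of a nil slice is 0, so the guard is simply false.

```go
package main

import (
	"fmt"
	"regexp"
)

// lookup (hypothetical wrapper) applies the pattern and guard from the
// reviewed snippet to a single parameter and reports whether it was found.
func lookup(stdout, param string) (string, bool) {
	regex := regexp.MustCompile(fmt.Sprintf(`\b%s=([^\s]+)`, param))
	match := regex.FindStringSubmatch(stdout) // nil when nothing matches
	if len(match) > 1 {
		return match[1], true
	}
	return "", false
}

func main() {
	// Placeholder contents standing in for /proc/cmdline.
	stdout := "rd.neednet=1 nameserver=192.0.2.53"
	for _, p := range []string{"nameserver", "rd.znet"} {
		value, found := lookup(stdout, p)
		fmt.Printf("%s: value=%q found=%t\n", p, value, found)
	}
}
```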

Contributor

@eifrach eifrach left a comment

How did the unit tests pass without a change?
We are missing a lot of test coverage for this logic.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Oct 29, 2024
… test case for the same

Signed-off-by: DAMISETTI-VEERABHADRARAO <[email protected]>
@veera-damisetti
Contributor Author

/test edge-e2e-metal-assisted

@veera-damisetti
Contributor Author

/test edge-subsystem-test


openshift-ci bot commented Oct 31, 2024

@veera-damisetti: all tests passed!


@veera-damisetti
Contributor Author

@eifrach, thanks for the review.

I moved the logic into a separate function and added unit tests for it.
Please have a look.

@eifrach
Contributor

eifrach commented Oct 31, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 31, 2024
@eifrach
Contributor

eifrach commented Oct 31, 2024

/approve


openshift-ci bot commented Oct 31, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CrystalChun, eifrach, veera-damisetti

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 31, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit d599caa into openshift:master Oct 31, 2024
12 checks passed
@veera-damisetti
Contributor Author

Thanks @CrystalChun @eifrach for the quick reviews.

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

Distgit: ose-agent-installer-node-agent
This PR has been included in build ose-agent-installer-node-agent-container-v4.18.0-202410311509.p0.gd599caa.assembly.stream.el9.
All builds following this will include this PR.

@carbonin
Member

carbonin commented Feb 6, 2025

/cherry-pick release-ocm-2.12

@openshift-cherrypick-robot

@carbonin: new pull request created: #908


@carbonin
Member

carbonin commented Feb 6, 2025

/cherry-pick release-ocm-2.11

@openshift-cherrypick-robot

@carbonin: #817 failed to apply on top of branch "release-ocm-2.11":

Applying: ACM-15078: Updated bootLoaderConfigTemplateS390x for making IP and nameserver persistent for agents
Using index info to reconstruct a base tree...
M	src/commands/actions/download_boot_artifacts_cmd.go
Falling back to patching base and 3-way merge...
Auto-merging src/commands/actions/download_boot_artifacts_cmd.go
CONFLICT (content): Merge conflict in src/commands/actions/download_boot_artifacts_cmd.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config advice.mergeConflict false"
Patch failed at 0001 ACM-15078: Updated bootLoaderConfigTemplateS390x for making IP and nameserver persistent for agents

