Find system VM templates for CKS clusters and SharedFS honouring the preferred architecture #10946

Open
wants to merge 9 commits into
base: 4.20

nvazquez
Contributor

@nvazquez nvazquez commented Jun 2, 2025

Description

This PR fixes the selection of system VM templates for CKS clusters so that it honours the system.vm.preferred.architecture setting.

Fixes: #10944
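
As a rough illustration of what honouring a preferred architecture can look like, the sketch below moves templates whose arch matches the preference to the front and picks the first candidate, falling back to the original order when nothing matches. `TemplateInfo` and `PreferredArchSelector` are hypothetical stand-ins, not CloudStack classes, and the arch strings are only example values; the actual change lives in `findSystemVMReadyTemplate`, whose query is not shown in this conversation.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a system VM template row; not CloudStack's VMTemplateVO.
final class TemplateInfo {
    final String name;
    final String arch;

    TemplateInfo(String name, String arch) {
        this.name = name;
        this.arch = arch;
    }
}

// Hypothetical helper, written only to illustrate the ordering idea.
final class PreferredArchSelector {
    private PreferredArchSelector() {
    }

    // Picks the first candidate after moving templates that match the preferred
    // arch to the front. A blank preference leaves the original order untouched.
    static Optional<TemplateInfo> pick(List<TemplateInfo> candidates, String preferredArch) {
        if (preferredArch == null || preferredArch.trim().isEmpty()) {
            return candidates.stream().findFirst();
        }
        // false sorts before true, so matching templates come first; the sort is
        // stable, so ties keep their original relative order.
        return candidates.stream()
                .sorted(Comparator.comparing((TemplateInfo t) -> !preferredArch.equalsIgnoreCase(t.arch)))
                .findFirst();
    }
}
```

With this ordering approach a non-matching template is still returned when no template of the preferred arch exists, which is why a separate arch check is still useful downstream.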

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

Member

@rohityadavcloud rohityadavcloud left a comment

LGTM. I didn't test it, but the code/path looks alright.

codecov bot commented Jun 2, 2025

Codecov Report

❌ Patch coverage is 24.32432% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 16.16%. Comparing base (823080c) to head (c3b83a4).
⚠️ Report is 72 commits behind head on 4.20.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...bernetes/cluster/KubernetesClusterManagerImpl.java | 11.53% | 21 Missing and 2 partials ⚠️ |
| ...r/actionworkers/KubernetesClusterActionWorker.java | 0.00% | 2 Missing ⚠️ |
| .../java/com/cloud/storage/dao/VMTemplateDaoImpl.java | 83.33% | 0 Missing and 1 partial ⚠️ |
| ...KubernetesClusterResourceModifierActionWorker.java | 0.00% | 1 Missing ⚠️ |
| ...sharedfs/lifecycle/StorageVmSharedFSLifeCycle.java | 50.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10946      +/-   ##
============================================
+ Coverage     16.14%   16.16%   +0.01%     
- Complexity    13253    13280      +27     
============================================
  Files          5656     5656              
  Lines        497893   497941      +48     
  Branches      60374    60388      +14     
============================================
+ Hits          80405    80489      +84     
+ Misses       408529   408489      -40     
- Partials       8959     8963       +4     
| Flag | Coverage Δ |
|---|---|
| uitests | 4.00% <ø> (+<0.01%) ⬆️ |
| unittests | 17.01% <24.32%> (+0.01%) ⬆️ |

@nvazquez nvazquez requested a review from shwstppr June 2, 2025 17:27
@nvazquez
Contributor Author

nvazquez commented Jun 2, 2025

@blueorangutan package

@nvazquez
Contributor Author

nvazquez commented Jun 2, 2025

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13580

@nvazquez
Contributor Author

nvazquez commented Jun 2, 2025

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 13581

@nvazquez
Contributor Author

nvazquez commented Jun 2, 2025

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@nvazquez nvazquez requested a review from abh1sar June 2, 2025 20:05
@nvazquez nvazquez changed the title from "Find system VM templates for CKS cluster honouring the preferred architecture" to "Find system VM templates for CKS clusters and SharedFS honouring the preferred architecture" Jun 2, 2025
@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13582

@blueorangutan

@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Member

@rohityadavcloud rohityadavcloud left a comment

Could you also add a check of the CKS data ISO arch against the instance arch? CKS and SharedFS should fail with an easy-to-understand error message when the instance arch does not match the CKS data ISO arch.

@blueorangutan

[SF] Trillian test result (tid-13453)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 55304 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10946-t13453-kvm-ol8.zip
Smoke tests completed. 141 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Contributor

@sureshanaparti sureshanaparti left a comment

clgtm

@@ -434,7 +434,8 @@ private IpAddress getSourceNatIp(Network network) {
     }

     public VMTemplateVO getKubernetesServiceTemplate(DataCenter dataCenter, Hypervisor.HypervisorType hypervisorType) {
-        VMTemplateVO template = templateDao.findSystemVMReadyTemplate(dataCenter.getId(), hypervisorType);
+        ConfigKey<String> preferredArchitecture = ResourceManager.SystemVmPreferredArchitecture;
+        VMTemplateVO template = templateDao.findSystemVMReadyTemplate(dataCenter.getId(), hypervisorType, preferredArchitecture.value());
Contributor

@nvazquez we need to use the valueIn() method with the zone ID to read the config value.

Contributor Author

Fixed, thanks @harikrishna-patnala
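
The distinction behind this review comment can be sketched with a small stand-in class: a plain `value()` read returns the global setting, while a zone-aware `valueIn(zoneId)` read returns the zone's override when one exists and the global value otherwise. `ZoneScopedSetting` and `overrideForZone` are hypothetical names used only for illustration; this is not CloudStack's `ConfigKey` implementation, and the arch strings are example values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a zone-scoped setting. It only mimics the
// value()/valueIn(id) split and is not CloudStack's ConfigKey.
final class ZoneScopedSetting {
    private final String globalValue;
    private final Map<Long, String> zoneOverrides = new HashMap<>();

    ZoneScopedSetting(String globalValue) {
        this.globalValue = globalValue;
    }

    // Registers a per-zone override (illustrative helper).
    void overrideForZone(long zoneId, String value) {
        zoneOverrides.put(zoneId, value);
    }

    // Global read: ignores any per-zone override.
    String value() {
        return globalValue;
    }

    // Zone-aware read: the zone's override if present, otherwise the global value.
    String valueIn(Long zoneId) {
        if (zoneId == null) {
            return globalValue;
        }
        return zoneOverrides.getOrDefault(zoneId, globalValue);
    }
}
```

In a mixed-architecture deployment where one zone overrides the preference, reading the global value would select the wrong template for that zone, which is the situation the zone-aware read avoids.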


     for (final Iterator<Hypervisor.HypervisorType> iter = hypervisors.iterator(); iter.hasNext();) {
         final Hypervisor.HypervisorType hypervisor = iter.next();
-        VMTemplateVO template = templateDao.findSystemVMReadyTemplate(zoneId, hypervisor);
+        VMTemplateVO template = templateDao.findSystemVMReadyTemplate(zoneId, hypervisor, preferredArchitecture.value());
Contributor

Same here: use valueIn() with the zone ID.

Contributor Author

Fixed, thanks @harikrishna-patnala

@rohityadavcloud
Member

@nvazquez can you address the outstanding questions & remarks?

@nvazquez
Contributor Author

Thanks @harikrishna-patnala @rohityadavcloud - I have addressed your comments, can you please re-review?
@rohityadavcloud I've added a check that compares the suitable hosts for the cluster deployment against the selected template arch, so a host that doesn't match the template arch will not be considered.
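
A minimal sketch of the host filtering described above, assuming each candidate host exposes its CPU architecture as a string. `HostInfo` and `HostArchFilter` are hypothetical names, and treating an unknown template arch as "no constraint" is an assumption of this sketch rather than something stated in the PR.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a candidate host; not CloudStack's HostVO.
final class HostInfo {
    final String name;
    final String arch;

    HostInfo(String name, String arch) {
        this.name = name;
        this.arch = arch;
    }
}

// Hypothetical helper illustrating the arch-based host filter.
final class HostArchFilter {
    private HostArchFilter() {
    }

    // Keeps only the hosts whose arch equals the selected template's arch.
    static List<HostInfo> matchingArch(List<HostInfo> hosts, String templateArch) {
        if (templateArch == null) {
            // Assumption: an unknown template arch imposes no constraint.
            return hosts;
        }
        return hosts.stream()
                .filter(h -> templateArch.equalsIgnoreCase(h.arch))
                .collect(Collectors.toList());
    }
}
```

An empty result then signals that no host in the zone can run the selected template, which a caller can report before attempting the deployment.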

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13816

@nvazquez
Contributor Author

nvazquez commented Jun 18, 2025

@rohityadavcloud @harikrishna-patnala wouldn't it be better to filter out the templates that do not match the CKS ISO architecture instead of sorting them by the preferred arch setting? In any case, I'm adding the check for the CKS ISO arch vs the selected template arch, failing with a proper message in case of a mismatch.
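
The mismatch check mentioned here could look roughly like the sketch below. It reuses the wording of the error message quoted later in this review, but `IsoTemplateArchCheck` is a hypothetical helper, `IllegalStateException` stands in for CloudStack's `CloudRuntimeException`, and treating a missing ISO arch as a mismatch is an assumption of this sketch.

```java
// Hypothetical helper illustrating the ISO vs template arch check.
final class IsoTemplateArchCheck {
    private IsoTemplateArchCheck() {
    }

    // Throws when the Kubernetes ISO arch and the template arch differ.
    // Assumption: a null ISO arch is treated as a mismatch.
    static void ensureSameArch(String isoName, String isoArch, String templateName, String templateArch) {
        boolean matches = isoArch != null && isoArch.equalsIgnoreCase(templateArch);
        if (!matches) {
            // IllegalStateException is a stand-in for CloudRuntimeException.
            throw new IllegalStateException(String.format(
                    "The selected Kubernetes ISO %s arch (%s) doesn't match the template %s arch (%s) "
                            + "to deploy the Kubernetes cluster",
                    isoName, isoArch, templateName, templateArch));
        }
    }
}
```

Failing early with both names and both architectures in the message gives the operator enough detail to see which side needs to change.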

@nvazquez
Contributor Author

@blueorangutan package

@nvazquez
Contributor Author

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14138

@DaanHoogland DaanHoogland removed their request for review July 14, 2025 07:57
@DaanHoogland
Contributor

@blueorangutan test

@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13765)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 57877 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr10946-t13765-kvm-ol8.zip
Smoke tests completed. 141 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Member

@weizhouapache weizhouapache left a comment

code lgtm

@weizhouapache weizhouapache self-requested a review July 15, 2025 04:18
String err = String.format("The selected Kubernetes ISO %s arch (%s) doesn't match the template %s arch (%s) " +
                "to deploy the Kubernetes cluster",
        clusterKubernetesVersion.getName(), cksIso.getArch(), finalTemplate.getName(), finalTemplate.getArch());
throw new CloudRuntimeException(err);
Member

Should we consider the arch of the CKS ISO before the preferred arch of the zone?

Contributor Author

I think that could also make sense. What do you think, @harikrishna-patnala @sureshanaparti @shwstppr?

Member

Yes, that makes sense. Please address that @nvazquez, then it's ready for merging.
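
The precedence agreed on in this thread can be sketched as a small resolver: use the CKS ISO arch when it is known, otherwise fall back to the zone's preferred arch, otherwise express no preference. `DeploymentArchResolver` is a hypothetical name and the fix actually pushed to the PR may be structured differently.

```java
import java.util.Optional;

// Hypothetical helper illustrating "ISO arch first, zone preference second".
final class DeploymentArchResolver {
    private DeploymentArchResolver() {
    }

    // Returns the arch to use when looking up the template, or empty when
    // neither the ISO nor the zone setting provides one.
    static Optional<String> resolve(String cksIsoArch, String zonePreferredArch) {
        if (hasText(cksIsoArch)) {
            return Optional.of(cksIsoArch);
        }
        if (hasText(zonePreferredArch)) {
            return Optional.of(zonePreferredArch);
        }
        return Optional.empty();
    }

    private static boolean hasText(String s) {
        return s != null && !s.trim().isEmpty();
    }
}
```

Resolving the arch this way means the template lookup already targets the ISO's architecture, so the mismatch error should only surface when no suitable template exists.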

Member

@rohityadavcloud rohityadavcloud left a comment

LGTM. Tested in a mixed-arch zone with NUC9 x86 hosts and RPi5/arm64 16GB hosts.

@nvazquez
Contributor Author

Thanks for testing @rohityadavcloud - I have pushed a fix to address the comment

@blueorangutan package

@blueorangutan

@nvazquez a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14478

@nvazquez
Contributor Author

@blueorangutan test

@blueorangutan

@nvazquez a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

Successfully merging this pull request may close these issues.

CKS data ISO & VM/template arch mis-match for both CKS & SharedFS features
7 participants