Refactoring retention of backup schedules #11223

bernardodemarco
Member

Description

PR #10017 introduced, among other features, the ability to configure retention for backup schedules. This PR aims to fix some minor inconsistencies in that feature, refactor redundant workflows and behaviors, and improve code readability and maintainability by adding logs and more granular unit tests.

Below, I have listed all the points that this PR addresses.


Exposed scheduleid parameter

When scheduling a backup creation, the ID of the schedule responsible for the backup is passed as a parameter to the createBackup API:

    final Long eventId = ActionEventUtils.onScheduledActionEvent(User.UID_SYSTEM, vm.getAccountId(),
            EventTypes.EVENT_VM_BACKUP_CREATE, "creating backup for VM ID:" + vm.getUuid(),
            vmId, ApiCommandResourceType.VirtualMachine.toString(),
            true, 0);
    final Map<String, String> params = new HashMap<String, String>();
    params.put(ApiConstants.VIRTUAL_MACHINE_ID, "" + vmId);
    params.put(ApiConstants.SCHEDULE_ID, "" + backupScheduleId);
    params.put("ctxUserId", "1");
    params.put("ctxAccountId", "" + vm.getAccountId());
    params.put("ctxStartEventId", String.valueOf(eventId));
    final CreateBackupCmd cmd = new CreateBackupCmd();
    ComponentContext.inject(cmd);
    apiDispatcher.dispatchCreateCmd(cmd, params);
    params.put("id", "" + vmId);
    params.put("ctxStartEventId", "1");

This parameter is intended solely for use by the scheduling workflow. End users should neither access nor be aware of it. If a backup is manually created while specifying a schedule ID, inconsistencies may occur later in the scheduling process. To prevent this, changes were made to ensure the parameter is not exposed to end users.
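One way to keep a scheduler-only value out of the user-facing API is to stop reading it from the request parameters and let the scheduler set it directly on the command object. The sketch below is only an illustration of that idea, not the change made in this PR: `CreateBackupRequest`, `fromUserParams`, `fromScheduler`, and the lowercase parameter keys are hypothetical names chosen for the example.

```java
import java.util.Map;

// Hypothetical stand-in for an API command; this is NOT CloudStack's CreateBackupCmd.
final class CreateBackupRequest {
    private final long vmId;
    private final Long scheduleId; // null means the backup was requested manually

    private CreateBackupRequest(long vmId, Long scheduleId) {
        this.vmId = vmId;
        this.scheduleId = scheduleId;
    }

    // User-facing entry point: only the VM ID is read from the request map,
    // so a "scheduleid" key supplied by a caller is silently ignored.
    static CreateBackupRequest fromUserParams(Map<String, String> params) {
        long vmId = Long.parseLong(params.get("virtualmachineid"));
        return new CreateBackupRequest(vmId, null);
    }

    // Internal entry point: only the scheduling workflow is expected to call this.
    static CreateBackupRequest fromScheduler(long vmId, long scheduleId) {
        return new CreateBackupRequest(vmId, scheduleId);
    }

    long getVmId() { return vmId; }
    Long getScheduleId() { return scheduleId; }
    boolean isManual() { return scheduleId == null; }
}

final class ScheduleIdDemo {
    public static void main(String[] args) {
        // A user tries to smuggle in a schedule ID; it has no effect.
        CreateBackupRequest manual = CreateBackupRequest.fromUserParams(
                Map.of("virtualmachineid", "42", "scheduleid", "7"));
        CreateBackupRequest scheduled = CreateBackupRequest.fromScheduler(42L, 7L);

        System.out.println(manual.isManual());         // prints true
        System.out.println(scheduled.getScheduleId()); // prints 7
    }
}
```

With this shape, the only code path that can attach a schedule to a backup is the internal one, so a manual request cannot interfere with later retention handling.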


Relationship between cloud.backup and cloud.backup_schedule

To determine whether backups should be deleted to meet the retention requirements, the backup creation workflow must be able to identify whether the backup is scheduled and which schedule it belongs to. If the backup is scheduled, retention validation should be performed. On the other hand, if it is a manual backup, retention validation should be skipped.

Currently, the cloud.backup table uses the backup_interval_type column for this purpose:

@Column(name = "backup_interval_type")
private short backupIntervalType;

However, the cloud.backup_schedule table already contains a schedule_type column, leading to data redundancy.

@Column(name = "schedule_type")
private Short scheduleType;

Because of this, the backup creation logic requires back-and-forth conversions between DateUtil.IntervalType and org.apache.cloudstack.backup.Backup.Type. For example:

    private Backup.Type getBackupType(Long scheduleId) {
        if (scheduleId.equals(Snapshot.MANUAL_POLICY_ID)) {
            return Backup.Type.MANUAL;
        } else {
            BackupScheduleVO scheduleVO = backupScheduleDao.findById(scheduleId);
            DateUtil.IntervalType intvType = scheduleVO.getScheduleType();
            return getBackupType(intvType);
        }
    }

    private Backup.Type getBackupType(DateUtil.IntervalType intvType) {
        if (intvType.equals(DateUtil.IntervalType.HOURLY)) {
            return Backup.Type.HOURLY;
        } else if (intvType.equals(DateUtil.IntervalType.DAILY)) {
            return Backup.Type.DAILY;
        } else if (intvType.equals(DateUtil.IntervalType.WEEKLY)) {
            return Backup.Type.WEEKLY;
        } else if (intvType.equals(DateUtil.IntervalType.MONTHLY)) {
            return Backup.Type.MONTHLY;
        }
        return null;
    }

Moreover, since the relationship is based solely on the interval type, deleting backups to comply with a schedule's retention policy may unintentionally remove backups from previously deleted schedules of the same type. For instance:

  1. Create an HOURLY schedule
  2. 10 backups are created from this schedule
  3. The schedule is deleted
  4. A new HOURLY schedule is created with a retention of 3
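  5. On the next backup execution, 8 backups are deleted: the 10 backups from the deleted schedule plus the new one make 11, which exceeds the retention of 3 by 8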

This happens because the system currently treats all backups with the same interval type as belonging to the same schedule.
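The arithmetic behind this scenario can be reproduced with a small sketch. It is a simplified model rather than CloudStack code: `BackupRow`, `excessByIntervalType`, and `excessByScheduleId` are hypothetical names, and the interval type is stored as a plain string for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class RetentionBugDemo {
    // Hypothetical minimal backup record: the schedule that created it and that schedule's interval type.
    static final class BackupRow {
        final long scheduleId;
        final String intervalType;

        BackupRow(long scheduleId, String intervalType) {
            this.scheduleId = scheduleId;
            this.intervalType = intervalType;
        }
    }

    // Number of backups over the retention when all backups of one interval type are counted together.
    static long excessByIntervalType(List<BackupRow> backups, String intervalType, int retention) {
        long count = backups.stream().filter(b -> b.intervalType.equals(intervalType)).count();
        return Math.max(0L, count - retention);
    }

    // Number of backups over the retention when only the given schedule's backups are counted.
    static long excessByScheduleId(List<BackupRow> backups, long scheduleId, int retention) {
        long count = backups.stream().filter(b -> b.scheduleId == scheduleId).count();
        return Math.max(0L, count - retention);
    }

    // 10 backups left over from deleted schedule 1, plus the first backup of new schedule 2.
    static List<BackupRow> scenario() {
        List<BackupRow> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add(new BackupRow(1L, "HOURLY"));
        }
        rows.add(new BackupRow(2L, "HOURLY"));
        return rows;
    }

    public static void main(String[] args) {
        List<BackupRow> rows = scenario();
        System.out.println(excessByIntervalType(rows, "HOURLY", 3)); // prints 8
        System.out.println(excessByScheduleId(rows, 2L, 3));         // prints 0
    }
}
```

Counting by interval type marks 8 backups for deletion, while counting by schedule leaves the old schedule's backups untouched.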

To address this, the relationship between cloud.backup and cloud.backup_schedule was refactored by removing the backup_interval_type column from the cloud.backup table and adding a backup_schedule_id column.

This change makes it possible to associate each backup with a specific schedule, eliminating ambiguity and data redundancy. Manual backups will have a NULL value for backup_schedule_id, while scheduled backups will reference their respective schedule, from which the interval type can be determined.
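As a rough illustration of how a nullable backup_schedule_id can drive the retention check, the sketch below selects the backups that would be removed after a new backup is taken. It assumes the oldest backups are deleted first, which this description does not state, and every name in it (`ScheduledRetention`, `BackupRow`, `backupsToDelete`) is hypothetical rather than taken from the CloudStack code base.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class ScheduledRetention {
    // Hypothetical row of cloud.backup: backupScheduleId is null for manual backups.
    static final class BackupRow {
        final long id;
        final Long backupScheduleId;
        final long createdEpochSeconds;

        BackupRow(long id, Long backupScheduleId, long createdEpochSeconds) {
            this.id = id;
            this.backupScheduleId = backupScheduleId;
            this.createdEpochSeconds = createdEpochSeconds;
        }
    }

    // Returns the backups exceeding the schedule's retention, oldest first (assumed order).
    // Manual backups (null schedule ID) and a retention of 0 skip the deletion flow entirely.
    static List<BackupRow> backupsToDelete(List<BackupRow> all, Long scheduleId, int retention) {
        if (scheduleId == null || retention <= 0) {
            return new ArrayList<>();
        }
        List<BackupRow> ofSchedule = all.stream()
                .filter(b -> scheduleId.equals(b.backupScheduleId))
                .sorted(Comparator.comparingLong((BackupRow b) -> b.createdEpochSeconds))
                .collect(Collectors.toList());
        int excess = ofSchedule.size() - retention;
        if (excess <= 0) {
            return new ArrayList<>();
        }
        return new ArrayList<>(ofSchedule.subList(0, excess));
    }

    public static void main(String[] args) {
        List<BackupRow> all = List.of(
                new BackupRow(1, null, 5),  // manual backup
                new BackupRow(2, 1L, 20),
                new BackupRow(3, 1L, 10),   // oldest backup of schedule 1
                new BackupRow(4, 1L, 30),
                new BackupRow(5, 1L, 40),
                new BackupRow(6, 2L, 1));   // belongs to a different schedule
        List<BackupRow> doomed = backupsToDelete(all, 1L, 3);
        System.out.println(doomed.size() + " backup(s) to delete, first id = " + doomed.get(0).id);
        // prints "1 backup(s) to delete, first id = 3"
    }
}
```

Because the filter uses the schedule ID, backups from other schedules of the same interval type and manual backups never enter the count.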


Removal of maximum allowed retention configurations

The backup.max.hourly, backup.max.daily, backup.max.weekly, and backup.max.monthly settings are currently used to define the maximum retention values that end users can configure. These are validated during backup schedule creation, and an exception is thrown if the specified retention exceeds the configured maximum.

However, since administrators can already define backup and allocated backup storage limits per account and domain, these settings have limited practical use. This PR proposes removing them, which also allows users to omit retention entirely. With this approach, backup limits are enforced solely through the account- and domain-level limits.

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Screenshots (if appropriate):

How Has This Been Tested?

  • Verified that manual backups are not associated with any schedule
  • Verified that for manual backups, the backup deletion flow is not executed after backups are created
  • Verified that when the retention is greater than 0 and the number of backups in a schedule is less than the retention, no backup is deleted
  • Verified that when the retention is greater than 0 and the number of backups in a schedule is greater than the retention, backups are correctly deleted
  • Verified that when the retention is increased, the updated retention value is respected
  • Verified that when the retention is reduced, the necessary number of backups is deleted to comply with the new retention value
  • Verified that when the backup schedule associated with a backup has a retention value of 0, the backup deletion flow is not executed
  • Deleted a schedule, created a new one with the same interval type, and verified that backups from the old schedule are not considered in the retention calculation


codecov bot commented Jul 16, 2025

Codecov Report

❌ Patch coverage is 70.45455% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 17.17%. Comparing base (6d5cefd) to head (2224342).
⚠️ Report is 67 commits behind head on main.

Files with missing lines | Patch % | Lines
...rg/apache/cloudstack/backup/dao/BackupDaoImpl.java | 0.00% | 9 Missing ⚠️
...rg/apache/cloudstack/backup/BackupManagerImpl.java | 89.70% | 3 Missing and 4 partials ⚠️
...in/java/org/apache/cloudstack/backup/BackupVO.java | 0.00% | 6 Missing ⚠️
...org/apache/cloudstack/backup/BackupScheduleVO.java | 0.00% | 3 Missing ⚠️
...stack/api/command/user/backup/CreateBackupCmd.java | 0.00% | 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #11223      +/-   ##
============================================
+ Coverage     16.74%   17.17%   +0.42%     
- Complexity    14065    15010     +945     
============================================
  Files          5724     5869     +145     
  Lines        507787   521689   +13902     
  Branches      61733    63504    +1771     
============================================
+ Hits          85046    89611    +4565     
- Misses       413258   422019    +8761     
- Partials       9483    10059     +576     
Flag | Coverage Δ
uitests | 3.76% <ø> (-0.14%) ⬇️
unittests | 18.15% <70.45%> (+0.48%) ⬆️


@bernardodemarco
Member Author

@blueorangutan package

@blueorangutan

@bernardodemarco a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14215


This pull request has merge conflicts. Dear author, please fix the conflicts and sync your branch with the base branch.

@winterhazel
Member

@blueorangutan package

@blueorangutan

@winterhazel a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14332

@DaanHoogland
Contributor

@bernardodemarco @winterhazel is this ready (for review/testing etc)?

@winterhazel
Member

@bernardodemarco @winterhazel is this ready (for review/testing etc)?

@DaanHoogland yes, this PR is ready for both review and testing.

@DaanHoogland
Contributor

@blueorangutan test keepEnv

@DaanHoogland DaanHoogland self-assigned this Jul 25, 2025
@blueorangutan

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan

[SF] Trillian test result (tid-13892)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 59454 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr11223-t13892-kvm-ol8.zip
Smoke tests completed. 142 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File

Contributor

@JoaoJandre JoaoJandre left a comment

CLGTM, but I left a minor nitpick

@bernardodemarco
Member Author

@blueorangutan package

@blueorangutan

@bernardodemarco a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14443

@bernardodemarco
Member Author

@blueorangutan package

@blueorangutan
Copy link

@bernardodemarco a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✖️ debian ✖️ suse15. SL-JID 14446

@sureshanaparti
Contributor

@blueorangutan package

@blueorangutan

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 14461

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm, and limited testing done;

  • DB seems good after white box test.
  • The schedule ID parameter is removed, and that seems OK to me.
  • The configurations are gone as well. I wonder if this will lead to complaints from users? As they pay for backup storage space, in some cases they may want to reduce the number of backups retained from the system-configured one. (I don’t sell backups so I don’t have a strong opinion, just throwing it out there)

EDIT: silly to think users would use those configurations, never mind point 3.

Contributor

@sureshanaparti sureshanaparti left a comment

clgtm

@sureshanaparti sureshanaparti merged commit f73cb56 into apache:main Jul 30, 2025
43 of 49 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Apache CloudStack 4.21.0 Jul 30, 2025