Allow parallel execution of NAS backup and delete commands #12847

Open
jmsperu wants to merge 1 commit into apache:4.20 from jmsperu:fix/nasbackup-parallel-execution


jmsperu commented Mar 17, 2026

Summary

  • Change executeInSequence() from true to false in TakeBackupCommand and DeleteBackupCommand
  • Allows the KVM agent to process multiple backup/delete operations concurrently via its existing worker thread pool
  • RestoreBackupCommand and PrepareForBackupRestorationCommand remain sequential (they modify VM state)
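
For readers who have not seen the flag before, the change can be sketched roughly as follows. This is a simplified illustration, not the actual diff: the `Command` base class here is a minimal stand-in written for this example, and the real CloudStack classes carry additional fields and constructors.

```java
// Minimal stand-in for the command base class (simplified for illustration).
abstract class Command {
    // true  -> the agent runs this command one at a time, in order
    // false -> the agent may run it concurrently with other commands
    public abstract boolean executeInSequence();
}

// After this PR: backups may run in parallel.
class TakeBackupCommand extends Command {
    @Override
    public boolean executeInSequence() {
        return false; // previously true
    }
}

// After this PR: deletes may run in parallel.
class DeleteBackupCommand extends Command {
    @Override
    public boolean executeInSequence() {
        return false; // previously true
    }
}

// Unchanged: restores modify VM state, so they stay serialized.
class RestoreBackupCommand extends Command {
    @Override
    public boolean executeInSequence() {
        return true;
    }
}
```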

Motivation

Currently all backup commands are serialized on the agent — a large VM backup (e.g. 100+ GB taking 2+ hours) blocks all other backup and delete operations on the same host. This is the root cause of backup schedule delays and timeouts in environments with many VMs per host.

Each backup operation:

  • Mounts its own temporary NFS directory (mktemp -d -t csbackup.XXXXX)
  • Operates on independent VM disks via separate QEMU block jobs
  • Has no shared state with other backup operations
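
The isolation provided by per-backup mount directories can be illustrated in Java. This is only an analogue of the `mktemp -d -t csbackup.XXXXX` call in the script, not code from this PR; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, shown only to illustrate the mktemp-style isolation.
class BackupMountDirs {
    // Each call creates a new, uniquely named directory under the system
    // temp location, so two concurrent backups never share a mount point.
    static Path newMountDir() {
        try {
            return Files.createTempDirectory("csbackup.");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Two calls to `BackupMountDirs.newMountDir()` return different paths, which is the property the concurrent backups rely on.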

There is no technical reason to serialize them. The agent already has a thread pool (requestHandler) that can execute multiple commands concurrently — this change simply allows backup commands to use it.
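
As a rough mental model of what the flag controls, consider the sketch below. It is not the agent's real dispatch code: `SketchCommand` and `SketchDispatcher` are hypothetical names, and the pool size is an arbitrary constructor argument. It only shows the general idea of routing sequential commands through a single thread while letting the rest use a worker pool.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical command type; only the executeInSequence() flag mirrors the PR.
interface SketchCommand extends Runnable {
    boolean executeInSequence();
}

// Hypothetical dispatcher, not the agent's actual implementation.
class SketchDispatcher implements AutoCloseable {
    // One thread: everything submitted here runs strictly one after another.
    private final ExecutorService sequential = Executors.newSingleThreadExecutor();
    // Several threads: stand-in for the agent's worker pool.
    private final ExecutorService workers;

    SketchDispatcher(int workerThreads) {
        this.workers = Executors.newFixedThreadPool(workerThreads);
    }

    // Route the command according to its flag.
    Future<?> submit(SketchCommand cmd) {
        return (cmd.executeInSequence() ? sequential : workers).submit(cmd);
    }

    @Override
    public void close() {
        sequential.shutdown();
        workers.shutdown();
    }
}
```

In this model, a two-hour backup that returns `true` holds the single sequential thread for its whole duration, while one that returns `false` occupies only one worker and leaves the others free.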

Impact

  • Hosts with 10+ VMs should see shorter backup windows, because backups run in parallel instead of queuing behind one another
  • NFS bandwidth is shared across concurrent backups (it can be capped with the -b bandwidth throttle flag from PR #12846, "nasbackup.sh: add bandwidth throttle via -b flag")
  • No change to management server scheduling — it already submits backups as independent async jobs
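
To make the queuing effect concrete, here is a small idealized calculation. It assumes every backup is submitted at the same moment, that the worker pool has a free thread for each one, and that parallel backups do not slow each other down. The last assumption is optimistic, since NFS bandwidth is shared. The class name and any durations fed into it are hypothetical, not measurements.

```java
// Hypothetical helper for reasoning about completion times; not part of the PR.
class BackupTimeline {
    // Serialized: each backup finishes only after all earlier ones are done.
    static long[] finishTimesSequential(long[] durationsMinutes) {
        long[] finish = new long[durationsMinutes.length];
        long clock = 0;
        for (int i = 0; i < durationsMinutes.length; i++) {
            clock += durationsMinutes[i];
            finish[i] = clock;
        }
        return finish;
    }

    // Parallel (idealized): every backup starts at time 0 with no contention,
    // so each one finishes after its own duration.
    static long[] finishTimesParallel(long[] durationsMinutes) {
        return durationsMinutes.clone();
    }
}
```

With made-up durations of 120, 10, and 5 minutes, the sequential model finishes the three backups at 120, 130, and 135 minutes, while the idealized parallel model finishes them at 120, 10, and 5 minutes. The small backups no longer wait behind the large one.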

Test plan

  • Schedule 3+ VM backups at the same time on one host — verify they run concurrently (check virsh domjobinfo on multiple VMs)
  • Verify each backup gets its own mount point (no mount conflicts)
  • Run backup + delete concurrently — verify no interference
  • Verify restore operations still execute sequentially
  • Monitor host I/O during concurrent backups — consider using -b bandwidth throttle if NFS saturates

Change executeInSequence() to return false for TakeBackupCommand
and DeleteBackupCommand, allowing the KVM agent to process multiple
backup/delete operations concurrently via its worker thread pool.

Previously, all backup commands were serialized — a large VM backup
(e.g. 100+ GB taking 2+ hours) would block all other backup and
delete operations on the same host. Since each backup mounts its
own temporary NFS directory and operates on independent VM disks,
there is no shared state requiring serialization.

Restore and PrepareForBackupRestoration commands remain sequential
as they modify VM state that should not be concurrent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>