nasbackup.sh: add timeout, cleanup trap, space check, quiesce support #12843
jmsperu wants to merge 1 commit into apache:4.20 from …
…ror handling

- Add BACKUP_TIMEOUT (default 6h) to prevent indefinitely stuck backup jobs; aborts via domjobabort when exceeded
- Add EXIT trap with cleanup() that resumes paused VMs, removes temp dirs, and unmounts NFS on any exit (error, signal, or normal)
- Add check_free_space() pre-flight check (default 1 GB minimum)
- Add -q/--quiesce flag for optional fsfreeze/thaw via qemu-guest-agent
- Use set -eo pipefail for stricter error handling
- Fix mount_operation: proper if/then instead of broken $? after pipe
- Quote all variable expansions to prevent word splitting
- Remove manual umount/rmdir from functions (handled by trap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
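For reviewers, the "`$?` after pipe" item in the commit message can be illustrated with a minimal sketch. `fake_mount` is a hypothetical stand-in for a failing mount command, and piping through `tee` is an assumption made for illustration; the actual pipeline in `nasbackup.sh` may differ.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a mount command that fails (not taken from nasbackup.sh).
fake_mount() { echo "mount: access denied" >&2; return 32; }

# Broken pattern: without pipefail, $? holds the exit status of the
# last command in the pipeline (tee), so the mount failure is masked.
fake_mount 2>&1 | tee /dev/null
broken_status=$?

# Fixed pattern: test the command directly in the if condition.
if fake_mount 2>/dev/null; then
    fixed_result="mounted"
else
    fixed_result="failed"
fi

echo "broken_status=$broken_status fixed_result=$fixed_result"
```

With the default shell options the first check reports success even though the mount failed, while the `if` form sees the real exit status.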
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## 4.20 #12843 +/- ##
=========================================
Coverage 16.24% 16.25%
- Complexity 13411 13412 +1
=========================================
Files 5664 5664
Lines 500463 500463
Branches 60779 60779
=========================================
+ Hits 81308 81333 +25
+ Misses 410059 410035 -24
+ Partials 9096 9095 -1
@jmsperu can you please check on 4.21?
Summary
- Add `BACKUP_TIMEOUT` (default 6 hours) to prevent indefinitely stuck backup jobs; aborts via `domjobabort` when exceeded
- Add `EXIT` trap with `cleanup()` that resumes paused VMs, removes temp dirs, and unmounts NFS on any exit (error, signal, or normal), which prevents orphan mounts from accumulating
- Add `check_free_space()` pre-flight check (default 1 GB minimum) before writing backup data
- Add `-q`/`--quiesce` flag for optional `fsfreeze`/`thaw` via qemu-guest-agent for application-consistent backups
- Use `set -eo pipefail` for stricter error handling
- Fix `mount_operation()`: proper `if mount` instead of broken `$?` check after pipe
- Quote all variable expansions to prevent word splitting
- Remove manual `umount`/`rmdir` from functions (now handled by trap)

Motivation
The current `nasbackup.sh` has several reliability issues observed in production:

- … (`until domjobinfo --completed`) runs forever if the backup stalls, blocking the agent
- … (`/tmp/csbackup.XXXXX`) and paused VMs that never resume (related: KVM NAS backup: VM remains paused indefinitely when backup job fails #12821)
- `$?` after a pipe returns the exit status of the pipeline's last command, not that of the command being checked

Test plan
- … `BACKUP_TIMEOUT=30`: verify timeout and `domjobabort`
- … `-q` flag and qemu-guest-agent installed: verify fsfreeze/thaw in logs
- … `-q` flag without qemu-guest-agent: verify graceful fallback
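
To make the timeout item of the test plan easier to reason about, here is a rough sketch of a bounded wait loop combined with an `EXIT` trap. It is not the code from this PR: `job_completed`, `abort_job`, `wait_for_backup`, and `POLL_INTERVAL` are hypothetical names, and the stubs only stand in for the `domjobinfo --completed` and `domjobabort` calls named in the summary.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Timeout in seconds; the 6-hour default mirrors the PR description.
BACKUP_TIMEOUT="${BACKUP_TIMEOUT:-21600}"
# Hypothetical polling interval, introduced only for this sketch.
POLL_INTERVAL="${POLL_INTERVAL:-1}"

# Hypothetical stubs. In the real script these would presumably wrap
# the domjobinfo --completed and domjobabort calls.
job_completed() { return 1; }              # simulate a stalled backup
abort_job()     { echo "backup job aborted"; }

tmp_dir="$(mktemp -d)"

# Runs when the shell exits, whether normally or after an error.
cleanup() {
    if [[ -d "$tmp_dir" ]]; then
        rmdir "$tmp_dir"
    fi
}
trap cleanup EXIT

# Poll until the job completes or the deadline passes.
wait_for_backup() {
    local deadline=$(( SECONDS + BACKUP_TIMEOUT ))
    until job_completed; do
        if (( SECONDS >= deadline )); then
            abort_job
            return 1
        fi
        sleep "$POLL_INTERVAL"
    done
}

# Demo: a 1-second timeout against the stalled stub.
BACKUP_TIMEOUT=1
if wait_for_backup; then
    result="completed"
else
    result="timed out"
fi
echo "result=$result"
```

Calling `wait_for_backup` inside an `if` keeps `set -e` from terminating the script on the non-zero return, so the caller can log the timeout before the trap performs cleanup.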