KVM: Enable HA heartbeat on ShareMountPoint#12773
KVM: Enable HA heartbeat on ShareMountPoint#12773weizhouapache wants to merge 4 commits intoapache:4.22from
Conversation
296835a to
93b95aa
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## 4.22 #12773 +/- ##
=========================================
Coverage 17.61% 17.61%
- Complexity 15661 15665 +4
=========================================
Files 5917 5917
Lines 531402 531414 +12
Branches 64971 64972 +1
=========================================
+ Hits 93586 93597 +11
- Misses 427262 427263 +1
Partials 10554 10554
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@blueoranguran package |
There was a problem hiding this comment.
Pull request overview
Enhances KVM HA heartbeat to support SharedMountPoint primary storage by introducing a new heartbeat script and extending HA support checks across KVM storage and monitor components.
Changes:
- Added a new KVM “SharedMountPoint” heartbeat script (
kvmsmpheartbeat.sh) that writes heartbeat files to a mountpoint. - Expanded HA-supported storage pool types to include
SharedMountPointin driver, monitor, and pool logic. - Added a
setTypehook toKVMStoragePooland set the pool type after creation / retrieval to ensure HA detection works.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/vm/hypervisor/kvm/kvmsmpheartbeat.sh | New local mountpoint-based heartbeat writer/checker script for SharedMountPoint HA. |
| plugins/storage/volume/default/.../CloudStackPrimaryDataStoreDriverImpl.java | Allows HA support for SharedMountPoint in primary storage capability logic. |
| plugins/hypervisors/kvm/.../LibvirtStoragePool.java | Enables HA support for SharedMountPoint and selects the correct heartbeat script. |
| plugins/hypervisors/kvm/.../KVMStoragePoolManager.java | Sets pool type after creation; exposes getStorageAdaptor. |
| plugins/hypervisors/kvm/.../KVMStoragePool.java | Introduces a default setType method for backward compatibility. |
| plugins/hypervisors/kvm/.../LibvirtCheckVMActivityOnStoragePoolCommandWrapper.java | Ensures retrieved pool has its type set before HA checks. |
| plugins/hypervisors/kvm/.../KVMHAMonitor.java | Expands heartbeat monitoring to include SharedMountPoint pools. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if [ ! -f "$hbFile" ]; then | ||
| # signal large difference if file missing | ||
| return 999999 | ||
| fi | ||
| now=$(date +%s) | ||
| hb=$(cat "$hbFile" 2>/dev/null) | ||
| if [ -z "$hb" ]; then | ||
| return 999998 | ||
| fi | ||
| diff=`expr $now - $hb 2>/dev/null` | ||
| if [ $? -ne 0 ] | ||
| then | ||
| return 999997 | ||
| fi | ||
| if [ -z "$interval" ]; then | ||
| # if no interval provided, consider 0 as success | ||
| if [ $diff -gt 0 ]; then | ||
| return $diff | ||
| else | ||
| return 0 | ||
| fi | ||
| fi | ||
| if [ $diff -gt $interval ] | ||
| then | ||
| return $diff | ||
| fi | ||
| return 0 | ||
| } | ||
|
|
||
| if [ "$rflag" == "1" ] | ||
| then | ||
| check_hbLog | ||
| diff=$? | ||
| if [ $diff == 0 ] |
There was a problem hiding this comment.
return in bash only supports exit codes 0–255; values like 999999 will wrap modulo 256, so diff=$? will not contain the intended “seconds ago” value and can lead to incorrect DEAD/ALIVE reporting. Consider having check_hbLog print the computed diff (e.g., via echo) and use its return code only as success/failure, or store the diff in a global variable and return 0/1.
| if [ ! -f "$hbFile" ]; then | |
| # signal large difference if file missing | |
| return 999999 | |
| fi | |
| now=$(date +%s) | |
| hb=$(cat "$hbFile" 2>/dev/null) | |
| if [ -z "$hb" ]; then | |
| return 999998 | |
| fi | |
| diff=`expr $now - $hb 2>/dev/null` | |
| if [ $? -ne 0 ] | |
| then | |
| return 999997 | |
| fi | |
| if [ -z "$interval" ]; then | |
| # if no interval provided, consider 0 as success | |
| if [ $diff -gt 0 ]; then | |
| return $diff | |
| else | |
| return 0 | |
| fi | |
| fi | |
| if [ $diff -gt $interval ] | |
| then | |
| return $diff | |
| fi | |
| return 0 | |
| } | |
| if [ "$rflag" == "1" ] | |
| then | |
| check_hbLog | |
| diff=$? | |
| if [ $diff == 0 ] | |
| hb_diff=0 | |
| if [ ! -f "$hbFile" ]; then | |
| # signal large difference if file missing | |
| hb_diff=999999 | |
| return 1 | |
| fi | |
| now=$(date +%s) | |
| hb=$(cat "$hbFile" 2>/dev/null) | |
| if [ -z "$hb" ]; then | |
| hb_diff=999998 | |
| return 1 | |
| fi | |
| diff=`expr $now - $hb 2>/dev/null` | |
| if [ $? -ne 0 ] | |
| then | |
| hb_diff=999997 | |
| return 1 | |
| fi | |
| if [ -z "$interval" ]; then | |
| # if no interval provided, consider 0 as success | |
| if [ $diff -gt 0 ]; then | |
| hb_diff=$diff | |
| return 1 | |
| else | |
| hb_diff=0 | |
| return 0 | |
| fi | |
| fi | |
| if [ $diff -gt $interval ] | |
| then | |
| hb_diff=$diff | |
| return 1 | |
| fi | |
| hb_diff=0 | |
| return 0 | |
| } | |
| if [ "$rflag" == "1" ] | |
| then | |
| check_hbLog | |
| status=$? | |
| diff="${hb_diff:-0}" | |
| if [ $status -eq 0 ] |
| if [ $diff == 0 ] | ||
| then | ||
| echo "=====> ALIVE <=====" | ||
| else | ||
| echo "=====> Considering host as DEAD because last write on [$hbFile] was [$diff] seconds ago, but the max interval is [$interval] <======" | ||
| fi |
There was a problem hiding this comment.
This is doing a numeric comparison but uses the string operator == and unquoted $diff. Use numeric operators (-eq, -ne) and quote variables in [ tests (or switch to [[ ... ]]) to avoid mis-comparisons or test errors if the variable is empty/unset.
| if [ $? -gt 0 ] | ||
| then | ||
| return | ||
| fi |
There was a problem hiding this comment.
$? here reflects only the exit status of the last command in the pipeline (awk), not whether ps/grep matched anything. As written, this check won’t behave as intended and is effectively redundant with the later -z "$vmPids" guard. Consider removing this $? block, or use set -o pipefail (with care for script-wide impact) or replace the pipeline with a more direct PID lookup (e.g., pgrep) and check for empty output.
| if [ $? -gt 0 ] | |
| then | |
| return | |
| fi |
| mounts=$(cat /proc/mounts | grep "$MountPoint") | ||
| if [ $? -gt 0 ] | ||
| then | ||
| # mount point not present — we don't remount in local-only script | ||
| # nothing to do here; keep for compatibility with original flow | ||
| : | ||
| else |
There was a problem hiding this comment.
grep "$MountPoint" can produce false positives when $MountPoint is a substring of another mount target (e.g., /mnt/cloudstack1 matching /mnt/cloudstack10). This can incorrectly treat the mount as present and trigger deleteVMs. Consider using a stricter mount check (e.g., matching the mount target field exactly or using findmnt).
There was a problem hiding this comment.
| mounts=$(cat /proc/mounts | grep "$MountPoint") | |
| if [ $? -gt 0 ] | |
| then | |
| # mount point not present — we don't remount in local-only script | |
| # nothing to do here; keep for compatibility with original flow | |
| : | |
| else | |
| mounts=$(cat /proc/mounts | grep "$MountPoint ") | |
| if [ $? -gt 0 ] | |
| then | |
| # mount point not present — we don't remount in local-only script | |
| # nothing to do here; keep for compatibility with original flow | |
| : | |
| else |
...ns/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/storage/KVMStoragePoolManager.java
Outdated
Show resolved
Hide resolved
|
|
||
| public class KVMHAMonitor extends KVMHABase implements Runnable { | ||
|
|
||
| public static final List<StoragePoolType> STORAGE_POOL_TYPES_WITH_HA_SUPPORT = List.of(StoragePoolType.NetworkFilesystem, StoragePoolType.SharedMountPoint); |
There was a problem hiding this comment.
HA-supported pool types are now defined in multiple places (e.g., also in CloudStackPrimaryDataStoreDriverImpl), and LibvirtStoragePool depends on KVMHAMonitor just to reuse this constant. Consider centralizing this list in a small shared utility/helper in an appropriate common package for the KVM plugin (or a single authoritative location) to avoid duplication and reduce cross-class coupling.
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✖️ debian ✔️ suse15. SL-JID 17048 |
|
@blueorangutan package |
|
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17054 |
|
@blueorangutan test |
|
@weizhouapache a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
@sureshanaparti @rajujith |
|
[SF] Trillian test result (tid-15586)
|
…/storage/KVMStoragePoolManager.java
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Description
This PR improves KVM HA heartbeat to support SharedMountPoint
Tested
Example of heartbeat files
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
How did you try to break this feature and the system with this change?