Skip to content

Conversation

@VeNoMZiTo
Copy link
Contributor

@VeNoMZiTo VeNoMZiTo commented Oct 9, 2025

Summary

Disk metrics were missing for LVM/device-mapper volumes because the agent
looked up IO counters using /dev/mapper/<lv>-style names, while gopsutil
exposes IOCounters keyed by kernel device names (e.g., dm-0).

Root cause

  • collectIOStats stripped /dev/ and attempted to query disk.IOCounters() with
    names like mapper/ubuntu--vg-ubuntu--lv or its basename.
  • For LVM, gopsutil reports dm-* keys (e.g., dm-0), and filepath.EvalSymlinks
    does not resolve device-mapper pseudo devices, so no matching entry was found.
  • Result: errors like: device stats not found: /dev/mapper/ubuntu--vg-ubuntu--lv (tried: mapper/ubuntu--vg-ubuntu--lv, ubuntu--vg-ubuntu--lv)

Fix

  • Fetch all IO counters once with disk.IOCounters() and search by key.
  • Build multiple lookup candidates for each device:
  • strip /dev/ (e.g., sda, mapper/vg-lv)
  • basename(device) (e.g., vg-lv)
  • symlink-resolved path when applicable (e.g., /dev/dm-0dm-0)
  • new: map /dev/mapper/<name> to the corresponding dm-* by scanning
    /sys/block/dm-*/dm/name and matching <name> to the dm logical name.
  • Fallback: if no key match, compare against stat.Name.
  • De-duplicate candidates. Windows handling remains unchanged.

Why this works

  • On Linux with LVM, the authoritative IO counter keys are dm-*.
  • /sys/block/dm-*/dm/name reliably exposes the mapper logical name, which
    allows us to derive the correct dm-* key used by gopsutil.

Testing

Environment: Ubuntu (LVM root)

  1. Built agent with this patch, replaced the running binary in the test container.
  2. Queried /api/v1/metrics and confirmed disk metrics now appear for the LVM LV.
  3. Verified no regressions on non-LVM devices (e.g., sda, nvme0n1).
  4. Verified Windows code path is unchanged (no functional changes there).

Before (error): device stats not found: /dev/mapper/ubuntu--vg-ubuntu--lv (tried: mapper/ubuntu--vg-ubuntu--lv, ubuntu--vg-ubuntu--lv)

After (ok):

  • disk.read_bytes, disk.write_bytes, disk.read_time, disk.write_time
    present for the LVM volume.

Performance & compatibility

  • One IOCounters() call per collection cycle (unchanged or cheaper than
    repeated lookups).
  • One small /sys/block scan per device only when a mapper path is detected.
  • No behavioral change on Windows or non-LVM Linux.

Credits

Big thanks to @caleb for identifying the root cause and the right direction.

Checklist

  • Handles LVM /dev/mapper/*dm-*
  • Keeps existing filters (ZFS, loop, special partitions) intact
  • No breaking changes for Windows/non-Linux
  • Added candidate de-dup and stat.Name fallback

Release note

Fix disk IO metrics not showing up on LVM/device-mapper volumes by mapping
/dev/mapper/* to the appropriate dm-* device when reading gopsutil counters.

disk: support LVM/device-mapper by resolving /dev/mapper to dm-* and trying multiple candidates in collectIOStats
Enhanced comments for clarity and added functionality to resolve device mapper paths.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 9, 2025

Walkthrough

Expanded disk IO stats key resolution for LVM/device-mapper by scanning /sys/block to map /dev/mapper and dm-* names to their dm-* counterparts. Added symlink resolution, basename handling, deduplication, and a new helper to locate dm devices via dm/name in sysfs. Imports updated to include os.

Changes

Cohort / File(s) Summary
Disk metrics LVM/dm resolution
internal/metric/disk.go
- Added os import
- Broadened LVM/device-mapper mapping by scanning /sys/block/dm-*/dm/name
- Enhanced key candidate generation: basename, symlink resolution, dm mapping, dedup/trim
- New helper findDMForMapperBase to map /dev/mapper/<name> to dm-* via sysfs

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Collector
  participant K as buildDeviceKeyCandidates
  participant S as Sysfs (/sys/block)
  participant H as findDMForMapperBase

  C->>K: Request key candidates for device path
  K->>K: Add basename and resolved symlink basename
  alt mapper or dm-* basename
    K->>H: findDMForMapperBase(mapperBase)
    H->>S: Read dm-*/dm/name entries
    S-->>H: Return matching dm-* (exact/relaxed)
    H-->>K: dm-* device name (if found)
    K->>K: Include mapped dm-* candidate
  end
  K->>K: Deduplicate and trim keys
  K-->>C: Candidate keys for IOCounters lookup
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I hop through sysfs, whiskers keen,
Sniffing dm-names where blocks convene.
Mapper paths untangle, neat and prime,
Symlinks trimmed, deduped in time.
With quiet paws I map the way—
IO counters won’t go astray. 🐰💽

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly captures the principal change by describing how disk metrics for LVM volumes are fixed through mapping /dev/mapper/* to dm-* when reading IO counters. It is concise, specific, and accurately reflects the changeset’s intent for any reviewer scanning the history.
Description Check ✅ Passed The description thoroughly explains the motivation, root cause, implementation details, testing, and compatibility, all directly tied to the changeset for LVM disk IO metrics. It is clearly related to the code modifications and does not contain off-topic content, satisfying the lenient description criteria.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@mertssmnoglu mertssmnoglu changed the base branch from main to develop October 9, 2025 13:03
@mertssmnoglu
Copy link
Member

We're using develop branch as the default, so I switched the base branch to develop. There are some git conflicts, can you resolve these conflicts please? Thanks!

@VeNoMZiTo
Copy link
Contributor Author

We're using develop branch as the default, so I switched the base branch to develop. There are some git conflicts, can you resolve these conflicts please? Thanks!

changes made

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
internal/metric/disk.go (1)

189-195: Remove redundant dm- basename check.

The strings.HasPrefix(base, "dm-") branch never yields a new key—findDMForMapperBase("dm-0") returns empty since sysfs dm/name entries hold logical names, not dm-X. Drop || strings.HasPrefix(base, "dm-") to simplify the condition.

-	if strings.HasPrefix(d, "/dev/mapper/") || strings.HasPrefix(base, "dm-") {
+	if strings.HasPrefix(d, "/dev/mapper/") {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 844f725 and 3a1edfb.

📒 Files selected for processing (1)
  • internal/metric/disk.go (5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build (windows-2025)
  • GitHub Check: lint
🔇 Additional comments (4)
internal/metric/disk.go (4)

4-4: LGTM!

The os import is necessary for the new findDMForMapperBase function.


123-125: LGTM!

The documentation accurately describes the LVM/device-mapper handling.


246-250: Verify the relaxed comparison doesn't cause false positives.

The fallback comparison removes slashes and compares case-insensitively. While this adds robustness for edge cases, it could potentially match the wrong dm device if two devices have similar names (e.g., ubuntu-vg-ubuntu-lv vs ubuntu/vg/ubuntu/lv).

Additionally, the comment mentions LVM encoding - as --, but the logic removes slashes (/), which are less common in LVM names. Could you clarify when this slash-removal logic would be beneficial? Consider either:

  1. Documenting a specific real-world scenario where this is needed, or
  2. Tightening the relaxed comparison to reduce false positive risk (e.g., only case-insensitive, no slash removal).

If you've observed specific udev rule edge cases that require this relaxed comparison, please document them in a code comment with an example.


219-254: Validate performance on systems with many dm devices
The function iterates over /sys/block/dm-* and reads each dm/name on every call, which may add latency if you have 20+ LVM volumes. Since this runs periodically, it’s probably fine, but please benchmark this path on a real system (e.g., measure average execution time over multiple runs) to ensure it remains within acceptable bounds. If you observe significant delays, consider caching the mapping or lowering the call frequency.

"runtime"
"strings"

"github.com/shirou/gopsutil/v4/disk"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please run gofmt -w . to fix import and documentation styles? It will add a new line between internal and external imports, and update the function documentation with correct indentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the formatting is OK, I would like to merge this PR. @VeNoMZiTo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants