@tjmoyes tjmoyes commented Mar 25, 2025

machine_memory_bytes and machine_cpu_cores have been [deprecated](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/metrics/cluster/node-metrics.md), [as per StackOverflow](https://stackoverflow.com/questions/63901926/how-to-query-the-total-memory-available-to-kubernetes-nodes).

This issue came up for me when querying Azure Managed Prometheus, where these metrics no longer exist. I've tested the new queries on both a manually deployed Prometheus and Azure Managed Prometheus; they work on both and return the same data.

…ch no longer works in Azure Monitored Prometheus

CLAassistant commented Mar 25, 2025

CLA assistant check
All committers have signed the CLA.


coderabbitai bot commented Oct 26, 2025

Walkthrough

The Prometheus queries for cluster summary metrics were updated to retrieve data from kube_node_status_capacity with resource labels instead of machine_memory_bytes and machine_cpu_cores. Data gathering logic and control flow remain unchanged.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Prometheus metric queries: `robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py` | Updated get_cluster_summary queries to use the kube_node_status_capacity metric with resource labels (cpu, memory) instead of machine_memory_bytes and machine_cpu_cores for computing cluster resource totals |
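
To make the change concrete, here is a minimal sketch of how the two cluster-total queries could be assembled. `build_capacity_query` and its `extra_matchers` parameter are hypothetical names used only for illustration and are not part of krr. The sketch assumes any extra cluster matcher arrives as a complete `label="value"` string, and it joins matchers with commas, as PromQL label selectors expect.

```python
from typing import Sequence


def build_capacity_query(resource: str, extra_matchers: Sequence[str] = ()) -> str:
    """Return a PromQL expression summing node capacity for one resource.

    Hypothetical helper: `extra_matchers` holds complete matchers such as
    'cluster="prod"'; they are appended after the resource matcher.
    """
    matchers = [f'resource="{resource}"', *extra_matchers]
    selector = ", ".join(matchers)
    # Doubled braces ({{ and }}) emit literal braces inside an f-string.
    return f"sum(kube_node_status_capacity{{{selector}}})"


memory_query = build_capacity_query("memory")
# sum(kube_node_status_capacity{resource="memory"})
cpu_query = build_capacity_query("cpu", ['cluster="prod"'])
# sum(kube_node_status_capacity{resource="cpu", cluster="prod"})
```

Building the selector from a list keeps the comma handling in one place, so an empty cluster label does not leave a stray separator in the query.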

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

  • Verify that kube_node_status_capacity metric is available in typical Prometheus setups and returns equivalent data
  • Confirm the resource label filtering (cpu, memory) correctly aggregates cluster totals
  • Check for any edge cases where the new metric might behave differently from the previous sources
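
The equivalence check in the list above could be sketched roughly as follows. `totals_match`, `run_query`, and the 1% tolerance are assumptions made for this example; `run_query` stands in for whatever client actually executes PromQL and returns a single number.

```python
import math
from typing import Callable


def totals_match(
    run_query: Callable[[str], float],
    old_query: str,
    new_query: str,
    rel_tol: float = 0.01,
) -> bool:
    """Return True when both queries yield totals within `rel_tol` of each other."""
    old_total = run_query(old_query)
    new_total = run_query(new_query)
    return math.isclose(old_total, new_total, rel_tol=rel_tol)


# Stand-in for a real Prometheus client; the values are invented for the example.
fake_results = {
    "sum(max by (instance) (machine_cpu_cores))": 48.0,
    'sum(kube_node_status_capacity{resource="cpu"})': 48.0,
}
old_cpu, new_cpu = fake_results  # keys in insertion order: old query, new query
cpu_equivalent = totals_match(fake_results.__getitem__, old_cpu, new_cpu)
```

Passing the query runner as a callable keeps the comparison testable without a live Prometheus endpoint.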

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The pull request title "Update memory_query and cpu_query for Prometheus" accurately summarizes the main change in the changeset. According to the raw summary, the changes switch from using deprecated metrics (machine_memory_bytes and machine_cpu_cores) to using kube_node_status_capacity for cluster memory and CPU queries in Prometheus. The title is concise, specific, and clearly indicates what is being updated and for which service, allowing teammates scanning the history to quickly understand the primary purpose of this PR. |
| Description Check | ✅ Passed | The pull request description is directly related to the changeset and provides meaningful context for the changes. It explains that the deprecated metrics (machine_memory_bytes and machine_cpu_cores) caused issues with Azure Managed Prometheus, references relevant documentation, and states that the author tested the new queries on both manually deployed Prometheus and Azure Managed Prometheus to verify they return equivalent data. The description is neither vague nor off-topic; it clearly communicates the motivation and validation for the update. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7afd7d0 and 9988554.

📒 Files selected for processing (1)
  • robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py (1 hunks)

Comment on lines 256 to 258
         memory_query = f"""
-            sum(max by (instance) (machine_memory_bytes{{ {single_cluster_label} }}))
+            sum(kube_node_status_capacity{{ resource='memory' {single_cluster_label} }})
         """

⚠️ Potential issue | 🔴 Critical

Fix PromQL syntax: use double quotes for label values.

PromQL requires double quotes for label values, not single quotes. The current syntax resource='memory' will cause a query syntax error.

Apply this diff to fix the syntax:

         memory_query = f"""
-            sum(kube_node_status_capacity{{ resource='memory' {single_cluster_label} }})
+            sum(kube_node_status_capacity{{ resource="memory" {single_cluster_label} }})
         """
🤖 Prompt for AI Agents
In
robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py
around lines 256 to 258, the PromQL uses single quotes for the label value
(resource='memory') which is invalid; change the label value to use double
quotes (resource="memory") in the f-string query, and scan nearby queries for
other label values using single quotes and convert them to double quotes as
well.
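
The scan for single-quoted label values that the prompt asks for could look roughly like this. `SINGLE_QUOTED_MATCHER` and `use_double_quotes` are hypothetical names for this sketch; it relies on a simple regular expression rather than a PromQL parser and assumes quoted values contain no quotes or backslashes.

```python
import re

# Matches label matchers such as resource='memory' or pod=~'api-.*'.
# Simplification: the quoted value must not contain quotes or backslashes.
SINGLE_QUOTED_MATCHER = re.compile(
    r"([A-Za-z_][A-Za-z0-9_]*\s*(?:=~|!~|!=|=)\s*)'([^'\"\\]*)'"
)


def use_double_quotes(promql: str) -> str:
    """Rewrite single-quoted label values in `promql` to double-quoted ones."""
    return SINGLE_QUOTED_MATCHER.sub(r'\1"\2"', promql)


before = "sum(kube_node_status_capacity{ resource='memory' })"
after = use_double_quotes(before)
# after == 'sum(kube_node_status_capacity{ resource="memory" })'
```

This only normalizes quoting style; it does not validate the rest of the expression, such as the comma separation between matchers.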

Comment on lines 260 to 262
         cpu_query = f"""
-            sum(max by (instance) (machine_cpu_cores{{ {single_cluster_label} }}))
+            sum(kube_node_status_capacity{{ resource='cpu' {single_cluster_label} }})
         """

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Fix PromQL syntax: use double quotes for label values.

PromQL requires double quotes for label values, not single quotes. The current syntax resource='cpu' will cause a query syntax error.

Apply this diff to fix the syntax:

         cpu_query = f"""
-            sum(kube_node_status_capacity{{ resource='cpu' {single_cluster_label} }})
+            sum(kube_node_status_capacity{{ resource="cpu" {single_cluster_label} }})
         """

After applying the fixes, please verify that both queries execute successfully in your Prometheus environments:


🏁 Script executed:

#!/bin/bash
# Description: Test the corrected Prometheus queries return valid results
# This verifies the syntax is correct and metrics exist

# Note: Replace PROMETHEUS_URL with your actual Prometheus endpoint
PROMETHEUS_URL="${PROMETHEUS_URL:-http://localhost:9090}"

echo "Testing memory query..."
curl -s "${PROMETHEUS_URL}/api/v1/query" \
  --data-urlencode 'query=sum(kube_node_status_capacity{resource="memory"})' | jq '.data.result'

echo -e "\nTesting CPU query..."
curl -s "${PROMETHEUS_URL}/api/v1/query" \
  --data-urlencode 'query=sum(kube_node_status_capacity{resource="cpu"})' | jq '.data.result'

Length of output: 45


I need to verify the PromQL syntax rules and inspect the actual codebase since the Prometheus endpoint is unavailable in the sandbox.

Now let me inspect the actual file to verify the current state of the code:


---

🏁 Script executed:

```shell
#!/bin/bash
# Inspect the prometheus_metrics_service.py file around lines 260-262
fd -t f "prometheus_metrics_service.py" --exec cat {} +
```

Length of output: 16651


Fix PromQL syntax: use double quotes for label values in both queries.

PromQL requires double quotes for label values, not single quotes. The current syntax with single quotes will cause query syntax errors. Both the memory_query and cpu_query contain this issue.

Apply this diff to fix both queries:

         memory_query = f"""
-            sum(kube_node_status_capacity{{ resource='memory' {single_cluster_label} }})
+            sum(kube_node_status_capacity{{ resource="memory" {single_cluster_label} }})
         """

         cpu_query = f"""
-            sum(kube_node_status_capacity{{ resource='cpu' {single_cluster_label} }})
+            sum(kube_node_status_capacity{{ resource="cpu" {single_cluster_label} }})
         """

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In
robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py
around lines 260 to 262, the PromQL label values use single quotes which is
invalid; update the cpu_query and the corresponding memory_query to use
double-quoted label values (e.g. resource="cpu" and resource="memory"), making
sure the f-string and surrounding triple-quoted string are adjusted so the
double quotes appear in the final PromQL expression.
