Add enabled switch for environments and nodes

squirrelsc · LiliDeng · commit 030485cfe1e5 · 2025-10-31T14:30:41.000+08:00
This change introduces an `enabled` boolean field
at both the environment and node levels, allowing
selective loading of configurations through
runbook variables.

Example:
  environment:
    environments:
      - name: my_env
        enabled: $(use_first_env)  # Variable-controlled
        nodes:
          - type: local
            name: node1
            enabled: true
          - type: local
            name: node2
            enabled: false  # Skip this node
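
The skip behavior can be sketched in isolation as follows. This is a
minimal illustration, not the LISA implementation: `NodeConfig`,
`EnvironmentConfig`, and `load_enabled` are hypothetical names used only
here, and variable substitution such as `$(use_first_env)` is assumed to
have already produced plain booleans.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeConfig:
    # Hypothetical stand-in for a node entry in the runbook.
    name: str = ""
    enabled: bool = True  # same default as the new schema field


@dataclass
class EnvironmentConfig:
    # Hypothetical stand-in for an environment entry in the runbook.
    name: str = ""
    enabled: bool = True
    nodes: List[NodeConfig] = field(default_factory=list)


def load_enabled(environments: List[EnvironmentConfig]) -> Dict[str, List[str]]:
    """Return {environment name: [node names]} for enabled entries only."""
    loaded: Dict[str, List[str]] = {}
    for env in environments:
        if not env.enabled:
            continue  # a disabled environment is skipped entirely
        # Within an enabled environment, only enabled nodes are kept.
        loaded[env.name] = [node.name for node in env.nodes if node.enabled]
    return loaded


runbook = [
    EnvironmentConfig(
        name="my_env",
        nodes=[NodeConfig("node1"), NodeConfig("node2", enabled=False)],
    ),
    EnvironmentConfig(name="other_env", enabled=False, nodes=[NodeConfig("node3")]),
]
result = load_enabled(runbook)
print(result)  # {'my_env': ['node1']}
```

Entries that omit `enabled` keep the default of true, so existing
runbooks should load the same set of environments and nodes as before.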
diff --git a/docs/run_test/runbook.rst b/docs/run_test/runbook.rst
@@ -9,6 +9,7 @@ Runbook Reference
    -  `Use variable and secrets <#use-variable-and-secrets>`__
    -  `Use partial runbook <#use-partial-runbook>`__
    -  `Use extensions <#use-extensions>`__
+   -  `Conditionally enable/disable environments or nodes <#conditionally-enable-disable-environments-or-nodes>`__
 
 -  `Reference <#reference>`__
 
@@ -89,6 +90,7 @@ Runbook Reference
       -  `environments <#environments>`__
 
          -  `name <#name-4>`__
+         -  `enabled <#enabled>`__
          -  `topology <#topology>`__
          -  `nodes <#nodes>`__
          -  `nodes_requirement <#nodes-requirement>`__
@@ -153,8 +155,8 @@ name ``hello``.
 
 Below section demonstrates how to configure test cases with retry, repetition,
 and timeout settings. The first test case will automatically retry up to 2 times
-if it fails, redeploying the environment for each retry attempt. The second test 
-case demonstrates stress testing by running 3 times unconditionally (regardless 
+if it fails, redeploying the environment for each retry attempt. The second test
+case demonstrates stress testing by running 3 times unconditionally (regardless
 of pass/fail) with a custom timeout of 1 hour.
 
 .. code:: yaml
@@ -257,6 +259,59 @@ modules for test cases or extended features.
        path: ../../extensions
      - ../../lisa/microsoft/testsuites/core
 
+Conditionally enable/disable environments or nodes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+You can use the ``enabled`` field to conditionally enable or disable entire
+environments or individual nodes within an environment. This is particularly
+useful when combined with variables for dynamic configuration.
+
+The example below shows how to enable or disable environments based on variables:
+
+.. code:: yaml
+
+   variable:
+     - name: use_prod
+       value: true
+     - name: use_dev
+       value: false
+
+   environment:
+     environments:
+       - name: production_env
+         enabled: $(use_prod)  # Controlled by variable
+         nodes:
+           - type: local
+       - name: dev_env
+         enabled: $(use_dev)  # This environment will be skipped
+         nodes:
+           - type: local
+
+The example below shows how to selectively disable specific nodes within an environment:
+
+.. code:: yaml
+
+   environment:
+     environments:
+       - name: multi_node_env
+         nodes:
+           - name: primary_node
+             type: local
+             enabled: true  # Always enabled
+           - name: secondary_node
+             type: local
+             enabled: false  # Temporarily disabled
+           - name: optional_node
+             type: remote
+             address: 192.168.1.100
+             enabled: $(include_remote_node)  # Variable-controlled
+
+This allows you to:
+
+- Temporarily disable environments or nodes without deleting their configuration
+- Use variables to control which environments/nodes are active
+- Maintain multiple environment configurations and switch between them dynamically
+
 Use transformers
 ~~~~~~~~~~~~~~~~
 
@@ -794,7 +849,7 @@ test execution logs and code context from the LISA framework.
 The log_agent notifier uses a multi-agent AI system that combines:
 
 - **LogSearchAgent**: Specialized in searching and analyzing log files for error patterns
-- **CodeSearchAgent**: Examines source code files and analyzes implementations related to errors  
+- **CodeSearchAgent**: Examines source code files and analyzes implementations related to errors
 - **Magentic Orchestration**: Coordinates the agents to provide comprehensive analysis
 
 The analysis results are attached to test result messages and made available to
@@ -809,7 +864,7 @@ downstream notifiers and reporting systems.
 
 2. **Required Python packages** (automatically included with LISA):
    - python-dotenv
-   - semantic-kernel  
+   - semantic-kernel
    - azure-ai-inference
    - retry
 
@@ -827,7 +882,7 @@ azure_openai_api_key
 
 type: str, optional, default: ""
 
-Azure OpenAI API key for authentication. If not set, the notifier will use 
+Azure OpenAI API key for authentication. If not set, the notifier will use
 default authentication methods available in the environment.
 
 Note: This value is automatically marked as secret and will be masked in logs.
@@ -897,7 +952,7 @@ Example of log_agent notifier:
 5. **Evidence Gathering**: Searches for supporting evidence in logs
 6. **Root Cause Analysis**: Provides comprehensive analysis with actionable insights
 
-The AI analysis results are stored in the test result message's ``analysis["AI"]`` 
+The AI analysis results are stored in the test result message's ``analysis["AI"]``
 field and can be consumed by other notifiers like HTML or custom reporting systems.
 
 environment
@@ -925,6 +980,30 @@ type: str, optional, default is empty
 
 The name of the environment.
 
+enabled
+'''''''
+
+type: bool, optional, default is true
+
+Controls whether the environment is loaded and used during test execution. When
+set to ``false``, the environment will be skipped during initialization. This is
+useful for defining multiple similar environments in the same runbook.
+
+Example:
+
+.. code:: yaml
+
+   environment:
+     environments:
+       - name: prod_env
+         enabled: true  # This environment will be loaded
+         nodes:
+           - type: local
+       - name: dev_env
+         enabled: $(use_dev_env)  # Variable-controlled
+         nodes:
+           - type: local
+
 topology
 ''''''''
 
@@ -939,6 +1018,32 @@ List of node, it can be a virtual machine on Azure or Hyper-V, bare metal or
 others. For more information, refer to :ref:`write_test/concepts:node and
 environment`.
 
+Each node supports an ``enabled`` field:
+
+**enabled** (bool, optional, default is true): Controls whether the node is
+loaded during environment initialization. When set to ``false``, the node will
+be skipped. This is useful for selecting specific nodes from the same
+environment configuration.
+
+Example:
+
+.. code:: yaml
+
+   environment:
+     environments:
+       - name: test_env
+         nodes:
+           - name: node1
+             type: local
+             enabled: true  # This node will be loaded
+           - name: node2
+             type: local
+             enabled: false  # This node will be skipped
+           - name: node3
+             type: remote
+             address: 192.168.1.100
+             enabled: $(enable_node3)  # Variable-controlled
+
 nodes_requirement
 '''''''''''''''''
 
@@ -1038,15 +1143,15 @@ timeout
 
 type: int, optional, default is 0
 
-Timeout in seconds for each test case. When a test case runs, LISA uses the 
-maximum value between the timeout specified in the runbook and the test case's 
-own metadata timeout. If this field is set to 0 (default) or not specified, only 
-the test case's metadata timeout is used (which defaults to 3600 seconds / 1 hour 
-if not explicitly set in the test case). This allows you to extend timeouts for 
+Timeout in seconds for each test case. When a test case runs, LISA uses the
+maximum value between the timeout specified in the runbook and the test case's
+own metadata timeout. If this field is set to 0 (default) or not specified, only
+the test case's metadata timeout is used (which defaults to 3600 seconds / 1 hour
+if not explicitly set in the test case). This allows you to extend timeouts for
 specific test runs without modifying the test case code.
 
-Note that this timeout applies to the overall test case execution. Any additional 
-command-level timeouts set within the test case code itself will not be affected 
+Note that this timeout applies to the overall test case execution. Any additional
+command-level timeouts set within the test case code itself will not be affected
 by this setting.
 
 .. code:: yaml
diff --git a/lisa/environment.py b/lisa/environment.py
@@ -438,6 +438,12 @@ def _reset(self) -> None:
 
         has_default_node = False
         for node_runbook in self.runbook.nodes:
+            # Skip disabled nodes
+            if not node_runbook.enabled:
+                node_name = node_runbook.name or "unnamed"
+                self.log.info(f"skipping disabled node: {node_name}")
+                continue
+
             self.create_node_from_exists(
                 node_runbook=node_runbook,
             )
@@ -545,9 +551,17 @@ def load_environments(
         environments_runbook = root_runbook.environments
         for environment_runbook in environments_runbook:
             id_ = _get_environment_id()
+            name = environment_runbook.name or f"customized_{id_}"
+
+            # Skip disabled environments
+            if not environment_runbook.enabled:
+                log = _get_init_logger()
+                log.info(f"skipping disabled environment: {name}")
+                continue
+
             env = environments.from_runbook(
                 runbook=environment_runbook,
-                name=environment_runbook.name or f"customized_{id_}",
+                name=name,
                 is_predefined_runbook=True,
                 id_=id_,
             )
diff --git a/lisa/schema.py b/lisa/schema.py
@@ -1246,6 +1246,10 @@ class Node(TypedSchema, ExtendableSchemaMixin):
     name: str = ""
     is_default: bool = field(default=False)
 
+    # A node is disabled when this is False. This allows nodes to be disabled
+    # through variables.
+    enabled: bool = True
+
 
 @dataclass_json()
 @dataclass
@@ -1372,6 +1376,10 @@ class Environment:
     )
     nodes_requirement: Optional[List[NodeSpace]] = None
 
+    # An environment is disabled when this is False. This allows environments
+    # to be disabled through variables.
+    enabled: bool = True
+
     _original_nodes_requirement: Optional[List[NodeSpace]] = None
 
     def __post_init__(self, *args: Any, **kwargs: Any) -> None:
diff --git a/selftests/test_environment.py b/selftests/test_environment.py
@@ -292,3 +292,90 @@ def test_create_from_custom_local_remote(self) -> None:
                     self.assertEqual(r_n.custom_remote_field, CUSTOM_REMOTE)
                     done += 1
             self.assertEqual(2, done)
+
+    def test_disabled_environment_not_loaded(self) -> None:
+        # Create runbook with 3 environments: 2 enabled, 1 disabled
+        data = {
+            constants.ENVIRONMENTS: [
+                {
+                    "name": "enabled_env_1",
+                    constants.NODES: [
+                        {
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_LOCAL,
+                        }
+                    ],
+                },
+                {
+                    "name": "disabled_env",
+                    "enabled": False,
+                    constants.NODES: [
+                        {
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_REMOTE,
+                            constants.ENVIRONMENTS_NODES_REMOTE_ADDRESS: "1.2.3.4",
+                            constants.ENVIRONMENTS_NODES_REMOTE_PORT: 22,
+                            constants.ENVIRONMENTS_NODES_REMOTE_USERNAME: "user",
+                            constants.ENVIRONMENTS_NODES_REMOTE_PASSWORD: "pass",
+                        }
+                    ],
+                },
+                {
+                    "name": "enabled_env_2",
+                    constants.NODES: [
+                        {
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_LOCAL,
+                        }
+                    ],
+                },
+            ]
+        }
+        runbook = schema.load_by_type(schema.EnvironmentRoot, data)
+        envs = load_environments(runbook)
+
+        # Only 2 environments should be loaded (disabled one skipped)
+        self.assertEqual(2, len(envs))
+        self.assertIn("enabled_env_1", envs)
+        self.assertIn("enabled_env_2", envs)
+        self.assertNotIn("disabled_env", envs)
+
+    def test_disabled_node_not_loaded(self) -> None:
+        # Create environment with 3 nodes: 2 enabled, 1 disabled
+        data = {
+            constants.ENVIRONMENTS: [
+                {
+                    "name": "test_env",
+                    constants.NODES: [
+                        {
+                            "name": "node1",
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_LOCAL,
+                        },
+                        {
+                            "name": "node2",
+                            "enabled": False,
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_REMOTE,
+                            constants.ENVIRONMENTS_NODES_REMOTE_ADDRESS: "1.2.3.4",
+                            constants.ENVIRONMENTS_NODES_REMOTE_PORT: 22,
+                            constants.ENVIRONMENTS_NODES_REMOTE_USERNAME: "user",
+                            constants.ENVIRONMENTS_NODES_REMOTE_PASSWORD: "pass",
+                        },
+                        {
+                            "name": "node3",
+                            constants.TYPE: constants.ENVIRONMENTS_NODES_LOCAL,
+                        },
+                    ],
+                },
+            ]
+        }
+        runbook = schema.load_by_type(schema.EnvironmentRoot, data)
+        envs = load_environments(runbook)
+
+        # Environment should be loaded
+        self.assertEqual(1, len(envs))
+        env = envs.get("test_env")
+        assert env
+
+        # Only 2 nodes should be loaded (disabled one skipped)
+        self.assertEqual(2, len(env.nodes))
+        node_names = [node.name for node in env.nodes.list()]
+        self.assertIn("node1", node_names)
+        self.assertIn("node3", node_names)
+        self.assertNotIn("node2", node_names)
diff --git a/selftests/test_platform.py b/selftests/test_platform.py
@@ -156,7 +156,7 @@ def test_prepared_env_not_success_with_exception(self) -> None:
             "no capability found for environment: Environment("
             "name='customized_0', topology='subnet', nodes_raw=[{'type': 'local', "
             "'capability': {'core_count': {'min': 4}}}], nodes_requirement=None, "
-            "_original_nodes_requirement=None)",
+            "enabled=True, _original_nodes_requirement=None)",
             str(cm.exception),
         )
 
