
Conversation

@squirrelsc
Member

@squirrelsc squirrelsc commented Oct 27, 2025

Summary

Improves Process.wait_result() by adding a raise_on_timeout parameter for better timeout control, fixing result caching logic, and simplifying the code structure.

Key Changes

  • New parameter: raise_on_timeout: bool = True - allows callers to control whether timeouts raise exceptions
  • Fixed caching: Early return for cached results, respects timeout settings on subsequent calls
  • Simplified logic: Removed unnecessary nesting, use local result variable for clearer flow
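The behavior described in the list above could look roughly like the sketch below. It is not the code from this PR: `ProcessSketch`, `ResultSketch`, and the `is_timeout` field are hypothetical names used for illustration, and the assumption that a cached timed-out result still honors `raise_on_timeout` on later calls is one reading of "respects timeout settings on subsequent calls".

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResultSketch:
    # Hypothetical result type, not LISA's actual result class.
    stdout: str
    exit_code: Optional[int]
    is_timeout: bool = False


class ProcessSketch:
    """Hypothetical stand-in for Process; illustrates the flow only."""

    def __init__(self, cmd: List[str]) -> None:
        self._process = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        self._result: Optional[ResultSketch] = None

    def wait_result(
        self, timeout: float = 600, raise_on_timeout: bool = True
    ) -> ResultSketch:
        # Use a local variable so the cached and fresh paths share one flow.
        result = self._result
        if result is None:
            try:
                stdout, _ = self._process.communicate(timeout=timeout)
                result = ResultSketch(stdout, self._process.returncode)
            except subprocess.TimeoutExpired:
                # Stop the process and collect whatever output it produced.
                self._process.kill()
                stdout, _ = self._process.communicate()
                result = ResultSketch(stdout, None, is_timeout=True)
            self._result = result

        # Assumption: the caller's timeout preference applies to cached results too.
        if result.is_timeout and raise_on_timeout:
            raise TimeoutError(f"command timed out after {timeout} seconds")
        return result


if __name__ == "__main__":
    quick = ProcessSketch([sys.executable, "-c", "print('hello')"])
    print(quick.wait_result(timeout=30).stdout.strip())
```

A caller that prefers to inspect the result instead of handling an exception would pass `raise_on_timeout=False` and check the (hypothetical) `is_timeout` flag.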

@squirrelsc squirrelsc requested a review from LiliDeng as a code owner October 27, 2025 22:16
@squirrelsc
Member Author

@vyadavmsft Please check this fix. It should raise an exception on timeout, which should solve the problem in #4066.

@LiliDeng
Collaborator

While testing your PR, I came across an existing bug that was already present in the code.

2025-10-28 06:01:17.033[1796][DEBUG] lisa.local.cmd[2852] cmd: ['cmd', '/c', 'hostname'], cwd: None, shell: True, sudo: False, nohup: False, posix: False, remote: False, encoding: utf-8
2025-10-28 06:01:17.038[7952][ERROR] lisa.env[generated_6] case failed
Traceback (most recent call last):
  File "C:\app\lsg-lisa\lisa\lisa\runners\lisa_runner.py", line 297, in _deploy_environment_task
    self.platform.deploy_environment(environment)
  File "C:\app\lsg-lisa\lisa\lisa\platform_.py", line 188, in deploy_environment
    raise identifier
  File "C:\app\lsg-lisa\lisa\lisa\platform_.py", line 185, in deploy_environment
    self._deploy_environment(environment, log)
  File "C:\app\lsg-lisa\lisa\lisa\sut_orchestrator\azure\platform_.py", line 662, in _deploy_environment
    raise e
  File "C:\app\lsg-lisa\lisa\lisa\sut_orchestrator\azure\platform_.py", line 633, in _deploy_environment
    location, deployment_parameters = self._create_deployment_parameters(
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\app\lsg-lisa\lisa\lisa\sut_orchestrator\azure\platform_.py", line 1231, in _create_deployment_parameters
    arm_parameters.vm_tags["lisa_username"] = local().tools[Whoami].get_username()
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\app\lsg-lisa\lisa\lisa\tools\whoami.py", line 21, in get_username
    return self.run("", shell=True).stdout.strip()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\app\lsg-lisa\lisa\lisa\executable.py", line 329, in run
    return process.wait_result(
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\app\lsg-lisa\lisa\lisa\util\process.py", line 384, in wait_result
    assert self._process
           ^^^^^^^^^^^^^
AssertionError
2025-10-28 06:01:17.038[7952][INFO] lisa.runner[0] 'generated_6' attached to test case 'AzureImageStandard.verify_ifcfg_eth0(lisa_0_6)': 

Currently, timeouts in PowerShell tools are not handled properly: an exception is raised with an empty error message. The underlying process should therefore always raise exceptions so that errors are obvious. To stay compatible with some rare conditions, this behavior is configurable.
This avoids operating on an object that isn't
ready. It clarifies the code without changing its
behavior.
Add thread-safe locking to prevent race condition
where self._result could be None when entering
wait_result(), but self._process could be cleared
by another thread (via _recycle_resource()) before
being used.
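A minimal sketch of the locking idea in the commit message above, assuming a single lock guards both `_result` and `_process`. The class name, the placeholder process object, and the `RuntimeError` are hypothetical; the actual change in `lisa/util/process.py` may use a different lock scope or error type.

```python
import threading
from typing import Optional


class LockedProcessSketch:
    """Hypothetical illustration of guarding shared state with a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # Placeholder for the real process handle.
        self._process: Optional[object] = object()
        self._result: Optional[str] = None

    def wait_result(self) -> str:
        # Check the cache and use the process under the same lock, so
        # _recycle_resource() cannot clear _process between the two steps.
        with self._lock:
            if self._result is not None:
                return self._result
            if self._process is None:
                raise RuntimeError(
                    "process was recycled before a result was captured"
                )
            self._result = f"result from {type(self._process).__name__}"
            return self._result

    def _recycle_resource(self) -> None:
        # Clearing the handle takes the same lock as wait_result().
        with self._lock:
            self._process = None
```

Holding one lock across the whole wait keeps the sketch simple, but it also means a recycle call blocks until the wait finishes. A real implementation might narrow the locked region.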
@LiliDeng LiliDeng merged commit 8ae13dc into microsoft:main Nov 3, 2025
27 checks passed
@squirrelsc squirrelsc deleted the 10_27_process branch November 3, 2025 21:10
adityagesh pushed a commit that referenced this pull request Nov 7, 2025
Copilot AI pushed a commit that referenced this pull request Nov 14, 2025
LiliDeng added a commit that referenced this pull request Nov 18, 2025
…#4058)

* Initial plan

* Add unified message support for iperf3 TCP and UDP performance metrics

Co-authored-by: squirrelsc <[email protected]>

* Refactor unified message methods to use parsed fields and add connections_num as parameter

Co-authored-by: squirrelsc <[email protected]>

* Merge main and use Parameter relativity for connections_num and buffer_size metrics

Co-authored-by: squirrelsc <[email protected]>

* Remove conn_suffix from metric names

Co-authored-by: squirrelsc <[email protected]>

* Revert "Move examples and microsoft directories into the Python package (#4023)" (#4063)

This reverts commit 89e7b53.

* Reapply "Move examples and microsoft directories into the Python package (#4023)" (#4063)

This reverts commit efe1cd3.

* runbook: fix path for legacy layout

* Add UnifiedMessage support for NetworkLatencyPerformanceMessage

* kdump: Replace CvmDisabled with before_case SecurityProfile check (#4032)

* kdump: Replace CvmDisabled with before_case SecurityProfile check

* kdump: Fix SecurityProfile check to skip only CVM and Stateless VMs

- Remove empty simple_requirement() calls (unnecessary)

- Optimize f-string usage (only use f-prefix where needed)

- Remove unused simple_requirement import

* Add detailed panic categorization and error code extraction

* enrich SerialConsole.check_panic() to return detailed panic

* Added tests for network related components (#4009)

* notifier: remove pytest-html dependency

Replace pytest-html dependency with custom HTML
report generator using string.Template. This
change provides better control over report
formatting and reduces external dependencies.

* runbook: fix microsoft package name for new paths.

The new path is still able to be written like
"microsoft/testsuites", so that it needs to use
"microsoft" instead of "testsuites" as the package
name.

* Remove watchdog pattern from serial console panic detection (#4075)

* fix verify_cpu_count and improve PowerShell

- Implement calculate_vcpu_count() method in
  WindowsLscpu class to fix verify_cpu_count test
  failure on Windows
- Add null check for stderr in
  PowerShell.wait_result() to prevent errors when
  PowerShell is used to run cmd commands with no
  stderr output

* iDRAC: Handle HTTP 500 internal errors with service reset

* Fix Hyper-V Stop-VM to use TurnOff on timeout/failure

* Remove overly broad stall regex pattern causing false positive panic detections (#4082)

* Initial plan

* Remove overly broad stall regex pattern to prevent false alarms

Co-authored-by: lesscodingmorehappiness <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: lesscodingmorehappiness <[email protected]>

* Revert "skip test if hv_netvsc driver is not used"

This reverts commit f6fdcf7.

* change kselftest required /tmp/ size to 1GB for Overlake SoC limited space

* Add enabled switch for environments and nodes

This change introduces an `enabled` boolean field
at both the environment and node levels, allowing
selective loading of configurations through
runbook variables.

Example:
  environment:
    - name: my_env
      enabled: $(use_first_env)  # Variable-controlled
      nodes:
        - type: local
          name: node1
          enabled: true
        - type: local
          name: node2
          enabled: false  # Skip this node

* Process: Raise exception on timeout. (#4077)

* Skip tests on L1VH Nodes (#4078)

* mshv: skip checking logfile size on l1vh

L1VH parents by default don't have any entries in mshvlog file. Skip
checking logfile size on these nodes.

Signed-off-by: Praveen K Paladugu <[email protected]>

* mshv: skip mshvtrace test on l1vh Nodes

L1VH nodes cannot collect performance traces. Skip the related test
on the L1VH nodes.

Signed-off-by: Praveen K Paladugu <[email protected]>

---------

Signed-off-by: Praveen K Paladugu <[email protected]>

* Set minimum TLS setting 1.2 for storage accounts

Support for TLS 1.0 and 1.1 will be discontinued for all Azure Storage
accounts. The guidance is to migrate to minimum TLS version 1.2.

https://learn.microsoft.com/en-us/azure/storage/common/transport-layer-security-configure-migrate-to-tls2#why-use-tls-12

* Fix IPTable Test (#4088)

* Add virtualization feature

* doc: fix doc path after test code moved.

* doc: fix some build warnings.

* doc: allow duplicate test case names in different test suites.

* Fix VHD schema documentation to show nested hyperv_generation field (#4100)

* changes to install xxhash tool before building kernel

* Modprobe command update for when verbose is false

* Document resource_group_tags parameter for Azure runbook (#4101)

* Add Host version tracking for baremetal and HyperV platforms

* Convert GPU Driver installation to Tool, Add amd-smi (#4080)

* ch perf: Implement comprehensive performance stabilization framework

* Classify /bin/true redirections in kernel modules as not loaded

Previously, `is_module_loaded` returned True (loaded) when `modprobe -nv`
produced a blacklist directive like 'install /bin/true', causing test
cases like verify_floppy_module_is_blacklisted to fail although the module
was not actually loaded.

Added a minimal check for the install /bin/true pattern and now treat it
as not loaded, returning False.

* Kdump: Enhance error log for incomplete dump file

* Update Nested Feature Supported list in Azure

* Create dm-cache test (#4093)

* Fix nvme device path fetch logic

* DPDK: add netvsc rescind tests (#4076)

* Remove squirrelsc from CODEOWNERS file

Co-authored-by: squirrelsc <[email protected]>

* UnifiedPerfMessage: add metric_str_value to store string value (#4107)

* UnifiedPerfMessage: add str_value to store string value

* Rename str_value to metric_str_value in UnifiedPerfMessage (#4108)

* Initial plan

* Rename str_value to metric_str_value for consistency

Co-authored-by: squirrelsc <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: squirrelsc <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: squirrelsc <[email protected]>

* Pass through MIGRATABLE_VERSION from pipeline environment

* Add UnifiedMessage support for NetworkPPSPerformanceMessage (#4057)

* Initial plan

* Rebase on latest main branch

* Initial plan

* Initial plan

* Rebase on latest main branch

* Sync latest code from main branch

* Clean commit history - single commit for PR changes

* Add connections_num and buffer_size to metric names as suffix

- Remove separate connections_num and buffer_size_bytes metrics
- Add suffix format: _conn_{connections_num}_buffer_{buffer_size}
- Apply suffix to all TCP metrics: rx/tx_throughput_in_gbps, congestion_windowsize_kb, retransmitted_segments
- Apply suffix to all UDP metrics: rx/tx_throughput_in_gbps, data_loss
- This allows distinguishing results by connection count and buffer size

Co-authored-by: LiliDeng <[email protected]>

* Fix flake8 errors: remove trailing whitespace from blank lines

- Remove trailing whitespace from line 492 in send_iperf3_tcp_unified_perf_messages
- Remove trailing whitespace from line 534 in send_iperf3_udp_unified_perf_messages
- Fixes W293 flake8 warnings and BLK100 black formatting issue

Co-authored-by: LiliDeng <[email protected]>

---------

Signed-off-by: Praveen K Paladugu <[email protected]>
Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: squirrelsc <[email protected]>
Co-authored-by: LiliDeng <[email protected]>
Co-authored-by: Chi Song (from Dev Box) <[email protected]>
Co-authored-by: Vivek Yadav <[email protected]>
Co-authored-by: Balashivaram Ganesan <[email protected]>
Co-authored-by: lesscodingmorehappiness <[email protected]>
Co-authored-by: Panfeng Xue <[email protected]>
Co-authored-by: Praveen K Paladugu <[email protected]>
Co-authored-by: Sebastian Heid <[email protected]>
Co-authored-by: Umang Francis <[email protected]>
Co-authored-by: rabdulfaizy <[email protected]>
Co-authored-by: Aditya Nagesh <[email protected]>
Co-authored-by: Rachel Menge <[email protected]>
Co-authored-by: Kanchan Sen Laskar <[email protected]>
Co-authored-by: mcgov <[email protected]>
Co-authored-by: LiliDeng <[email protected]>


2 participants