Conflict Between Short-Lived Process Capture (#862) and Multi-Container Environments (#863)

Hey @cfc4n, I'm experiencing a fundamental conflict when trying to capture HTTPS traffic from short-lived processes in a multi-container Kubernetes environment. The recommendations from issue #862 (use --pid=0) and #863 (use --pid=SPECIFIC_PID with container paths) are mutually exclusive.
## Background

I followed the guidance from:

- Issue #862: Use --pid=0 to capture short-lived processes that spawn and exit quickly
- Issue #863: Use --pid=SPECIFIC_PID with /proc/PID/root/... paths for multi-container environments

However, these approaches conflict in Kubernetes environments where:

- Processes are short-lived (<1 second lifespan, e.g., curl commands)
- Multiple containers run on the same node with different filesystem namespaces
- Process detection and eCapture startup take ~800-1000ms

## Current Implementation
Based on advice from #863, I'm using per-PID eCapture instances:
```go
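// Detection code (fragment; needs "fmt" and "os/exec")
func (o *AutoOrchestrator) startCaptureForLibrary(lib *LibraryInfo) error {
    // Build command with specific PID.
    // wsPort is the per-session WebSocket port, chosen elsewhere (not shown).
    cmd := exec.Command("/ecapture", "tls",
        fmt.Sprintf("--libssl=/proc/%d/root/usr/lib/x86_64-linux-gnu/libssl.so.1.1", lib.PID),
        fmt.Sprintf("--pid=%d", lib.PID),  // Specific PID, not --pid=0
        "-m", "text",
        "--hex=false",
        fmt.Sprintf("--ecaptureq=ws://127.0.0.1:%d/", wsPort))

    // Start without waiting; report launch failures instead of ignoring them
    if err := cmd.Start(); err != nil {
        return fmt.Errorf("start ecapture for PID %d: %w", lib.PID, err)
    }
    // ... WebSocket connection logic
    return nil
}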
```

**Detection loop**: Scans `/proc` every 30 seconds to detect new processes with SSL libraries
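For reference, the per-process check in the detection loop amounts to something like the sketch below. It is a simplified illustration, not the actual scanner: `findLibSSL` and `containerPath` are hypothetical helper names, and the matching rule (base name starts with `libssl.so`) is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// findLibSSL scans the text of a /proc/<pid>/maps file and returns the
// first mapped path whose base name starts with "libssl.so".
// (Hypothetical helper; the matching rule is an assumption.)
func findLibSSL(maps string) (string, bool) {
	sc := bufio.NewScanner(strings.NewReader(maps))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// File-backed mappings have at least six fields:
		// address perms offset dev inode pathname
		if len(fields) < 6 {
			continue
		}
		path := fields[5]
		base := path[strings.LastIndex(path, "/")+1:]
		if strings.HasPrefix(base, "libssl.so") {
			return path, true
		}
	}
	return "", false
}

// containerPath rewrites a path as seen inside the process's mount
// namespace into one the host can open through /proc/<pid>/root.
func containerPath(pid int, path string) string {
	return fmt.Sprintf("/proc/%d/root%s", pid, path)
}

func main() {
	// Sample maps line; the address range and inode are made up.
	sample := "7f1c2a000000-7f1c2a01d000 r--p 00000000 08:01 131 /usr/lib/x86_64-linux-gnu/libssl.so.1.1\n"
	if p, ok := findLibSSL(sample); ok {
		fmt.Println(containerPath(275721, p))
	}
}
```

The scanner can only see a process while its `/proc/<pid>` entry exists, which is why the scan interval matters so much in the timeline that follows.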

## What's Happening - The Race Condition

### Timeline of Events:
```shell
T+0ms:    Curl process spawns (PID 275721)
T+50ms:   SSL library loaded
T+200ms:  HTTPS request made
T+500ms:  Curl exits ✅ (request complete)
T+30000ms: Scanner detects PID 275721 in /proc/275721/maps
T+30200ms: eCapture command launched
T+30900ms: eBPF hooks attached
T+31000ms: WebSocket connection established
T+31001ms: ❌ Process is already dead - nothing to capture
```

### Actual Logs:
```shell
{"level":"info","time":"2025-11-25T11:42:53Z","message":"🔧 Starting PER-CONTAINER eCapture for PID=275721"}
{"level":"info","time":"2025-11-25T11:42:53Z","message":"✅ eCapture started for Container PID=275721"}
{"level":"info","time":"2025-11-25T11:42:54Z","message":"✅ WebSocket connected for openssl:...:275721"}
{"level":"debug","time":"2025-11-25T11:42:54Z","message":"📋 Process log: {\"target PID\":275721}"}
{"level":"error","time":"2025-11-25T11:42:55Z","message":"❌ WebSocket read error: EOF"}
```

Result: eCapture successfully attaches to PID 275721, but the process exited 30 seconds ago. The WebSocket immediately receives EOF because there's no process to monitor.
## The Fundamental Conflict

| Requirement | `--pid=0` | `--pid=SPECIFIC_PID` |
|---|---|---|
| Capture short-lived processes | ✅ Works | ❌ Fails (process dies before attach) |
| Multi-container support | ❌ Fails (namespace isolation) | ✅ Works |
| Capture ongoing processes | ✅ Works | ✅ Works |

## Test Environment

- Kubernetes: 3-node cluster (EKS)
- Kernel: 6.8.0-1031-azure (eBPF supported)
- eCapture: v1.4.3
- Test workload: Debian container running:
  `while true; do curl -H "Authorization: Bearer token" https://httpbin.org/get; sleep 10; done`
- Process lifespan: ~500-800ms per curl execution
- Scanner interval: 30 seconds (to avoid overloading /proc)

## Attempted Solutions
1. ✅ Per-PID unique ports (fixed port collision)
Changed from:
`sessionKey := fmt.Sprintf("%s:%s", lib.LibraryType, lib.LibraryPath)`
To:
`sessionKey := fmt.Sprintf("%s:%s:%d", lib.LibraryType, lib.LibraryPath, lib.PID)`
Result: Port collisions eliminated, but short-lived processes still missed.
2. ❌ Faster scanning (tried 5-second intervals)
Result: High CPU usage, still couldn't catch processes that live <1 second.
3. ❌ Pre-launching eCapture with --pid=0
Problem: Can't use container-specific paths like /proc/275721/root/usr/lib/libssl.so.1.1 with --pid=0 because different containers need different library paths.
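To illustrate why the session-key change in attempt 1 removed the collisions, here is a small standalone sketch. The `LibraryInfo` struct is a stand-in with assumed field types; only the field names and the two format strings come from my code.

```go
package main

import "fmt"

// LibraryInfo is a stand-in for the real struct; field types are assumed.
type LibraryInfo struct {
	LibraryType string
	LibraryPath string
	PID         int
}

// oldSessionKey ignores the PID, so two processes that load the same
// library path map to the same session (and therefore the same port).
func oldSessionKey(lib LibraryInfo) string {
	return fmt.Sprintf("%s:%s", lib.LibraryType, lib.LibraryPath)
}

// newSessionKey includes the PID, giving each process its own session.
func newSessionKey(lib LibraryInfo) string {
	return fmt.Sprintf("%s:%s:%d", lib.LibraryType, lib.LibraryPath, lib.PID)
}

func main() {
	// Two example processes using the same library path.
	a := LibraryInfo{"openssl", "/usr/lib/x86_64-linux-gnu/libssl.so.1.1", 275721}
	b := LibraryInfo{"openssl", "/usr/lib/x86_64-linux-gnu/libssl.so.1.1", 275800}

	fmt.Println("old keys collide:", oldSessionKey(a) == oldSessionKey(b))
	fmt.Println("new keys collide:", newSessionKey(a) == newSessionKey(b))
}
```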
## Questions

1. Is it possible to capture short-lived processes (<1s) in multi-container environments?
2. Can eCapture use --pid=0 with namespace-aware library paths? For example:
   `/ecapture tls --libssl=/proc/*/root/usr/lib/libssl.so.1.1 --pid=0`
3. Does eBPF support "pre-hooking"? Can we attach hooks to a library path before any process loads it, so hooks are already in place when processes spawn?
4. Alternative approach? Should I:
   - Accept that short-lived processes can't be captured in multi-container setups?
   - Use --pid=0 per container namespace (how?)?
   - Use a different capture strategy entirely?
