[CI Failure]: mi325_1: Multi-Modal Processor Test

### Name of failing test

`pip install git+https://github.com/TIGER-AI-Lab/Mantis.git && pytest -v -s models/multimodal/processing`

### Basic information

- [ ] Flaky test
- [x] Can reproduce locally
- [ ] Caused by external libraries (e.g. bug in `transformers`)

### 🧪 Describe the failing test

This test validates **multimodal input processing correctness** for vision-language models in vLLM, comparing cached vs non-cached processing paths.

**Purpose:**  
Ensures that the `MultiModalProcessorOnlyCache` produces identical results to baseline (uncached) processing across different multimodal inputs (images, videos, audio).

**Test Flow:**
1. **Parameterized testing** across multiple models, hit rates, and batch configurations
2. **Generates random multimodal data** with controlled cache hit rates (0.3, 0.5, 1.0)
3. **Creates two processors**: baseline (no cache) and cached versions
4. **Compares outputs** for both text and token prompts
5. **Validates equivalence** of processed inputs using `_assert_inputs_equal`
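
The flow above can be illustrated with a minimal, self-contained sketch. It does not use vLLM's actual processor or cache classes: `baseline_process`, `CachedProcessor`, `make_items`, and `run_check` are hypothetical stand-ins, and the batch size of 4 is an assumption chosen only for illustration.

```python
import hashlib
import random


def baseline_process(item: bytes) -> dict:
    """Hypothetical stand-in for uncached processing of one multimodal item."""
    return {"digest": hashlib.sha256(item).hexdigest(), "length": len(item)}


class CachedProcessor:
    """Hypothetical processor that memoizes results by content hash."""

    def __init__(self) -> None:
        self._cache: dict[str, dict] = {}
        self.hits = 0

    def process(self, item: bytes) -> dict:
        key = hashlib.sha256(item).hexdigest()
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        out = baseline_process(item)
        self._cache[key] = out
        return out


def make_items(rng: random.Random, num_items: int, hit_rate: float) -> list[bytes]:
    """Build a batch where each item repeats an earlier one with probability hit_rate."""
    items: list[bytes] = []
    for _ in range(num_items):
        if items and rng.random() < hit_rate:
            items.append(rng.choice(items))  # repeated item, so a cache hit
        else:
            items.append(bytes(rng.getrandbits(8) for _ in range(16)))
    return items


def run_check(hit_rate: float, num_batches: int = 32, seed: int = 0) -> int:
    """Compare cached and baseline outputs for every item; return the hit count."""
    rng = random.Random(seed)
    cached = CachedProcessor()
    for _ in range(num_batches):
        for item in make_items(rng, num_items=4, hit_rate=hit_rate):
            assert cached.process(item) == baseline_process(item)
    return cached.hits


if __name__ == "__main__":
    for rate in (0.3, 0.5, 1.0):
        print(rate, run_check(rate))
```

In this sketch the cache persists across batches while each batch draws its own items, so a higher `hit_rate` exercises the cached path more often without changing the expected outputs.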

**Key Test Parameters:**
- `model_id`: Various multimodal models (filtered to vLLM-only architectures)
- `hit_rate`: Cache hit probability (30%, 50%, 100%)
- `num_batches`: 32 batches per test run
- `simplify_rate`: Probability of converting multi-item inputs to single items (100%)

**Special Handling:**
- Model-specific patches for GLM4.1V and Qwen3-VL (video metadata requirements)
- Skipped models: `google/gemma-3n-E2B-it`, `OpenGVLab/InternVL2-2B`, `jinaai/jina-reranker-m0` (marked "Fix later")
- Ignores specific keys for certain models (e.g., Ultravox `audio_features`, due to padding differences)

**Assertions:**
Verifies that the baseline and cached processors produce byte-for-byte identical outputs for the same inputs (apart from the explicitly ignored keys), so that caching does not introduce processing errors.
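
As a rough illustration of comparing two processed outputs while skipping model-specific keys, here is a minimal sketch. The helper `assert_equal_ignoring` is hypothetical and is not the signature or implementation of the test suite's `_assert_inputs_equal`; plain dictionaries stand in for the real processed inputs.

```python
def assert_equal_ignoring(baseline: dict, cached: dict, ignore_keys=()) -> None:
    """Assert two output dicts match, skipping any keys listed in ignore_keys."""
    ignored = set(ignore_keys)
    baseline_kept = {k: v for k, v in baseline.items() if k not in ignored}
    cached_kept = {k: v for k, v in cached.items() if k not in ignored}
    assert baseline_kept == cached_kept, (
        f"outputs differ: {baseline_kept!r} != {cached_kept!r}"
    )


if __name__ == "__main__":
    # The values below are made up; only the ignored key is allowed to differ.
    assert_equal_ignoring(
        {"prompt_token_ids": [1, 2, 3], "audio_features": [0.1, 0.2]},
        {"prompt_token_ids": [1, 2, 3], "audio_features": [0.1, 0.2, 0.0]},
        ignore_keys=["audio_features"],
    )
```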

### 📝 History of failing test

**Test failure history:**

AMD-CI Buildkite builds in which the test failed:
- 1041
- 1077
- 1088
- 1109
- 1111

**Resolution in progress:**
`150 failed, 384 passed, 285 skipped, 18 warnings in 2672.68s (0:44:32)`
