[Bugfix] Fix O(n²) multimodal string prompt processing #29667
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small subset of tests runs automatically, and you can ask your reviewers to trigger select CI tests on top of that. Once the PR is approved and ready to go, your PR reviewer(s) can run the full CI to test the changes comprehensively before merging. If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀
Code Review
This pull request provides an excellent fix for a critical O(n²) performance issue in multimodal prompt processing. The problem was correctly identified as an inefficient loop that repeatedly scanned the prompt when no matches were found. The fix is simple, elegant, and effective, replacing the unnecessary loop increment with a break statement. The accompanying tests are thorough, including specific performance tests to validate the fix and regression tests to ensure no existing functionality is broken. This is a high-quality contribution that significantly improves performance.
(force-pushed 2f835ce to 147d97f)
_apply_matches() had O(n²) complexity when processing multimodal prompts. When no match was found, the loop incremented start_idx but _find_matches() used prev_end_idx (unchanged), re-scanning the entire prompt each iteration. Fix: Exit loop when no matches remain instead of incrementing uselessly. Impact: InternVL 5k token prompt: 2.6s → <10ms for processing. Signed-off-by: mertunsall <[email protected]>
(force-pushed 147d97f to 513557d)
Fixes #25327
After serving the model, this benchmarking script gets significantly faster with this PR.
@DarkLight1337 I edited the PR to get rid of start_idx as well!
Yep, it still works on the main branch. That makes sense, since my PR only tackles the case when MM data is passed in, while yours applies to text-only inputs as well.
Should be good to merge!
You'll probably have to fix some merge conflicts after #29668 is merged (which should happen quite soon).
This pull request has merge conflicts that must be resolved before it can be merged.
Hi @DarkLight1337 @Isotr0py - fixed the merge conflicts, would be very happy if we can merge!
What's the reason the checks don't pass?
Should we update? @ywang96
Should be fine - the CI's still running, and we don't need the PR to be completely aligned with main to be merged.
Known failure on main.
Problem
`_apply_matches()` had O(n²) complexity: a 5000-token multimodal prompt took 2.6 s for processing alone.
Cause
When no match was found, the loop incremented `start_idx`, but `_find_matches()` used `prev_end_idx` (unchanged), re-scanning the entire prompt each iteration.
Fix
Exit the loop when no matches remain instead of incrementing uselessly.
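The pattern described above can be illustrated with a standalone sketch (this is hypothetical code, not the actual vLLM implementation; the names `apply_matches` and `find_next_match` here only mimic the functions the PR refers to). The key line is the `break`: once a scan from the current position finds nothing, no later scan can succeed either, so continuing would just re-scan the same suffix over and over.

```python
def find_next_match(prompt: str, patterns: dict[str, str], from_idx: int):
    """Return (start, end, replacement) of the earliest pattern match at or
    after from_idx, or None if no pattern occurs again."""
    best = None
    for pat, repl in patterns.items():
        i = prompt.find(pat, from_idx)
        if i != -1 and (best is None or i < best[0]):
            best = (i, i + len(pat), repl)
    return best


def apply_matches(prompt: str, patterns: dict[str, str]) -> str:
    """Replace every pattern occurrence in a single left-to-right pass."""
    out = []
    prev_end_idx = 0
    while prev_end_idx < len(prompt):
        match = find_next_match(prompt, patterns, prev_end_idx)
        if match is None:
            # The fix: no matches remain anywhere to the right, so stop.
            # The buggy version instead bumped a separate index by one and
            # looped again, re-scanning the whole remaining prompt each time.
            break
        start, end, replacement = match
        out.append(prompt[prev_end_idx:start])
        out.append(replacement)
        prev_end_idx = end
    out.append(prompt[prev_end_idx:])
    return "".join(out)
```

With the `break`, a prompt containing no placeholders costs one scan instead of one scan per character, which is where the quadratic blow-up came from.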
Impact
InternVL 5k-token prompt: 2.6 s → <10 ms for processing.
Test Plan
python -m pytest tests/multimodal/test_processing.py::TestApplyMatchesPerformance -v
Test Result
Passed
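A performance regression test for this kind of fix can be sketched as follows (a hypothetical standalone version, not the PR's actual `TestApplyMatchesPerformance` code; `apply_once` is an illustrative stand-in for the fixed replacement loop). The worst case for the old code was a long prompt with no placeholder tokens at all, so that is the case worth timing.

```python
import time


def apply_once(prompt: str, token: str, repl: str) -> str:
    """Single left-to-right pass; break as soon as no match remains."""
    out, i = [], 0
    while True:
        j = prompt.find(token, i)
        if j == -1:
            break  # the fix: stop instead of re-scanning from an old anchor
        out.append(prompt[i:j])
        out.append(repl)
        i = j + len(token)
    out.append(prompt[i:])
    return "".join(out)


def test_long_text_only_prompt_is_fast():
    # A long prompt with no placeholder tokens: the pre-fix loop degraded
    # to O(n^2) here, while the fixed loop exits after one failed scan.
    prompt = "word " * 5_000
    start = time.perf_counter()
    result = apply_once(prompt, "<image>", "[IMG]")
    elapsed = time.perf_counter() - start
    assert result == prompt
    assert elapsed < 0.1  # generous bound; quadratic behavior took seconds
```

Wall-clock bounds like `elapsed < 0.1` are inherently machine-dependent, so the threshold is kept far above the expected linear-time cost to avoid flaky failures.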