Closed
Labels
documentation (Improvements or additions to documentation), enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)
Description
While following the Quick Local Installation guide (https://vllm-semantic-router.com/docs/installation/), I noticed two important gaps that caused confusion.
1. No instructions for disabling the Monitoring Service
The documentation assumes that a monitoring stack (e.g., OTel / Prometheus / Grafana) is already running.
However, many first-time users do not have monitoring enabled locally, and the guide does not explain:
- How to disable the monitoring service when starting semantic-router
- What configuration or CLI flag is required to skip monitoring entirely for a quick start
Requested Improvement
- Explain how users should disable monitoring if they are not running a monitoring service
- For users who do have monitoring set up, document where to change the config (URL, port, expected output, etc.)
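To make the request concrete, the docs could include a fragment like the one below. This is purely illustrative: the `observability` section and every field name in it are assumptions about what the schema might look like, not the project's confirmed configuration keys.

```yaml
# Hypothetical config/config.yaml fragment — section and field names are
# assumptions for illustration, not the confirmed schema.
observability:
  tracing:
    enabled: false        # skip the OTel exporter for a quick local start
  metrics:
    enabled: false        # do not expose Prometheus metrics

# If a monitoring stack IS running, the same section could point at it:
# observability:
#   tracing:
#     enabled: true
#     otlp_endpoint: "localhost:4317"   # default OTLP gRPC port
```

Even if the real keys differ, documenting the equivalent on/off switch (or the CLI flag that achieves the same thing) would resolve this gap.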
2. No documentation for configuring backend vLLM model in config/config.yaml
The installation guide does not describe how to configure the backend LLM engine (vLLM), such as using Llama, Qwen, or other models.
For example: if we want to change the model, or use our own model for different categories during the quick start, which field in config.yaml should be modified to configure the vLLM backend?
Requested Improvement
- Add an example for a Llama-based or Qwen-based model, including which fields users should edit in config/config.yaml to correctly point semantic-router to their chosen vLLM backend.
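A sketch of what such an example might look like is below. The endpoint keys (`vllm_endpoints`, `address`, `port`, `models`) and the model name are assumptions chosen for illustration; the docs should show the actual schema.

```yaml
# Hypothetical config/config.yaml fragment — key names and the model ID are
# illustrative assumptions, not the confirmed schema.
vllm_endpoints:
  - name: "local-qwen"
    address: "127.0.0.1"     # host where the vLLM server is listening
    port: 8000               # vLLM's default OpenAI-compatible server port
    models:
      - "Qwen/Qwen2.5-7B-Instruct"   # any model served by this endpoint
```

Pairing this with the matching `vllm serve Qwen/Qwen2.5-7B-Instruct` command would let first-time users verify the router end to end with their own model.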