
Documentation: Clarify Quick Local Installation #675

@yuezhu1

Description


While following the Quick Local Installation guide (https://vllm-semantic-router.com/docs/installation/), I noticed two important gaps that caused confusion.

1. No instructions for disabling the Monitoring Service

The documentation assumes that a monitoring stack (e.g., OpenTelemetry / Prometheus / Grafana) is already running.
However, many first-time users do not have monitoring enabled locally, and the guide does not explain:

  1. How to disable the monitoring service when starting semantic-router
  2. What configuration or CLI flag is required to skip monitoring entirely for a quick start

Requested Improvement

  1. Document how users should disable monitoring if they are not running a monitoring service
  2. For users who do have monitoring set up, document where to change the configuration (URL, port, expected output, etc.)
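To make the request concrete, the docs could show a single switch along these lines. This is purely an illustrative sketch; the key names below are assumptions for the sake of the example, not the project's actual schema, and would need to match whatever semantic-router really reads from config/config.yaml:

```yaml
# Hypothetical sketch -- key names are illustrative, not the real schema.
observability:
  enabled: false          # quick start: skip the monitoring stack entirely

  # If a monitoring stack IS running, enable it and point at the collectors:
  # enabled: true
  # otlp_endpoint: "localhost:4317"   # OTel collector gRPC endpoint
  # prometheus_port: 9090             # where Prometheus scrapes metrics
```

Whatever the real mechanism is (config key, CLI flag, or environment variable), the quick-start page should state it explicitly and show the expected log output for both the enabled and disabled cases.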

2. No documentation for configuring backend vLLM model in config/config.yaml

The installation guide does not describe how to configure the backend LLM engine (vLLM), such as using Llama, Qwen, or other models.

For example: if we want to swap in our own model during the quick start, or route different categories to different models, which fields in config.yaml should be modified to configure the vLLM backend?

Requested Improvement

  1. Add an example for a Llama-based or Qwen-based model, including which fields users should edit in config/config.yaml to correctly point semantic-router to their chosen vLLM backend.
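As a starting point for that example, a hedged sketch of what such a snippet could look like. The key names below are modeled on the general shape of endpoint-routing configs and should be verified against the repository's sample config/config.yaml before being published; the model name, address, and port are placeholders:

```yaml
# Illustrative sketch only -- verify key names against the repo's sample config.
vllm_endpoints:
  - name: "endpoint1"          # logical name referenced below
    address: "127.0.0.1"       # host where the vLLM server is listening
    port: 8000                 # vLLM's OpenAI-compatible API port
    weight: 1

model_config:
  "qwen2.5-7b-instruct":       # placeholder model name served by vLLM
    preferred_endpoints: ["endpoint1"]

categories:
  - name: "math"
    model_scores:
      - model: "qwen2.5-7b-instruct"
        score: 1.0             # route this category to the model above
```

The docs should spell out which of these fields must agree with the `--model` and `--port` the user passed when launching vLLM, since a mismatch there is the most likely quick-start failure mode.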

Metadata
Labels

  - documentation: Improvements or additions to documentation
  - enhancement: New feature or request
  - good first issue: Good for newcomers
  - help wanted: Extra attention is needed
