Summary
The search_code MCP tool consistently fails with "codebase not indexed" error despite:
- Successful completion of index_codebase operation
- Confirmation from get_indexing_status that codebase is "fully indexed and ready"
- Verified data stored in Milvus vector database
This appears to be a state synchronization bug where the indexing operation succeeds but the search operation cannot access the indexed data.
Environment
MCP Configuration
{
"mcpServers": {
"claude-context": {
"command": "npx",
"args": ["@zilliz/claude-context-mcp@latest"],
"env": {
"EMBEDDING_PROVIDER": "Ollama",
"OLLAMA_HOST": "http://localhost:11434",
"EMBEDDING_MODEL": "nomic-embed-text",
"MILVUS_ADDRESS": "localhost:19530",
"HYBRID_MODE": "false",
"EMBEDDING_BATCH_SIZE": "100",
"SPLITTER_TYPE": "ast"
}
}
}
}
Infrastructure
- Vector Database: Milvus v2.5.7 (standalone deployment)
  - 3 containers: etcd v3.5.18, minio RELEASE.2024-05-28T17-19-04Z, milvus v2.5.7
  - All containers healthy and responding
  - gRPC endpoint: localhost:19530 (accessible)
  - HTTP health endpoint: localhost:19121/healthz (returns 200 OK)
- Embeddings: Ollama running on localhost:11434
  - Model: nomic-embed-text:latest (137M params, F16 quantization)
  - API responding correctly (/api/tags returns model list)
- Operating System: Linux 5.15.0-139-generic
- Client: Claude Code CLI (Sonnet 4.5)
Configuration Notes
- HYBRID_MODE: false - Dense vector search only (BM25 disabled due to known Milvus v2.5.7 BM25 bug)
- SPLITTER_TYPE: ast - AST-based code chunking for syntax-aware splitting
- EMBEDDING_BATCH_SIZE: 100 - Batch size for embedding generation
Steps to Reproduce
- Start infrastructure:
  # Start Milvus containers
  docker compose up -d milvus-etcd milvus-minio milvus
  # Verify all healthy
  docker ps --filter "name=my-milvus"
- Configure MCP server (via claude mcp add or config file as shown above)
- Restart Claude Code CLI to load MCP configuration
- Index codebase:
  Tool: mcp__claude-context__index_codebase
  Parameters: { "path": "<path-to-repo>", "splitter": "ast" }
  Result: ✅ Success - "Indexed 174 files, 1,983 chunks"
- Verify indexing status:
  Tool: mcp__claude-context__get_indexing_status
  Parameters: { "path": "<path-to-repo>" }
  Result: ✅ Success - "Codebase is fully indexed and ready"
- Attempt search:
  Tool: mcp__claude-context__search_code
  Parameters: { "path": "<path-to-repo>", "query": "docker compose fragment architecture", "limit": 10 }
  Result: ❌ FAILURE - "The codebase at path is not indexed"
Expected Behavior
After successful indexing (confirmed by get_indexing_status), the search_code tool should:
- Accept the query
- Search the indexed codebase in Milvus
- Return relevant code chunks matching the query
Actual Behavior
search_code immediately fails with error:
The codebase at path <path-to-repo> is not indexed. Please use the index_codebase tool first.
This error is inconsistent with reality because:
- index_codebase completed successfully
- get_indexing_status confirms "fully indexed and ready"
- Milvus container logs show successful operations
- Re-indexing produces the same failure pattern
Reproduction Rate
100% reproducible across:
- Multiple indexing attempts
- Multiple Claude Code CLI session restarts
- Multiple search queries with different terms
- Multiple days of testing
Investigation Details
Milvus Container Health
$ docker ps --filter "name=my-milvus"
CONTAINER ID IMAGE STATUS
a1b2c3d4e5f6 milvusdb/milvus:v2.5.7 Up (healthy)
f6e5d4c3b2a1 minio/minio:RELEASE.2024-05-28... Up (healthy)
6f5e4d3c2b1a quay.io/coreos/etcd:v3.5.18 Up (healthy)
$ curl -s http://localhost:19121/healthz
{"status":"ok"}
Ollama Embedding Service
$ curl -s http://localhost:11434/api/tags | jq '.models[] | select(.name == "nomic-embed-text:latest")'
{
"name": "nomic-embed-text:latest",
"size": 274302450,
"digest": "0a109f422b47...",
"details": {
"family": "nomic-bert",
"parameter_size": "137M",
"quantization_level": "F16"
}
}
Index Operation Success
The indexing operation completes without errors and reports:
- Files indexed: 174
- Chunks created: 1,983
- Status: "fully indexed and ready" (per get_indexing_status)
Search Operation Failure
Every search attempt fails with the same error, regardless of:
- Query content (tried multiple semantic queries)
- Limit parameter (tried 5, 10, 20)
- Time delay after indexing (tried immediately and after minutes)
Suspected Root Cause
State synchronization bug between indexing and search operations:
- Indexing path: MCP server → embeddings → Milvus write → success ✅
- Search path: MCP server → (state check fails) → error ❌
The indexing operation appears to:
- Successfully generate embeddings via Ollama
- Successfully write to Milvus collections
- Successfully update internal state (hence get_indexing_status works)
The search operation appears to:
- Fail to find/recognize the indexed codebase path
- Use a different state check than get_indexing_status
- Not check Milvus directly (otherwise it would find the data)
Possible issues:
- Path normalization inconsistency (absolute vs relative, trailing slash, symlinks)
- State stored in different location/format between index and search
- Collection naming mismatch in Milvus
- MCP server restart clears search state but not index state
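To make the path normalization and collection naming possibilities concrete, here is a minimal sketch. It assumes, without confirmation from the claude-context source, that the server derives a per-codebase key by hashing the path string. The helper names and the code_chunks_ prefix are hypothetical. It only shows how a trailing slash or a ".." segment would yield a different key from the one written during indexing if the path is hashed without normalization.

```python
import hashlib
import os


def collection_key(path: str, normalize: bool) -> str:
    """Hypothetical: derive a lookup key for a codebase by hashing its path."""
    if normalize:
        # realpath resolves symlinks and '..' segments and drops trailing slashes
        path = os.path.realpath(path)
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return f"code_chunks_{digest}"


def find_mismatches(variants: list[str]) -> dict[str, bool]:
    """Report whether each variant's raw key equals the first variant's raw key."""
    baseline = collection_key(variants[0], normalize=False)
    return {v: collection_key(v, normalize=False) == baseline for v in variants}


# Three spellings of the same directory (example paths, not from the report)
variants = ["/tmp/repo", "/tmp/repo/", "/tmp/other/../repo"]
```

If index_codebase and search_code disagree on whether to normalize, each spelling above maps to its own key, which would match the "not indexed" symptom even though the data exists.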
Workarounds Attempted
- ❌ Re-indexing: Same failure pattern
- ❌ Session restart: Same failure pattern
- ❌ Different query terms: Same error
- ❌ Clearing and re-indexing: Not attempted (no clear_index success confirmed)
- ❌ Path variations: Only tried absolute path (as recommended)
Impact
This bug completely blocks the primary use case of claude-context MCP:
- ✅ Indexing works (can build vector database)
- ❌ Search broken (cannot retrieve indexed code)
- ❌ RAG functionality unavailable
- ❌ Token reduction benefits unrealized (~40% reduction target)
Additional Context
Use Case
Semantic code search across large codebase (174 files) to:
- Reduce token usage in Claude Code CLI
- Enable efficient code discovery
- Support RAG-based development workflow
Codebase Details
- Type: Python/Docker microservices architecture
- Size: 174 files, ~1,983 code chunks after AST splitting
- Structure: Multi-service Docker Compose application with agents, tools, services
- Language: Primarily Python, YAML, Markdown, shell scripts
Related Issues
- Known Milvus v2.5.7 BM25 bug (hybrid mode disabled as workaround)
- No other known MCP server issues
Requested Information
If maintainers need additional debugging info, I can provide:
- Full MCP server logs (if available)
- Milvus collection inspection results
- Network traffic captures between MCP server and Milvus
- Detailed path normalization comparison
- Alternative path formats tested
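For the Milvus collection inspection, a minimal sketch along these lines could be used. It assumes the pymilvus package is installed and that the collections live in the default database at localhost:19530; the function names are my own. The aim is only to check whether any collection actually holds roughly the 1,983 chunks reported by indexing, and under which name.

```python
from typing import Any


def summarize_collections(client: Any) -> dict[str, int]:
    """Return {collection_name: row_count} for every collection the client can see."""
    summary = {}
    for name in client.list_collections():
        stats = client.get_collection_stats(collection_name=name)
        summary[name] = int(stats.get("row_count", 0))
    return summary


def inspect_local_milvus(uri: str = "http://localhost:19530") -> dict[str, int]:
    """Connect to the standalone Milvus described in this report and summarize it."""
    # Imported lazily so summarize_collections has no hard dependency on pymilvus
    from pymilvus import MilvusClient

    client = MilvusClient(uri=uri)
    try:
        return summarize_collections(client)
    finally:
        client.close()
```

Calling inspect_local_milvus() and comparing the returned names against whatever name the search path expects would help narrow down a collection naming mismatch.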
Proposed Fix
Suggested investigation areas:
- Path handling: Ensure consistent normalization between index and search
- State storage: Verify index state is accessible to search operation
- Collection naming: Confirm Milvus collection names match between operations
- Error messaging: If path not found, provide more specific error (vs generic "not indexed")
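The path handling and error messaging suggestions could be combined roughly as follows. This is a sketch under assumptions, not the server's actual code: the IndexRegistry class, its method names, and the error type are all hypothetical. The idea is that indexing, status, and search all go through one normalization function, and that a failed lookup reports both the normalized path and the paths that are known.

```python
import os


class CodebaseNotIndexedError(LookupError):
    """Raised when a search targets a path with no recorded index."""


class IndexRegistry:
    """Hypothetical in-memory registry shared by index, status, and search."""

    def __init__(self) -> None:
        self._indexed: dict[str, int] = {}  # normalized path -> chunk count

    @staticmethod
    def normalize(path: str) -> str:
        # One normalization used everywhere: expand '~', resolve symlinks and '..'
        return os.path.realpath(os.path.expanduser(path))

    def mark_indexed(self, path: str, chunks: int) -> None:
        self._indexed[self.normalize(path)] = chunks

    def require_indexed(self, path: str) -> int:
        """Return the chunk count for path, or raise with diagnostic detail."""
        key = self.normalize(path)
        if key not in self._indexed:
            known = ", ".join(sorted(self._indexed)) or "(none)"
            raise CodebaseNotIndexedError(
                f"No index for {path!r} (normalized: {key!r}). "
                f"Known indexed paths: {known}"
            )
        return self._indexed[key]
```

With an error like this, a mismatch between the path passed to search_code and the path recorded at index time would be visible directly in the message instead of being hidden behind a generic "not indexed".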
Labels: bug, search, high-priority, mcp-server