claude-context MCP search_code fails despite successful indexing #226

@joehays

Summary

The search_code MCP tool consistently fails with a "codebase not indexed" error despite:

  1. Successful completion of index_codebase operation
  2. Confirmation from get_indexing_status that codebase is "fully indexed and ready"
  3. Verified data stored in Milvus vector database

This appears to be a state synchronization bug where the indexing operation succeeds but the search operation cannot access the indexed data.


Environment

MCP Configuration

{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["@zilliz/claude-context-mcp@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "Ollama",
        "OLLAMA_HOST": "http://localhost:11434",
        "EMBEDDING_MODEL": "nomic-embed-text",
        "MILVUS_ADDRESS": "localhost:19530",
        "HYBRID_MODE": "false",
        "EMBEDDING_BATCH_SIZE": "100",
        "SPLITTER_TYPE": "ast"
      }
    }
  }
}

Infrastructure

  • Vector Database: Milvus v2.5.7 (standalone deployment)
    • 3 containers: etcd v3.5.18, minio RELEASE.2024-05-28T17-19-04Z, milvus v2.5.7
    • All containers healthy and responding
    • gRPC endpoint: localhost:19530 (accessible)
    • HTTP health endpoint: localhost:19121/healthz (returns 200 OK)
  • Embeddings: Ollama running on localhost:11434
    • Model: nomic-embed-text:latest (137M params, F16 quantization)
    • API responding correctly (/api/tags returns model list)
  • Operating System: Linux 5.15.0-139-generic
  • Client: Claude Code CLI (Sonnet 4.5)

Configuration Notes

  • HYBRID_MODE: false - Dense vector search only (BM25 disabled due to a known Milvus v2.5.7 BM25 bug)
  • SPLITTER_TYPE: ast - AST-based code chunking for syntax-aware splitting
  • EMBEDDING_BATCH_SIZE: 100 - Batch size for embedding generation

Steps to Reproduce

  1. Start infrastructure:

    # Start Milvus containers
    docker compose up -d milvus-etcd milvus-minio milvus
    
    # Verify all healthy
    docker ps --filter "name=my-milvus"

  2. Configure MCP server (via claude mcp add or config file as shown above)

  3. Restart Claude Code CLI to load MCP configuration

  4. Index codebase:

    Tool: mcp__claude-context__index_codebase
    Parameters: {
      "path": "<path-to-repo>",
      "splitter": "ast"
    }
    

    Result: ✅ Success - "Indexed 174 files, 1,983 chunks"

  5. Verify indexing status:

    Tool: mcp__claude-context__get_indexing_status
    Parameters: {
      "path": "<path-to-repo>"
    }
    

    Result: ✅ Success - "Codebase is fully indexed and ready"

  6. Attempt search:

    Tool: mcp__claude-context__search_code
    Parameters: {
      "path": "<path-to-repo>",
      "query": "docker compose fragment architecture",
      "limit": 10
    }
    

    Result: ❌ FAILURE - "The codebase at path is not indexed"
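
For anyone who wants to replay steps 4 to 6 outside Claude Code, the three tool calls can be written as MCP `tools/call` requests. This is only a sketch: it assumes the server registers the tools under the unprefixed names `index_codebase`, `get_indexing_status`, and `search_code` (the `mcp__claude-context__` prefix comes from the client), the `tool_call` helper is hypothetical, and the path is the same placeholder used above.

```python
import json


def tool_call(request_id, name, arguments):
    """Build one MCP JSON-RPC 2.0 `tools/call` request as a dict."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


REPO = "<path-to-repo>"  # placeholder, same as in the steps above

# Assumed unprefixed tool names; the arguments mirror steps 4 to 6.
calls = [
    tool_call(1, "index_codebase", {"path": REPO, "splitter": "ast"}),
    tool_call(2, "get_indexing_status", {"path": REPO}),
    tool_call(
        3,
        "search_code",
        {"path": REPO, "query": "docker compose fragment architecture", "limit": 10},
    ),
]

# One JSON object per line, ready to send over a stdio session
# that has already completed the MCP initialize handshake.
for call in calls:
    print(json.dumps(call))
```

All three requests carry the identical `path` string, so if the failure reproduces with these, the difference is on the server side and not in how the client passes the path.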


Expected Behavior

After successful indexing (confirmed by get_indexing_status), the search_code tool should:

  1. Accept the query
  2. Search the indexed codebase in Milvus
  3. Return relevant code chunks matching the query

Actual Behavior

search_code immediately fails with error:

The codebase at path <path-to-repo> is not indexed. Please use the index_codebase tool first.

This error contradicts the observed state:

  • index_codebase completed successfully
  • get_indexing_status confirms "fully indexed and ready"
  • Milvus container logs show successful operations
  • Re-indexing produces the same failure pattern

Reproduction Rate

100% reproducible across:

  • Multiple indexing attempts
  • Multiple Claude Code CLI session restarts
  • Multiple search queries with different terms
  • Multiple days of testing

Investigation Details

Milvus Container Health

$ docker ps --filter "name=my-milvus"
CONTAINER ID   IMAGE                               STATUS
a1b2c3d4e5f6   milvusdb/milvus:v2.5.7             Up (healthy)
f6e5d4c3b2a1   minio/minio:RELEASE.2024-05-28...  Up (healthy)
6f5e4d3c2b1a   quay.io/coreos/etcd:v3.5.18        Up (healthy)

$ curl -s http://localhost:19121/healthz
{"status":"ok"}

Ollama Embedding Service

$ curl -s http://localhost:11434/api/tags | jq '.models[] | select(.name == "nomic-embed-text:latest")'
{
  "name": "nomic-embed-text:latest",
  "size": 274302450,
  "digest": "0a109f422b47...",
  "details": {
    "family": "nomic-bert",
    "parameter_size": "137M",
    "quantization_level": "F16"
  }
}

Index Operation Success

The indexing operation completes without errors and reports:

  • Files indexed: 174
  • Chunks created: 1,983
  • Status: "fully indexed and ready" (per get_indexing_status)

Search Operation Failure

Every search attempt fails with the same error, regardless of:

  • Query content (tried multiple semantic queries)
  • Limit parameter (tried 5, 10, 20)
  • Time delay after indexing (tried immediately and after minutes)

Suspected Root Cause

State synchronization bug between indexing and search operations:

  1. Indexing path: MCP server → embeddings → Milvus write → success ✅
  2. Search path: MCP server → (state check fails) → error ❌

The indexing operation appears to:

  • Successfully generate embeddings via Ollama
  • Successfully write to Milvus collections
  • Successfully update internal state (hence get_indexing_status works)

The search operation appears to:

  • Fail to find/recognize the indexed codebase path
  • Use a different state check than get_indexing_status
  • Not check Milvus directly (otherwise would find the data)

Possible issues:

  • Path normalization inconsistency (absolute vs relative, trailing slash, symlinks)
  • State stored in different location/format between index and search
  • Collection naming mismatch in Milvus
  • MCP server restart clears search state but not index state
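
To illustrate how the first and third possibilities could interact, here is a small sketch. It assumes a naming scheme in which the Milvus collection name is derived from a hash of the path string; I have not confirmed that claude-context works this way, and `collection_name`, `normalize`, and the `/srv/repo` paths are made up for the example.

```python
import hashlib
import os


def collection_name(path):
    """Hypothetical scheme: fixed prefix plus a short hash of the raw path string."""
    digest = hashlib.md5(path.encode("utf-8")).hexdigest()
    return f"code_chunks_{digest[:8]}"


def normalize(path):
    """Resolve symlinks and drop trailing slashes and '.' segments."""
    return os.path.realpath(path)


# Three spellings of the same directory (example paths, not from my setup).
variants = ["/srv/repo", "/srv/repo/", "/srv/./repo"]

for variant in variants:
    raw = collection_name(variant)
    canonical = collection_name(normalize(variant))
    print(f"{variant!r}: raw={raw} canonical={canonical}")
```

Under this assumption, each raw spelling maps to a different collection name, while the normalized spellings all map to one. If index and search normalize differently, search would look for a collection that was never written.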

Workarounds Attempted

  1. Re-indexing: Same failure pattern
  2. Session restart: Same failure pattern
  3. Different query terms: Same error
  4. Clearing and re-indexing: Not attempted (could not confirm that clear_index succeeded)
  5. Path variations: Only tried absolute path (as recommended)
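
Since only the absolute path has been tried so far, a helper like the following could list the other spellings worth passing as the `path` argument. It is a sketch using only the Python standard library; `path_variants` is a hypothetical name and nothing here calls the MCP server.

```python
import os


def path_variants(path):
    """Return distinct spellings of one directory to try as the `path` argument."""
    absolute = os.path.abspath(path)
    candidates = [
        path,                        # exactly as typed
        absolute,                    # absolute and normalized
        absolute + os.sep,           # with a trailing separator
        os.path.realpath(path),      # symlinks resolved
        os.path.relpath(absolute),   # relative to the current directory
    ]
    # Drop duplicates while keeping the order above.
    distinct = []
    for candidate in candidates:
        if candidate not in distinct:
            distinct.append(candidate)
    return distinct


for variant in path_variants("."):
    print(variant)
```

Running search_code once per variant, right after a single index_codebase call, would show whether any spelling is recognized.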

Impact

This bug completely blocks the primary use case of claude-context MCP:

  • ✅ Indexing works (can build vector database)
  • ❌ Search broken (cannot retrieve indexed code)
  • ❌ RAG functionality unavailable
  • ❌ Token reduction benefits unrealized (~40% reduction target)

Additional Context

Use Case

Semantic code search across a large codebase (174 files) to:

  • Reduce token usage in Claude Code CLI
  • Enable efficient code discovery
  • Support RAG-based development workflow

Codebase Details

  • Type: Python/Docker microservices architecture
  • Size: 174 files, 1,983 code chunks after AST splitting
  • Structure: Multi-service Docker Compose application with agents, tools, services
  • Languages: Primarily Python, plus YAML, Markdown, and shell scripts

Related Issues

  • Known Milvus v2.5.7 BM25 bug (hybrid mode disabled as workaround)
  • No other known MCP server issues

Requested Information

If maintainers need additional debugging info, I can provide:

  1. Full MCP server logs (if available)
  2. Milvus collection inspection results
  3. Network traffic captures between MCP server and Milvus
  4. Detailed path normalization comparison
  5. Alternative path formats tested

Proposed Fix

Suggested investigation areas:

  1. Path handling: Ensure consistent normalization between index and search
  2. State storage: Verify index state is accessible to search operation
  3. Collection naming: Confirm Milvus collection names match between operations
  4. Error messaging: If the path is not found, return a more specific error than the generic "not indexed" message (for example, the path as resolved and the paths that are indexed)
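
As a rough illustration of items 1 and 4 together, the sketch below keys the index state by a canonical path on both the write and the read side, and reports what it actually looked up when the check fails. `IndexRegistry` and `canonical` are hypothetical names, not claude-context code, and the real server may keep its state very differently.

```python
import os


def canonical(path):
    """One normalization shared by indexing and search."""
    return os.path.realpath(os.path.expanduser(path))


class IndexRegistry:
    """Hypothetical in-memory record of indexed codebases, keyed by canonical path."""

    def __init__(self):
        self._indexed = set()

    def mark_indexed(self, path):
        self._indexed.add(canonical(path))

    def require_indexed(self, path):
        key = canonical(path)
        if key not in self._indexed:
            known = ", ".join(sorted(self._indexed)) or "none"
            raise LookupError(
                f"No index found for '{key}' (requested as '{path}'). "
                f"Indexed codebases: {known}."
            )
        return key


registry = IndexRegistry()
registry.mark_indexed("/srv/repo/")            # indexed with a trailing slash
print(registry.require_indexed("/srv/./repo"))  # found despite a different spelling

try:
    registry.require_indexed("/srv/other")
except LookupError as error:
    print(error)
```

An error in this shape would have made it obvious whether my search path was resolved to something other than what index_codebase stored.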

Labels: bug, search, high-priority, mcp-server
