@YashAgarwal06 commented Aug 2, 2025

Addresses #1118

Changes Made

  • Migrated Gemma-3 models (1b, 4b, 12b, 27b) from local to Google AI Studio API inference
  • Added support for Gemma-3n models (E2B, E4B) via Google AI Studio
  • Created dedicated GemmaHandler for API inference that converts system prompts to user prompts and disables thinking features, neither of which Gemma models support (see the sketch after this list)
  • Updated SUPPORTED_MODELS.md to reflect provider change from "Self-hosted 💻" to "Google"
  • Updated model identifiers from google/gemma-* to gemma-* to match Google AI Studio naming conventions
  • Set pricing to None for all Gemma models (open-source, free via Google AI Studio)
  • Preserved local inference option as fallback (see migration instructions below)
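
To make the handler's behavior concrete, here is a minimal sketch of the system-prompt conversion. The OpenAI-style role/content message list and the function name are illustrative assumptions, not the repository's actual interfaces; disabling thinking simply means never passing a thinking_config on Gemma requests.

    # Sketch of GemmaHandler's system-prompt-to-user-prompt conversion.
    # The OpenAI-style message format is an assumption for illustration.
    from typing import Any

    def merge_system_into_user(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
        """Gemma models reject system instructions, so fold any system
        messages into the first user turn instead of dropping them."""
        system_parts = [m["content"] for m in messages if m["role"] == "system"]
        rest = [m for m in messages if m["role"] != "system"]
        if system_parts and rest and rest[0]["role"] == "user":
            rest[0] = {
                "role": "user",
                "content": "\n\n".join(system_parts) + "\n\n" + rest[0]["content"],
            }
        return rest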

Models Added/Migrated

  • gemma-3-1b-it
  • gemma-3-4b-it
  • gemma-3-12b-it
  • gemma-3-27b-it
  • gemma-3n-e2b-it (new)
  • gemma-3n-e4b-it (new)

Important Concerns to Note

Google AI Studio Limitations for Gemma-3n Models

Based on testing, the Gemma-3n models have some issues in Google AI Studio:

  1. Usage Reporting Issues:
    • I suspect gemma-3n-e4b-it is reported as gemma-3-4b-it in usage statistics; despite being different models, the dashboard groups the two together
    • I also suspect gemma-3n-e2b-it is reported as gemma-3-2b-it in usage statistics, even though no gemma-3-2b-it model exists
    • This causes incorrect grouping and representation in Google AI Studio's usage dashboard (see image below)
[Screenshot: Google AI Studio usage dashboard showing the incorrect grouping of Gemma-3n usage]
  2. Multimodal Functionality Broken - Google AI Studio API Reliability Concern:
    • Despite being multimodal models, image/audio processing doesn't work for the gemma-3n models through the Google AI Studio API
    • This has been publicly acknowledged by Google support (ref)
    • Testing confirmed: gemma-3n-e4b-it returns ERROR: 400 INVALID_ARGUMENT: Image input modality is not enabled for models/gemma-3n-e4b-it
    • Regular gemma-3-4b-it works correctly with images (this also confirms the API is calling distinct models, even though usage for gemma-3n-e4b-it is reported under gemma-3-4b-it)
    • Google AI Studio multimodal support for the regular Gemma models was also broken at one point but has since been resolved (ref)
    • This means that running the Gemma models through API inference via Google AI Studio might not be fully reliable yet

Verification Method: I tested gemma-3n-e4b-it, gemma-3-4b-it, and gemma-3n-e2b-it with identical image inputs. The error response (shown above) for gemma-3n-e4b-it and the success for gemma-3-4b-it confirm that the requests reach separate models despite the UI grouping in the Google AI Studio dashboard. gemma-3n-e2b-it failed with the same error (ERROR: 400 INVALID_ARGUMENT: Image input modality is not enabled for models/gemma-3n-e2b-it), confirming that the request really targets gemma-3n-e2b-it even though it shows up as gemma-3-2b-it on the usage dashboard. This also confirms the multimodal issue affects both Gemma-3n models.
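
For reproducibility, the check can be rerun with a short script along these lines. This is a sketch using the google-genai Python client; test.png is a placeholder image path and the API key is read from the environment.

    # Sketch of the image-input check; test.png is a placeholder.
    # Requires the google-genai package and an API key in the environment.
    from google import genai
    from google.genai import types

    client = genai.Client()  # picks up the API key from the environment

    with open("test.png", "rb") as f:
        image = types.Part.from_bytes(data=f.read(), mime_type="image/png")

    for model in ("gemma-3-4b-it", "gemma-3n-e2b-it", "gemma-3n-e4b-it"):
        try:
            resp = client.models.generate_content(
                model=model, contents=[image, "Describe this image."]
            )
            print(f"{model}: OK -> {resp.text[:60]!r}")
        except Exception as e:  # expect 400 INVALID_ARGUMENT for the 3n models
            print(f"{model}: {e}")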

Fallback to Local Inference

Local inference remains available for users who need full functionality or encounter API issues:

To switch back to local inference:

  1. Change import in bfcl_eval/constants/model_config.py:

    # From:
    from bfcl_eval.model_handler.api_inference.gemma import GemmaHandler
    
    # To:
    from bfcl_eval.model_handler.local_inference.gemma import GemmaHandler
  2. Move Gemma model configurations from api_inference_model_map to local_inference_model_map (see the sketch after this list)

  3. Change naming back to google/gemma-* (see Notes below)
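
For concreteness, steps 2 and 3 amount to relocating the entries and restoring the HuggingFace-style names. A rough sketch of the edit inside bfcl_eval/constants/model_config.py, using one model as an example; the ModelConfig field names below are placeholders, not copied from the repo:

    # Sketch of steps 2-3 for a single model; the ModelConfig fields are
    # assumed for illustration -- check the real definition before editing.
    api_inference_model_map.pop("gemma-3-4b-it", None)

    local_inference_model_map["google/gemma-3-4b-it"] = ModelConfig(
        model_name="google/gemma-3-4b-it",  # HF-style naming for local inference
        model_handler=GemmaHandler,         # now the local_inference.gemma variant
        input_price=None,                   # open weights; no API pricing
        output_price=None,
    )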

Notes

  • Pricing: Set to None (open-source models, free via Google AI Studio).
  • Naming: Changed from google/gemma-* (HuggingFace convention) to gemma-* (Google AI Studio requirement)
  • Handler: Custom GemmaHandler handles Gemma-specific limitations (no system instructions, no thinking features)

Final Note

While this PR enables API access to Gemma models, users requiring multimodal capabilities should continue using local inference until Google resolves the AI Studio limitations for the Gemma-3n models. Overall, the reliability of Gemma inference through Google AI Studio remains questionable.

- Move Gemma-3 models (1b, 4b, 12b, 27b) from local to API inference
- Change handler from GemmaHandler to GeminiHandler
- Update documentation - supported models list
- Add google/gemma-3n-e2b-it and google/gemma-3n-e4b-it
- Complete migration for all Gemma models listed in issue ShishirPatil#1118
- Update supported models list

Addresses: ShishirPatil#1118
- Add GemmaHandler class extending GeminiHandler with Gemma-specific adaptations
- Remove thinking_config support (not available for Gemma models)
- Convert system prompts to user prompts (Gemma doesn't support system instructions)
- Update 6 Gemma model configurations to use new handler
- Remove "google/" prefix from model names to match Google AI Studio API naming convention
- Update supported models list to reflect API-compatible naming

Affected models: gemma-3-1b-it, gemma-3-4b-it, gemma-3-12b-it,
gemma-3-27b-it, gemma-3n-e2b-it, gemma-3n-e4b-it
@ShishirPatil (Owner) commented
Thanks for the PR @YashAgarwal06 ! Quick question - is there any reason we would want to port to hosted API (vs) local-inference? If the local inference is working as is, and given the model sizes are quite small, I'm tempted to retain local-inference. Any benefits of going to hosted for gemma?
