[BFCL] Add model star-lab/STAR-0b6, star-lab/STAR-1b7 and star-lab/STAR-4b #1226

jn2707 · 2025-10-22T08:58:25Z

The STAR series models are highly capable language models specialized in function calling, achieving excellent performance on the Berkeley Function Calling Leaderboard (BFCL) for models in their size classes.

These models are the result of fine-tuning the Qwen/Qwen3-0.6B, Qwen/Qwen3-1.7B and Qwen/Qwen3-4B base models with the novel STAR (Similarity-guided Teacher-Assisted Refinement) framework. STAR is a holistic training curriculum designed to transfer the advanced capabilities of large language models (LLMs) into "super-tiny" models, making them powerful, accessible, and efficient for real-world agentic applications.

The key innovations of the STAR framework include:

Similarity-guided RL (Sim-RL): A reinforcement learning mechanism that uses a fine-grained, similarity-based reward signal. This provides a more robust and continuous signal for policy optimization compared to simple binary rewards, which is crucial for complex, multi-solution tasks like function calling.
Constrained Knowledge Distillation (CKD): An advanced training objective that augments top-k forward KL divergence to suppress confidently incorrect predictions. This ensures training stability while preserving the model's exploration capacity, creating a strong foundation for the subsequent RL phase.

Notably, our STAR-0b6 model significantly outperforms other open models under 1B parameters and even surpasses several larger models, demonstrating the effectiveness of the STAR methodology.
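To illustrate why a similarity-based reward gives a denser signal than a binary one, here is a minimal sketch. It is not the STAR implementation: the call format (`name` plus `arguments`), the helper names, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def value_similarity(pred, ref):
    """Hypothetical per-argument similarity in [0, 1] (string-based)."""
    if pred == ref:
        return 1.0
    return SequenceMatcher(None, str(pred), str(ref)).ratio()


def call_similarity(pred_call, ref_call):
    """Hypothetical similarity reward for one function call.

    Returns 0.0 when the function name is wrong; otherwise averages the
    per-argument similarity over the union of argument names, so missing
    or extra arguments lower the score.
    """
    if pred_call.get("name") != ref_call.get("name"):
        return 0.0
    pred_args = pred_call.get("arguments", {})
    ref_args = ref_call.get("arguments", {})
    keys = set(pred_args) | set(ref_args)
    if not keys:
        return 1.0
    total = 0.0
    for key in keys:
        if key in pred_args and key in ref_args:
            total += value_similarity(pred_args[key], ref_args[key])
    return total / len(keys)


def binary_reward(pred_call, ref_call):
    """Baseline for comparison: 1.0 only on an exact match."""
    return 1.0 if pred_call == ref_call else 0.0


reference = {"name": "get_weather", "arguments": {"city": "Berlin", "unit": "celsius"}}
near_miss = {"name": "get_weather", "arguments": {"city": "Berlin", "unit": "kelvin"}}
wrong_fn = {"name": "get_time", "arguments": {"city": "Berlin"}}
```

Under these assumptions, `near_miss` receives partial credit from `call_similarity` while `binary_reward` gives it the same 0.0 as `wrong_fn`, which is the kind of distinction a continuous reward can pass on to policy optimization.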

STAR-0b6, STAR-1b7 and STAR-4b achieve strong results for models of their size on BFCLv4 (the Web Search metric is not included).

Metric	STAR-4B(FC)	STAR-1.7B(FC)	STAR-0.6B(FC)
Overall Acc	36.91%	30.00%	26.09%
Non-Live AST Acc	89.42%	84.94%	79.48%
Non-Live Simple AST	75.17%	74.25%	71.42%
Non-Live Multiple AST	96.50%	92.00%	89.50%
Non-Live Parallel AST	93.00%	87.00%	80.50%
Non-Live Parallel Multiple AST	93.00%	86.50%	76.50%
Live Acc	78.98%	68.91%	59.36%
Live Simple AST	84.50%	79.46%	65.50%
Live Multiple AST	77.78%	66.67%	58.02%
Live Parallel AST	75.00%	50.00%	43.75%
Live Parallel Multiple AST	75.00%	66.67%	62.50%
Multi Turn Acc	25.88%	10.25%	6.75%
Multi Turn Base	32.00%	15.00%	8.50%
Multi Turn Miss Func	27.00%	9.50%	6.50%
Multi Turn Miss Param	24.50%	11.00%	7.00%
Multi Turn Long Context	20.00%	5.50%	5.00%
Web Search Acc	N/A	N/A	N/A
Web Search Base	N/A	N/A	N/A
Web Search No Snippet	N/A	N/A	N/A
Memory Acc	18.92%	17.20%	9.03%
Memory KV	1.94%	5.81%	1.29%
Memory Vector	13.55%	13.55%	5.81%
Memory Recursive Summarization	41.29%	32.26%	20.00%
Relevance Detection	81.25%	75.00%	81.25%
Irrelevance Detection	85.23%	80.96%	83.73%
Format Sensitivity Max Delta	N/A	N/A	N/A
Format Sensitivity Standard Deviation	N/A	N/A	N/A
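As a quick consistency check on the table, the snippet below recomputes the Non-Live AST and Multi Turn group scores as unweighted means of their sub-categories. The numbers are copied from the table; the grouping and the tolerance are assumptions of this check, not a statement of how BFCL computes its scores.

```python
from statistics import mean

# (reported group score, reported sub-category scores), copied from the table.
groups = {
    "Non-Live AST / STAR-4B": (89.42, [75.17, 96.50, 93.00, 93.00]),
    "Non-Live AST / STAR-1.7B": (84.94, [74.25, 92.00, 87.00, 86.50]),
    "Non-Live AST / STAR-0.6B": (79.48, [71.42, 89.50, 80.50, 76.50]),
    "Multi Turn / STAR-4B": (25.88, [32.00, 27.00, 24.50, 20.00]),
    "Multi Turn / STAR-1.7B": (10.25, [15.00, 9.50, 11.00, 5.50]),
    "Multi Turn / STAR-0.6B": (6.75, [8.50, 6.50, 7.00, 5.00]),
}


def max_gap(entries):
    """Largest absolute gap between a reported score and the plain mean."""
    return max(abs(mean(subs) - total) for total, subs in entries.values())


# Live Acc for STAR-4B, for contrast: the plain mean of its four
# sub-categories does not reproduce the reported 78.98%.
live_gap = abs(mean([84.50, 77.78, 75.00, 75.00]) - 78.98)

print(f"max gap (Non-Live, Multi Turn): {max_gap(groups):.4f}")
print(f"gap (Live Acc, STAR-4B): {live_gap:.2f}")
```

The Non-Live and Multi Turn rows agree with a plain average to within rounding, whereas Live Acc does not, which suggests that Live Acc is aggregated differently (for example, weighted by the number of test cases per sub-category).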

jn2707 · 2025-10-28T03:32:17Z

@HuanzhiMao Hi! Can you have a look when you get a chance? Appreciate it!

Add model_config and star handler files of star models

1538ff6

jn2707 changed the title ~~[BFCL] Add model "star-lab/STAR-0b6", "star-lab/STAR-1b7" and "star-lab/STAR-4b"~~ [BFCL] Add model star-lab/STAR-0b6, star-lab/STAR-1b7 and star-lab/STAR-4b Oct 22, 2025

Merge branch 'main' into feat/add-star-models

b880b3d
