Ranger-mini: An Open Model for Agentic Tool-Use Evaluation


ranger-mini is a fine-tuned sequence-classification model that inspects tool calls and returns a single, precise label either describing the error or confirming the call is valid. It is optimized for low latency and high accuracy.



What Ranger-mini does

ranger-mini evaluates whether a proposed function call:

  • Chooses the correct tool for the user intent,
  • Uses the exact parameter names required by the tool schema,
  • Supplies correctly formatted and accurate parameter values.

It returns one of four labels:

  • VALID_CALL: The tool name, parameters, and values are correct, or no tool is needed.
  • TOOL_ERROR: The tool name is missing or does not match the user intent.
  • PARAM_NAME_ERROR: Parameter names are missing, extra, or mismatched.
  • PARAM_VALUE_ERROR: Parameter names match, but values are incorrect or malformed.
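To make the taxonomy concrete, here is a minimal sketch of what the first three labels mean as a plain schema check. This is not the model: `classify_call` and its inputs are illustrative, and PARAM_VALUE_ERROR is deliberately out of scope here because judging whether a value is *accurate* (not just well-formed) requires the kind of semantic judgment ranger-mini is trained for.

```python
# Illustrative schema check -- NOT ranger-mini. It only shows what the
# TOOL_ERROR / PARAM_NAME_ERROR / VALID_CALL distinctions refer to.

def classify_call(call: dict, tools: dict) -> str:
    schema = tools.get(call.get("name"))
    if schema is None:
        return "TOOL_ERROR"        # tool name missing or unknown
    props = set(schema["properties"])
    required = set(schema.get("required", []))
    args = set(call.get("arguments", {}))
    if (args - props) or (required - args):
        return "PARAM_NAME_ERROR"  # extra or missing parameter names
    return "VALID_CALL"            # names line up; values not checked here

tools = {
    "send-email": {
        "properties": {"to": {}, "subject": {}, "content": {}, "scheduledAt": {}},
        "required": ["to", "subject", "content"],
    }
}

ok_call = {"name": "send-email",
           "arguments": {"to": "a@b.c", "subject": "Hi", "content": "..."}}
bad_call = {"name": "send-email",
            "arguments": {"subject": "Hi", "content": "..."}}

print(classify_call(ok_call, tools))   # VALID_CALL
print(classify_call(bad_call, tools))  # PARAM_NAME_ERROR (required "to" missing)
```

Note that a static check like this cannot tell a well-formed but wrong value from a correct one; that semantic gap is exactly what the PARAM_VALUE_ERROR label covers.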

Benchmarks

| Model | #Params | Avg. Latency | Avg. Binary Accuracy | Qualifire TSQ Benchmark Binary Accuracy | Limbic Benchmark Binary Accuracy |
|---|---|---|---|---|---|
| qualifire/mcp-tool-use-quality-ranger-4b (private) | 4B | 0.30 s | 0.978 | 0.997 | 0.960 |
| qualifire/mcp-tool-use-quality-ranger-0.6b | 0.6B | 0.09 s | 0.958 | 0.993 | 0.924 |
| gemini-2.5-flash | – | 4.87 s | 0.890 | 0.936 | 0.845 |
| quotientai/limbic-tool-use-0.5B-32K | 0.5B | 0.79 s | 0.807 | 0.752 | 0.862 |

Key Insight: Ranger-mini delivers near state-of-the-art tool-use accuracy at a fraction of the latency and model size, making it practical for production guardrails.



Usage

from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
import torch
from huggingface_hub import hf_hub_download

# Model name
model_name = "qualifire/mcp-tool-use-quality-ranger-0.6b"

# Map raw labels to human-readable labels
map_id_to_label = {
    'LABEL_0': 'VALID_CALL',
    'LABEL_1': 'TOOL_ERROR',
    'LABEL_2': 'PARAM_NAME_ERROR',
    'LABEL_3': 'PARAM_VALUE_ERROR'
}

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create pipeline
pipe = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Load prompt template
file_path = hf_hub_download(repo_id=model_name, filename="tsq_prompt_template.txt")
with open(file_path, encoding="utf-8") as f:
    PROMPT_TEMPLATE = f.read()

# Example inputs
example_tools_list = '''[
  {
    "type": "function",
    "function": {
      "name": "send-email",
      "description": "Send an email using Resend",
      "parameters": {
        "properties": {
          "to": {
            "type": "string",
            "format": "email",
            "description": "Recipient email address"
          },
          "content": {
            "type": "string",
            "description": "Plain text email content"
          },
          "subject": {
            "type": "string",
            "description": "Email subject line"
          },
          "scheduledAt": {
            "type": "string",
            "description": "Optional parameter to schedule the email. This uses natural language. Examples would be 'tomorrow at 10am' or 'in 2 hours' or 'next day at 9am PST' or 'Friday at 3pm ET'."
          }
        },
        "required": ["to", "subject", "content"]
      }
    }
  }
]'''

example_message_history = '''[
  {
    "role": "user",
    "content": "Please send an email to 'jane.doe@example.com' with the subject 'Meeting Follow-Up'. The content should be 'Hi Jane, just following up on our meeting from yesterday. Please find the attached notes.' and schedule it for tomorrow at 10am."
  },
  {
    "completion_message": {
      "content": {
        "type": "text",
        "text": ""
      },
      "role": "assistant",
      "stop_reason": "tool_calls",
      "tool_calls": [
        {
          "id": "call_le25efmhltxx9o7n4rfe",
          "function": {
            "name": "send-email",
            "arguments": {
              "subject": "Meeting Follow-Up",
              "content": "Hi Jane, just following up on our meeting from yesterday. Please find the attached notes.",
              "scheduledAt": "tomorrow at 10am"
            }
          }
        }
      ]
    }
  }
]'''

# Format input
example_input = PROMPT_TEMPLATE.format(
    message_history=example_message_history,
    available_tools=example_tools_list
)

# Get prediction
result = pipe(example_input)[0]
result['label'] = map_id_to_label[result['label']]
print(result)
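In a production guardrail, the classifier's label and score are typically turned into an allow/block decision before the tool call is executed. A minimal sketch of such a policy follows; `guard_tool_call`, `BLOCKING_LABELS`, and the 0.9 threshold are illustrative choices, not part of the model or any library.

```python
# Hypothetical guardrail policy on top of the classifier's output.
# The threshold value and escalation behavior are assumptions.

BLOCKING_LABELS = {"TOOL_ERROR", "PARAM_NAME_ERROR", "PARAM_VALUE_ERROR"}

def guard_tool_call(result: dict, threshold: float = 0.9) -> bool:
    """Return True if the tool call may proceed, False if it should be blocked."""
    label, score = result["label"], result["score"]
    if label == "VALID_CALL":
        return True
    # Block confident error predictions; a low-confidence error prediction
    # could instead be escalated to a human reviewer or a retry loop.
    return score < threshold

print(guard_tool_call({"label": "VALID_CALL", "score": 0.99}))         # True
print(guard_tool_call({"label": "PARAM_NAME_ERROR", "score": 0.9999})) # False
```

Because the model's average latency is well under a tenth of a second, a check like this can sit inline on every tool call without noticeably slowing the agent loop.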





✨ Example Output

{'label': 'PARAM_NAME_ERROR', 'score': 0.9999843835830688}

Here the assistant's tool call omits the required "to" parameter from the send-email schema, so ranger-mini correctly flags a parameter-name error.


