Model Selection


Available Model Options in NinjaCat

Anthropic Models

Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes
Claude Opus 4.8 (New) | Jun 2026 | ~1M | ~$5 | ~$25 | Latest Opus model from Anthropic; sharper reasoning and stronger performance across complex tasks.
Claude Opus 4.7 | Apr 2026 | ~1M | ~$5 | ~$25 | Previous Opus release; expanded ~1M context window
Claude Opus 4.7 - Thinking | Apr 2026 | ~1M | ~$5 | ~$25 | Enhanced reasoning variant of Opus 4.7; uses more output tokens due to adaptive thinking
Claude Opus 4.6 | Feb 2026 | ~200K | ~$5 | ~$25 | Superseded by Claude Opus 4.7; costs are the same, but 4.6 uses fewer tokens, so it should still be used for less complex tasks.
Claude Opus 4.6 - Thinking | Feb 2026 | ~200K | ~$5 | ~$25 | Enhanced reasoning variant of Opus 4.6
Claude Sonnet 4.6 (Default) | Feb 2026 | ~200K | ~$3 | ~$15 | New default for all newly created Agents. Best-balanced Claude model: strong performance at moderate cost
Claude Sonnet 4.6 - Thinking | Feb 2026 | ~200K | ~$3 | ~$15 | Enhanced reasoning variant of Sonnet 4.6
Claude Haiku 4.5 | Oct 2025 | ~200K | ~$1 | ~$5 | Fastest & most affordable Claude option
Claude Haiku 4.5 - Thinking | Oct 2025 | ~200K | ~$1 | ~$5 | Enhanced reasoning variant of Haiku 4.5
Claude Fable 5 (No longer available) | Jun 2026 | ~1M | ~$10 | ~$50 | ⚠️ No longer available. See Anthropic's announcement for details.

OpenAI Models

Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes
GPT-5.5 (New) | Jun 2026 | ~1M | ~$5.00 | ~$30 | OpenAI's latest model; improvements to speed and instruction following
GPT-5.4 - Instant (New) | Mar 2026 | ~1M | ~$2.50 | ~$15 | 33% fewer errors vs GPT-5.2, massive 1M context window
GPT-5.4 - Thinking (New) | Mar 2026 | ~1M | ~$2.50 | ~$15 | Enhanced reasoning variant of GPT-5.4; most token-efficient reasoning model
GPT-5.2 - Thinking | Oct 2025 | ~400K | ~$1.75 | ~$14 | Strong reasoning + long context. Default reassignment for agents on removed OpenAI models
GPT-5.2 - Instant | Oct 2025 | ~400K | ~$1.75 | ~$14 | Faster, lower-latency variant of GPT-5.2
GPT-5 Mini | Aug 2025 | ~400K | ~$0.25 | ~$2 | Cost-efficient; good for well-defined tasks at lower cost
GPT-5 Nano | Aug 2025 | ~400K | ~$0.05 | ~$0.40 | Cheapest & fastest GPT-5 variant; great for summarization and classification workloads

Google Models

Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes
Gemini 3.5 Flash (New) | Jun 2026 | ~1M | ~$1.50 | ~$9.00 | Google's efficiency-optimized frontier model designed for high-volume, agentic, and coding workloads
Gemini 3.1 Pro - Low | Feb 2026 | ~1M | ~$2.00 | ~$12.00 | Google's latest Pro model; improved reasoning, multimodal, and agentic capabilities
Gemini 3.1 Pro - High | Feb 2026 | ~1M | ~$2.00 | ~$12.00 | Higher reasoning effort variant of Gemini 3.1 Pro
Gemini 3.1 Flash Lite - Low | Feb 2026 | ~1M | ~$0.25 | ~$1.50 | Most cost-efficient Google model; optimized for high-volume agentic tasks
Gemini 3.1 Flash Lite - High | Feb 2026 | ~1M | ~$0.25 | ~$1.50 | Higher reasoning effort variant of Gemini 3.1 Flash Lite
Gemini 3 Pro - Low | Nov 2025 | ~1M | ~$2.00 | ~$12.00 | Best-in-class reasoning & multimodal from Google; massive context window
Gemini 3 Pro - High | Nov 2025 | ~1M | ~$2.00 | ~$12.00 | Higher reasoning effort variant
Gemini 3 Flash - Low | Dec 2025 | ~1M | ~$0.50 | ~$3.00 | Fast, efficient; combines Gemini 3 Pro reasoning with Flash-level latency and cost
Gemini 3 Flash - High | Dec 2025 | ~1M | ~$0.50 | ~$3.00 | Higher reasoning effort Flash variant

Note: In the Agent Builder, some models offer both a standard and a "Thinking" variant. The Thinking variant supports deeper reasoning but may come with higher cost and latency. For most agents, the standard variant is recommended unless your use case requires complex multi-step reasoning.
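
To see how the per-1M-token prices in the tables above turn into a per-run cost, the arithmetic can be sketched as below. This is an illustration only: the function name `estimate_cost` and the token counts are hypothetical, the prices are the approximate figures from the tables, and actual billing may differ.

```python
# Approximate prices in USD per 1M tokens (input, output), copied from the tables above.
PRICES = {
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5 Nano": (0.05, 0.40),
    "Gemini 3.1 Flash Lite": (0.25, 1.50),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the approximate USD cost of a single run (hypothetical helper)."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical run: 20,000 input tokens and 2,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${estimate_cost(model, 20_000, 2_000):.4f}")
```

With these assumed token counts, a Claude Sonnet 4.6 run comes to about $0.09. Because Thinking variants tend to produce more output tokens, raising `output_tokens` in the call gives a rough sense of the added cost.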


How to Choose the Right Model

AI models are continuously improving — what is "best" today may be surpassed in weeks or months. NinjaCat will continue evaluating and adding models that demonstrate better intelligence, efficiency, or performance.

General guidance:

  • For most agents: Claude Sonnet 4.6 (default) is the best starting point — strong performance at reasonable cost.
  • For the most complex, high-effort, or long-horizon tasks: Claude Opus 4.8 or Opus 4.7 — Anthropic's highest-capability models available on the platform. (Note: Claude Fable 5 is currently unavailable due to a government directive — see the Anthropic models table above.)
  • For complex reasoning or coding tasks: Claude Opus 4.7, GPT-5.4 - Thinking, or GPT-5.2 - Thinking.
  • For speed or cost-sensitive tasks: Claude Haiku 4.5, GPT-5 Nano, Gemini 3.1 Flash Lite, or Gemini 3 Flash.
  • For large context windows: Claude Opus 4.8 or Opus 4.7, GPT-5.5, the GPT-5.4 series, or any Google Gemini model (all ~1M).
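
The guidance above can be read as a simple decision rule. The sketch below is illustrative only: `suggest_models`, the need labels, and the order in which needs are checked are all assumptions made for this example, not part of NinjaCat. It covers three of the task profiles and falls back to the default model.

```python
# Task profiles and candidate models, taken from the guidance bullets above.
# The priority order (first match wins) is an assumption for this sketch.
GUIDANCE = [
    ("long_horizon", ["Claude Opus 4.8", "Claude Opus 4.7"]),
    ("complex_reasoning", ["Claude Opus 4.7", "GPT-5.4 - Thinking", "GPT-5.2 - Thinking"]),
    ("cost_sensitive", ["Claude Haiku 4.5", "GPT-5 Nano", "Gemini 3.1 Flash Lite", "Gemini 3 Flash"]),
]

# Default model for most agents.
DEFAULT = ["Claude Sonnet 4.6"]


def suggest_models(needs):
    """Return candidate models for a set of need labels (hypothetical helper)."""
    for need, models in GUIDANCE:
        if need in needs:
            return models
    return DEFAULT


if __name__ == "__main__":
    print(suggest_models({"cost_sensitive"}))
    print(suggest_models(set()))
```

Treat the result as a starting point for testing rather than a final choice, since the best model for a given agent may change as new models are added.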

For the latest information from each provider, see their documentation:

  • Anthropic Claude Models
  • OpenAI GPT-5 Prompting Guide
  • Google Gemini

Note: When switching between models, prompt adjustments may be required to maintain optimal Agent performance. We will provide further guidance on prompt modifications as we continue testing and learning.