Model Selection
Available Model Options in NinjaCat
Anthropic Models
| Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes |
|---|---|---|---|---|---|
| Claude Opus 4.8 (New) | Jun 2026 | ~1M | ~$5 | ~$25 | Latest Opus model from Anthropic; sharper reasoning and stronger performance across complex tasks. |
| Claude Opus 4.7 | Apr 2026 | ~1M | ~$5 | ~$25 | Previous Opus model (Claude Opus 4.8 is now the latest); expanded ~1M context window |
| Claude Opus 4.7 - Thinking | Apr 2026 | ~1M | ~$5 | ~$25 | Enhanced reasoning variant of Opus 4.7; uses more output tokens due to adaptive thinking |
| Claude Opus 4.6 | Feb 2026 | ~200K | ~$5 | ~$25 | Superseded by Claude Opus 4.7. Per-token pricing is the same, but 4.6 uses fewer tokens, so it remains a good choice for less complex tasks. |
| Claude Opus 4.6 - Thinking | Feb 2026 | ~200K | ~$5 | ~$25 | Enhanced reasoning variant of Opus 4.6 |
| Claude Sonnet 4.6 (Default) | Feb 2026 | ~200K | ~$3 | ~$15 | New default for all newly created Agents. Best-balanced Claude model — strong performance at moderate cost |
| Claude Sonnet 4.6 - Thinking | Feb 2026 | ~200K | ~$3 | ~$15 | Enhanced reasoning variant of Sonnet 4.6 |
| Claude Haiku 4.5 | Oct 2025 | ~200K | ~$1 | ~$5 | Fastest & most affordable Claude option |
| Claude Haiku 4.5 - Thinking | Oct 2025 | ~200K | ~$1 | ~$5 | Enhanced reasoning variant of Haiku 4.5 |
| Claude Fable 5 (No longer available) | Jun 2026 | ~1M | ~$10 | ~$50 | ⚠️ No longer available — See Anthropic's announcement for details. |
OpenAI Models
| Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes |
|---|---|---|---|---|---|
| GPT-5.5 (New) | Jun 2026 | ~1M | ~$5.00 | ~$30 | OpenAI's latest model; improvements to speed and instruction following |
| GPT-5.4 - Instant (New) | Mar 2026 | ~1M | ~$2.50 | ~$15 | 33% fewer errors vs GPT-5.2, massive 1M context window |
| GPT-5.4 - Thinking (New) | Mar 2026 | ~1M | ~$2.50 | ~$15 | Enhanced reasoning variant of GPT-5.4; most token-efficient reasoning model |
| GPT-5.2 - Thinking | Oct 2025 | ~400K | ~$1.75 | ~$14 | Strong reasoning + long context. Default reassignment for agents on removed OpenAI models |
| GPT-5.2 - Instant | Oct 2025 | ~400K | ~$1.75 | ~$14 | Faster, lower-latency variant of GPT-5.2 |
| GPT-5 Mini | Aug 2025 | ~400K | ~$0.25 | ~$2 | Cost-efficient; good for well-defined tasks at lower cost |
| GPT-5 Nano | Aug 2025 | ~400K | ~$0.05 | ~$0.40 | Cheapest & fastest GPT-5 variant; great for summarization and classification workloads |
Google Models
| Model | Release Date | Context Window | Input $ / 1M | Output $ / 1M | Notes |
|---|---|---|---|---|---|
| Gemini 3.5 Flash (New) | Jun 2026 | ~1M | ~$1.50 | ~$9.00 | Google's efficiency-optimized frontier model designed for high-volume, agentic, and coding workloads |
| Gemini 3.1 Pro - Low | Feb 2026 | ~1M | ~$2.00 | ~$12.00 | Google's latest Pro model; improved reasoning, multimodal, and agentic capabilities |
| Gemini 3.1 Pro - High | Feb 2026 | ~1M | ~$2.00 | ~$12.00 | Higher reasoning effort variant of Gemini 3.1 Pro |
| Gemini 3.1 Flash Lite - Low | Feb 2026 | ~1M | ~$0.25 | ~$1.50 | Most cost-efficient Google model; optimized for high-volume agentic tasks |
| Gemini 3.1 Flash Lite - High | Feb 2026 | ~1M | ~$0.25 | ~$1.50 | Higher reasoning effort variant of Gemini 3.1 Flash Lite |
| Gemini 3 Pro - Low | Nov 2025 | ~1M | ~$2.00 | ~$12.00 | Best-in-class reasoning & multimodal from Google; massive context window |
| Gemini 3 Pro - High | Nov 2025 | ~1M | ~$2.00 | ~$12.00 | Higher reasoning effort variant |
| Gemini 3 Flash - Low | Dec 2025 | ~1M | ~$0.50 | ~$3.00 | Fast, efficient; combines Gemini 3 Pro reasoning with Flash-level latency and cost |
| Gemini 3 Flash - High | Dec 2025 | ~1M | ~$0.50 | ~$3.00 | Higher reasoning effort Flash variant |
Note: In the Agent Builder, some models offer both a standard and a "Thinking" variant. The Thinking variant supports deeper reasoning but may come with higher cost and latency. For most agents, the standard variant is recommended unless your use case requires complex multi-step reasoning.
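To make the per-1M-token pricing concrete, here is a minimal sketch of the arithmetic. The prices are the approximate values from the tables above. The function name, the token counts, and the assumption that a Thinking variant triples output tokens are hypothetical and chosen only for illustration; actual token usage varies by agent and prompt.

```python
# Approximate prices in USD per 1M tokens, copied from the tables above
# as (input price, output price). These are estimates, not billing data.
PRICES_PER_MILLION = {
    "Claude Opus 4.8": (5.00, 25.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "GPT-5.4 - Instant": (2.50, 15.00),
    "GPT-5 Nano": (0.05, 0.40),
    "Gemini 3.1 Flash Lite - Low": (0.25, 1.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the approximate USD cost of one run (hypothetical helper)."""
    input_price, output_price = PRICES_PER_MILLION[model]
    cost = (input_tokens / 1_000_000) * input_price
    cost += (output_tokens / 1_000_000) * output_price
    return round(cost, 6)


# Hypothetical run: 20,000 input tokens and 2,000 output tokens.
standard = estimate_cost("Claude Sonnet 4.6", 20_000, 2_000)

# Same run on the Thinking variant, ASSUMING (for illustration only) that
# reasoning triples the output tokens. Per-token prices are unchanged.
thinking = estimate_cost("Claude Sonnet 4.6", 20_000, 6_000)

print(f"standard: ${standard:.2f}, thinking: ${thinking:.2f}")
```

Under these assumptions the Thinking run costs more even though the listed prices are identical, because the extra cost comes entirely from the additional output tokens.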
How to Choose the Right Model
AI models are continuously improving — what is "best" today may be surpassed in weeks or months. NinjaCat will continue evaluating and adding models that demonstrate better intelligence, efficiency, or performance.
General guidance:
- For most agents: Claude Sonnet 4.6 (default) is the best starting point — strong performance at reasonable cost.
- For the most complex, high-effort, or long-horizon tasks: Claude Opus 4.8 or Opus 4.7 — Anthropic's highest-capability models available on the platform. (Note: Claude Fable 5 is currently unavailable due to a government directive — see the Anthropic models table above.)
- For complex reasoning or coding tasks: Claude Opus 4.7, GPT-5.4 - Thinking, or GPT-5.2 - Thinking.
- For speed or cost-sensitive tasks: Claude Haiku 4.5, GPT-5 Nano, Gemini 3.1 Flash Lite, or Gemini 3 Flash.
- For large context windows: Claude Opus 4.8 or 4.7 (~1M), OpenAI GPT-5.5 or GPT-5.4 series (~1M), or Google Gemini series (~1M).
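The guidance above can be sketched as a small lookup that also drops models whose context window is too small for a given prompt. This is an illustrative sketch only: the category keys, the `candidates` function, and the token counts are hypothetical and are not NinjaCat settings or APIs. Model names and context windows are the approximate values from the tables.

```python
# Approximate context windows in tokens, taken from the tables above.
CONTEXT_WINDOW = {
    "Claude Sonnet 4.6": 200_000,
    "Claude Opus 4.8": 1_000_000,
    "Claude Opus 4.7": 1_000_000,
    "Claude Haiku 4.5": 200_000,
    "GPT-5.4 - Thinking": 1_000_000,
    "GPT-5.2 - Thinking": 400_000,
    "GPT-5 Nano": 400_000,
    "Gemini 3.1 Flash Lite": 1_000_000,
    "Gemini 3 Flash": 1_000_000,
}

# Starting points per task category, in the order the guidance lists them.
# The category names are hypothetical labels for this sketch.
GUIDANCE = {
    "general": ["Claude Sonnet 4.6"],
    "long_horizon": ["Claude Opus 4.8", "Claude Opus 4.7"],
    "reasoning_or_coding": [
        "Claude Opus 4.7",
        "GPT-5.4 - Thinking",
        "GPT-5.2 - Thinking",
    ],
    "speed_or_cost": [
        "Claude Haiku 4.5",
        "GPT-5 Nano",
        "Gemini 3.1 Flash Lite",
        "Gemini 3 Flash",
    ],
}


def candidates(task_type: str, prompt_tokens: int = 0) -> list[str]:
    """Return suggested models whose context window fits the prompt."""
    # Unknown categories fall back to the general-purpose default.
    models = GUIDANCE.get(task_type, GUIDANCE["general"])
    return [m for m in models if CONTEXT_WINDOW[m] >= prompt_tokens]


# A hypothetical 500,000-token prompt rules out the ~400K model.
print(candidates("reasoning_or_coding", prompt_tokens=500_000))
```

An empty result (for example, the default model with a prompt larger than ~200K tokens) signals that a larger-context model from another category is needed.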
For the latest information from each provider, see their documentation:
- Anthropic Claude Models
- OpenAI GPT-5 Prompting Guide
- Google Gemini
Note: When switching between models, prompt adjustments may be required to maintain optimal Agent performance. We will provide further guidance on prompt modifications as we continue testing and learning.