Source: OCI Official Documentation | March 2026 | 30+ models
| Model Name | Model ID | Tier | Parameters | Context Window | Multimodal | Fine-tunable | Tool Use / Agents | Reasoning | RAG Optimized | Status | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cohere Command A Reasoning | cohere.command-a-reasoning | ★ Flagship Reasoning | 111B | 256K | ✗ | ✗ | ✓ | ✓ Advanced | ✓ | ● GA (Aug 2025) | Complex Q&A, multi-step reasoning, document analysis, structured arguments |
| Cohere Command A Vision | cohere.command-a-vision | ★ Flagship Multimodal | 112B | 128K | ✓ Images, Charts, Docs | ✗ | ✓ | ✓ | ✓ | ● GA (Jul 2025) | Enterprise document understanding with charts & images |
| Cohere Command A | cohere.command-a-03-2025 | ◉ Flagship Chat | 111B | 256K | ✗ | ✗ | ✓ Advanced | — | ✓ | ● GA (Mar 2025) | Agentic enterprise tasks, RAG, multilingual, high-throughput production |
| Cohere Command R+ 08-2024 | cohere.command-r-plus-08-2024 | ◉ Advanced | 104B | 128K | ✗ | ✗ | ✓ | — | ✓ | ● Active | Complex specialized tasks, Q&A, sentiment, multilingual RAG |
| Cohere Command R 08-2024 | cohere.command-r-08-2024 | ▷ Standard | 35B | 128K | ✗ | ✓ T-Few / Vanilla | ✓ | — | ✓ Optimized | ● Active | RAG pipelines, info retrieval, cost-efficient enterprise chat |

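To show how a model ID from this table is actually consumed, here is a minimal sketch of the request body for the OCI Generative AI chat endpoint. Field names follow the public REST reference (`servingMode`, `chatRequest`, `apiFormat`); the compartment OCID and the helper function itself are placeholder assumptions, not part of any SDK.

```python
def build_cohere_chat_body(compartment_id: str, message: str,
                           model_id: str = "cohere.command-a-03-2025",
                           max_tokens: int = 600) -> dict:
    """Assemble an on-demand chat request body in the Cohere API format.

    Field names are modeled on the OCI Generative AI REST reference;
    verify against current docs before use. compartment_id is a placeholder.
    """
    return {
        "compartmentId": compartment_id,
        "servingMode": {
            "servingType": "ON_DEMAND",
            "modelId": model_id,       # any Cohere chat model ID from the table
        },
        "chatRequest": {
            "apiFormat": "COHERE",     # Cohere models use the COHERE request format
            "message": message,
            "maxTokens": max_tokens,
        },
    }

body = build_cohere_chat_body("ocid1.compartment.oc1..example",
                              "Summarize our Q3 report.")
```

Swapping in `cohere.command-a-reasoning` or `cohere.command-r-08-2024` only changes the `modelId`; the body shape stays the same for all Cohere chat models.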
| Model Name | Model ID | Tier | Context Window | Multimodal | Thinking / Reasoning | Speed Profile | Fine-tunable on OCI | Status | Best For |
|---|---|---|---|---|---|---|---|---|---|
| Google Gemini 2.5 Pro | google.gemini-2.5-pro | ★ Flagship | 1M+ | ✓ Text, Image, Code, Audio, Video | ✓ Advanced Reasoning | Deep / Accurate | ✗ | ● GA | Most complex multimodal problems, large dataset analysis, SOTA reasoning tasks |
| Google Gemini 2.5 Flash | google.gemini-2.5-flash | ◉ Balanced | 1M | ✓ Text, Image, Code, Audio, Video | ✓ Thinking features | Fast + Smart | ✗ | ● GA | Balanced workloads needing speed + intelligence, complex applications |
| Google Gemini 2.5 Flash-Lite | google.gemini-2.5-flash-lite | ○ Budget / Fast | 1M | ✓ Text, Image, Code, Audio, Video | ✗ | Lowest Latency | ✗ | ● GA | High-volume, simpler tasks; cost-sensitive production workloads |

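The three Gemini tiers above trade depth for latency and cost. One way to encode that trade-off is a small routing rule; this helper is a hypothetical example (not an Oracle or Google API) that maps workload traits to the model IDs in the table.

```python
def pick_gemini_model(needs_deep_reasoning: bool, latency_sensitive: bool) -> str:
    """Map workload traits to a Gemini 2.5 tier from the table above.

    Hypothetical routing rule for illustration only.
    """
    if needs_deep_reasoning:
        return "google.gemini-2.5-pro"         # Deep / Accurate tier
    if latency_sensitive:
        return "google.gemini-2.5-flash-lite"  # lowest latency, budget tier
    return "google.gemini-2.5-flash"           # balanced speed + intelligence
```

A production router would also weigh request size against the 1M-token context windows and per-token pricing, which this sketch omits.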
| Model Name | Model ID | Tier | Architecture | Total Params | Active Params (MoE) | Context Window | Multimodal | Fine-tunable (LoRA) | Tool Use | Status | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama 4 Maverick | meta.llama-4-maverick-17b-128e-instruct-fp8 | ★ Flagship MoE | MoE · 128 Experts | ~400B | 17B active | 512K | ✓ Text + Image | ✗ | ✓ | ● GA (2025) | Multimodal understanding, multilingual, coding, agentic systems, large-scale inference |
| Llama 4 Scout | meta.llama-4-scout-17b-16e-instruct | ◉ Efficient MoE | MoE · 16 Experts | ~109B | 17B active | 192K | ✓ Text + Image | ✗ | ✓ | ● GA (2025) | Smaller GPU deployments, efficient multimodal, multilingual, coding |
| Llama 3.3 70B | meta.llama-3.3-70b-instruct | ◉ Best Text 70B | Dense | 70B | 70B | 128K | ✗ Text only | ✓ LoRA | ✓ | ● GA | Best text-only 70B tasks; outperforms 3.1 70B and 3.2 90B on text benchmarks |
| Llama 3.2 90B Vision | meta.llama-3.2-90b-vision-instruct | ◉ Vision Flagship | Dense + Vision | 90B | 90B | 128K | ✓ Text + Image | ✗ | ✓ | ● Active | Multimodal understanding with large model capacity, image reasoning |
| Llama 3.2 11B Vision | meta.llama-3.2-11b-vision-instruct | ▷ Compact Vision | Dense + Vision | 11B | 11B | 128K | ✓ Text + Image | ✗ | ✓ | ● Active (Dedicated only) | Cost-efficient multimodal; resource-constrained deployments |
| Llama 3.1 405B | meta.llama-3.1-405b-instruct | ★ Largest Open | Dense | 405B | 405B | 128K | ✗ Text only | ✗ | ✓ | ● Active | Highest text quality open model; complex reasoning, advanced generation |

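Two columns in the Llama table above drive most selection decisions: Multimodal and Fine-tunable (LoRA). The sketch below encodes that choice as a hypothetical helper; the decision rule is illustrative, derived only from the table (per which Llama 3.3 70B is the sole LoRA-tunable entry).

```python
def pick_llama_model(needs_vision: bool, needs_fine_tuning: bool) -> str:
    """Choose a Meta Llama model ID using the capability columns above.

    Hypothetical decision rule for illustration: per the table, only
    Llama 3.3 70B is LoRA fine-tunable on OCI, and none of the
    vision-capable entries support fine-tuning.
    """
    if needs_fine_tuning:
        if needs_vision:
            raise ValueError("No vision-capable Llama model in the table "
                             "is fine-tunable on OCI")
        return "meta.llama-3.3-70b-instruct"  # ✓ LoRA, best text-only 70B
    if needs_vision:
        return "meta.llama-4-maverick-17b-128e-instruct-fp8"  # flagship multimodal MoE
    return "meta.llama-3.1-405b-instruct"     # highest text quality, no tuning
```

Cost-constrained deployments might instead return Scout or the 11B Vision model in the vision branch; the table's Best For column gives those alternatives.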
| Model Name | Model ID | Tier | Parameters | Context Window | Reasoning / Agentic | Tool Use | Open Source | Fine-tunable on OCI | Status | Best For |
|---|---|---|---|---|---|---|---|---|---|---|
| OpenAI gpt-oss-120b | openai.gpt-oss-120b | ★ Flagship OSS | 120B | 128K | ✓ Advanced Reasoning + Agentic | ✓ Advanced Tool Use | ✓ Open weights | ✗ | ● GA | Reasoning, agentic tasks; outperforms similar-size open models; OpenAI-compatible API |
| OpenAI gpt-oss-20b | openai.gpt-oss-20b | ▷ Efficient OSS | 20B | 128K | ✓ Reasoning + Agentic | ✓ | ✓ Open weights | ✗ | ● GA | Efficient consumer-hardware-optimized reasoning; agentic tasks at lower cost |

| Model Name | Model ID | Tier | Context Window | Multimodal | Thinking / Chain-of-Thought | Coding Focus | Speed Profile | Domain Knowledge | Agentic | Status | Best For |
|---|---|---|---|---|---|---|---|---|---|---|---|
| xAI Grok 4 | xai.grok-4 | ★ Flagship | 128K | ✓ Text + Image | ✓ | ✓ | Standard | ✓ Finance, Health, Law, Science | ✓ | ● GA | Enterprise data extraction, coding, summarization with deep domain knowledge |
| xAI Grok 4 Fast | xai.grok-4-fast-reasoning / xai.grok-4-fast-non-reasoning | ▷ Fast Flagship | 2M | ✓ Text + Image | ✓ | ✓ | Speed Optimized | ✓ Finance, Health, Law, Science | ✓ | ● GA | Same capability as Grok 4 with 2M context; cost-speed-optimized production |
| xAI Grok 4.1 Fast | xai.grok-4-1-fast-reasoning / xai.grok-4-1-fast-non-reasoning | ★ Agentic Flagship | 2M | ✓ Text + Image | ✓ Advanced | ✓ | High Speed | ✓ | ✓ Parallel Tool Calling | ● GA | Complex agentic systems, customer support, research; 3x fewer hallucinations vs Grok 4 |
| xAI Grok 3 | xai.grok-3 | ◉ Standard | 131K | ✗ | ✗ | ✓ | Standard | ✓ Finance, Health, Law, Science | ✓ | ● GA | General enterprise tasks, data extraction, text summarization |
| xAI Grok 3 Fast | xai.grok-3-fast | ▷ Standard Fast | 131K | ✗ | ✗ | ✓ | Speed Optimized | ✓ | ✓ | ● GA | High-throughput enterprise tasks at Grok 3 quality level |
| xAI Grok 3 Mini | xai.grok-3-mini | ○ Lightweight Thinker | 131K | ✗ | ✓ Traces exposed | ✓ | Standard | — | ✓ | ● GA | Logic-based tasks not requiring deep domain knowledge; transparent thinking traces |
| xAI Grok 3 Mini Fast | xai.grok-3-mini-fast | ○ Lightweight Fast | 131K | ✗ | ✓ Traces exposed | ✓ | Fastest | — | ✓ | ● GA | Low-latency logic tasks at minimum cost |
| xAI Grok Code Fast 1 | xai.grok-code-fast-1 | ◆ Coding Specialist | 256K | ✗ | ✓ Summarized traces | ✓ Specialized | Very Fast | — | ✓ Agentic Coding | ● GA (Aug 2025) | TypeScript, Python, Java, Rust, C++, Go; zero-to-one projects, bug fixes, agentic coding loops |

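Grok 4 Fast and Grok 4.1 Fast each expose two OCI model IDs, a `-reasoning` and a `-non-reasoning` variant. Since the IDs differ only by version segment and suffix, a small helper can assemble the right one; the function itself is illustrative, not part of any SDK.

```python
def grok_fast_model_id(version: str, reasoning: bool) -> str:
    """Return the OCI model ID for a Grok Fast variant.

    version is "4" (Grok 4 Fast) or "4-1" (Grok 4.1 Fast); these are the
    only Grok models in the tables with dual reasoning/non-reasoning IDs.
    Illustrative helper; verify IDs against the OCI model listing.
    """
    if version not in ("4", "4-1"):
        raise ValueError("Fast variants exist only for Grok 4 and Grok 4.1")
    suffix = "reasoning" if reasoning else "non-reasoning"
    return f"xai.grok-{version}-fast-{suffix}"
```

The `-reasoning` variant emits thinking tokens (slower, more deliberate); the `-non-reasoning` variant answers immediately, which suits latency-bound production paths.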
| Model Name | Model ID | Generation | Multimodal (Image) | Language Scope | Size Variant | Use Case | Status |
|---|---|---|---|---|---|---|---|
| Embed 4 | cohere.embed-v4.0 | Gen 4 · Latest | ✓ Text + Image (base64) | Multilingual | Full | Latest multimodal embeddings; text & image semantic search | ● Active |
| Embed English Image 3 | cohere.embed-english-image-v3.0 | Gen 3 | ✓ Text + Image | English | Full | English-only text+image semantic search | ● Active |
| Embed English Light Image 3 | cohere.embed-english-light-image-v3.0 | Gen 3 | ✓ Text + Image | English | Light | Cost-efficient English text+image embeddings | ● Active |
| Embed Multilingual Image 3 | cohere.embed-multilingual-image-v3.0 | Gen 3 | ✓ Text + Image | Multilingual | Full | Global multilingual text+image semantic search | ● Active |
| Embed Multilingual Light Image 3 | cohere.embed-multilingual-light-image-v3.0 | Gen 3 | ✓ Text + Image | Multilingual | Light | Budget multilingual text+image embeddings | ● Active |
| Embed English 3 | cohere.embed-english-v3.0 | Gen 3 | ✗ Text only | English | Full | Pure text English semantic search, classification, clustering | ● Active |
| Embed English Light 3 | cohere.embed-english-light-v3.0 | Gen 3 | ✗ Text only | English | Light | Cost-efficient English text embeddings at scale | ● Active |
| Embed Multilingual 3 | cohere.embed-multilingual-v3.0 | Gen 3 | ✗ Text only | Multilingual | Full | Global enterprise text semantic search in 100+ languages | ● Active |
| Embed Multilingual Light 3 | cohere.embed-multilingual-light-v3.0 | Gen 3 | ✗ Text only | Multilingual | Light | Affordable multilingual text embeddings at volume | ● Active |

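All of the embed models above return one float vector per input, and semantic search works by comparing those vectors, typically with cosine similarity. The sketch below shows that downstream step with toy vectors (not real model output); the vector values and document names are made up for illustration.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy vectors standing in for embed-model output; real vectors have
# hundreds or thousands of dimensions.
query_vec = [0.1, 0.8, 0.3]
doc_vecs = {"doc-a": [0.1, 0.7, 0.4], "doc-b": [0.9, 0.0, 0.1]}

# Rank documents by similarity to the query, highest first.
ranked = sorted(doc_vecs,
                key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                reverse=True)
```

In a real pipeline the query and documents would each be embedded with the same model (mixing models from different rows of the table produces incomparable vectors).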
| Model Name | Model ID | Input | Output | Use Case | Status |
|---|---|---|---|---|---|
| Rerank 3.5 | cohere.rerank.v3-5 | Query + List of texts | Ordered array with relevance scores | RAG pipelines, document ranking, search result reordering, precision improvement | ● Active |

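The rerank output described above is an ordered array of relevance scores, each pointing back at an input document. The sketch below shows how a RAG pipeline consumes that shape; the `index` / `relevanceScore` field names are an assumption modeled on Cohere's rerank output, and the sample documents and scores are invented.

```python
def apply_rerank(documents: list[str], results: list[dict], top_n: int = 3) -> list[str]:
    """Reorder documents using rerank results.

    Each result entry holds an index into the input list and a relevance
    score; field names here are assumed, so check them against the actual
    response schema before use.
    """
    ranked = sorted(results, key=lambda r: r["relevanceScore"], reverse=True)
    return [documents[r["index"]] for r in ranked[:top_n]]

# Invented example data standing in for a rerank response.
docs = ["refund policy", "shipping times", "warranty terms"]
results = [{"index": 0, "relevanceScore": 0.91},
           {"index": 1, "relevanceScore": 0.12},
           {"index": 2, "relevanceScore": 0.55}]
```

`apply_rerank(docs, results, 2)` keeps only the two most relevant passages, which is the usual precision-improvement step before handing context to a chat model.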
¹ Parameter counts are shown only when officially disclosed by the provider. Proprietary models (Google Gemini, xAI Grok) do not publish parameter counts and are omitted. "—" means not publicly disclosed.
² Fine-tuning on OCI uses dedicated AI clusters (GPU resources belonging exclusively to your tenancy). Cohere supports T-Few & Vanilla strategies; Meta Llama supports LoRA.
³ Retired/deprecated models (Command R legacy, Llama 3 70B, Llama 3.1 70B) are omitted from the main tables.
⁴ Model Import feature (GA 2025) lets you bring your own LLMs from Hugging Face or OCI Object Storage.
⁵ Grok 4 Fast and Grok 4.1 Fast each expose two OCI model IDs: a -reasoning variant (thinking tokens, chain-of-thought) and a -non-reasoning variant (instant responses, no thinking tokens).
⁶ Data source: OCI Official Documentation — docs.oracle.com/en-us/iaas/Content/generative-ai/ — March 2026.