OCI · GENERATIVE AI

Model Reference 2026

Source: OCI Official Documentation  |  March 2026  |  30+ models

Provider Context
5 Model Providers  ·  24 Chat Models (Active)  ·  9 Embedding Models  ·  1 Rerank Model
Chat Models — Cohere Family
Model Name Model ID Tier Parameters Context Window Multimodal Fine-tunable Tool Use / Agents Reasoning RAG Optimized Status Best For
Cohere
Command A Reasoning
cohere.command-a-reasoning ★ Flagship Reasoning 111B 256K ✓ Advanced ● GA (Aug 2025) Complex Q&A, multi-step reasoning, document analysis, structured arguments
Cohere
Command A Vision
cohere.command-a-vision ★ Flagship Multimodal 112B 128K ✓ Images, Charts, Docs ● GA (Jul 2025) Enterprise document understanding with charts & images
Cohere
Command A
cohere.command-a-03-2025 ◉ Flagship Chat 111B 256K ✓ Advanced ● GA (Mar 2025) Agentic enterprise tasks, RAG, multilingual, high-throughput production
Cohere
Command R+ 08-2024
cohere.command-r-plus-08-2024 ◉ Advanced 104B 128K ● Active Complex specialized tasks, Q&A, sentiment, multilingual RAG
Cohere
Command R 08-2024
cohere.command-r-08-2024 ▷ Standard 35B 128K ✓ T-Few / Vanilla ✓ Optimized ● Active RAG pipelines, info retrieval, cost-efficient enterprise chat
Chat Models — Google Gemini Family
Model Name Model ID Tier Context Window Multimodal Thinking / Reasoning Speed Profile Fine-tunable on OCI Status Best For
Google
Gemini 2.5 Pro
google.gemini-2.5-pro ★ Flagship 1M+ ✓ Text, Image, Code, Audio, Video ✓ Advanced Reasoning Deep / Accurate ● GA Most complex multimodal problems, large dataset analysis, SOTA reasoning tasks
Google
Gemini 2.5 Flash
google.gemini-2.5-flash ◉ Balanced 1M ✓ Text, Image, Code, Audio, Video ✓ Thinking features Fast + Smart ● GA Balanced workloads needing speed + intelligence, complex applications
Google
Gemini 2.5 Flash-Lite
google.gemini-2.5-flash-lite ○ Budget / Fast 1M ✓ Text, Image, Code, Audio, Video Lowest Latency ● GA High-volume, simpler tasks; cost-sensitive production workloads
Chat Models — Meta Llama Family
Model Name Model ID Tier Architecture Total Params Active Params (MoE) Context Window Multimodal Fine-tunable (LoRA) Tool Use Status Best For
Meta
Llama 4 Maverick
meta.llama-4-maverick-17b-128e-instruct-fp8 ★ Flagship MoE MoE · 128 Experts ~400B 17B active 512K ✓ Text + Image ● GA (2025) Multimodal understanding, multilingual, coding, agentic systems, large-scale inference
Meta
Llama 4 Scout
meta.llama-4-scout-17b-16e-instruct ◉ Efficient MoE MoE · 16 Experts ~109B 17B active 192K ✓ Text + Image ● GA (2025) Smaller GPU deployments, efficient multimodal, multilingual, coding
Meta
Llama 3.3 70B
meta.llama-3.3-70b-instruct ◉ Best Text 70B Dense 70B 70B 128K ✗ Text only ✓ LoRA ● GA Best text-only 70B tasks; outperforms 3.1 70B and 3.2 90B on text benchmarks
Meta
Llama 3.2 90B Vision
meta.llama-3.2-90b-vision-instruct ◉ Vision Flagship Dense + Vision 90B 90B 128K ✓ Text + Image ● Active Multimodal understanding with large model capacity, image reasoning
Meta
Llama 3.2 11B Vision
meta.llama-3.2-11b-vision-instruct ▷ Compact Vision Dense + Vision 11B 11B 128K ✓ Text + Image ● Active (Dedicated only) Cost-efficient multimodal; resource-constrained deployments
Meta
Llama 3.1 405B
meta.llama-3.1-405b-instruct ★ Largest Open Dense 405B 405B 128K ✗ Text only ● Active Highest text quality open model; complex reasoning, advanced generation
Chat Models — OpenAI gpt-oss Family
Model Name Model ID Tier Parameters Context Window Reasoning / Agentic Tool Use Open Source Fine-tunable on OCI Status Best For
OpenAI
gpt-oss-120b
openai.gpt-oss-120b ★ Flagship OSS 120B 128K ✓ Advanced Reasoning + Agentic ✓ Advanced Tool Use ✓ Open weights ● GA Reasoning, agentic tasks; outperforms similar-size open models; OpenAI-compatible API
OpenAI
gpt-oss-20b
openai.gpt-oss-20b ▷ Efficient OSS 20B 128K ✓ Reasoning + Agentic ✓ Open weights ● GA Efficient consumer-hardware-optimized reasoning; agentic tasks at lower cost
Chat Models — xAI Grok Family
Model Name Model ID Tier Context Window Multimodal Thinking / Chain-of-Thought Coding Focus Speed Profile Domain Knowledge Agentic Status Best For
xAI
Grok 4
xai.grok-4 ★ Flagship 128K ✓ Text + Image Standard ✓ Finance, Health, Law, Science ● GA Enterprise data extraction, coding, summarization with deep domain knowledge
xAI
Grok 4 Fast
xai.grok-4-fast-reasoning
xai.grok-4-fast-non-reasoning
▷ Fast Flagship 2M ✓ Text + Image Speed Optimized ✓ Finance, Health, Law, Science ● GA Same capability as Grok 4 with a 2M context window; cost- and speed-optimized for production
xAI
Grok 4.1 Fast
xai.grok-4-1-fast-reasoning
xai.grok-4-1-fast-non-reasoning
★ Agentic Flagship 2M ✓ Text + Image ✓ Advanced High Speed ✓ Parallel Tool Calling ● GA Complex agentic systems, customer support, research — 3x fewer hallucinations vs Grok 4
xAI
Grok 3
xai.grok-3 ◉ Standard 131K Standard ✓ Finance, Health, Law, Science ● GA General enterprise tasks, data extraction, text summarization
xAI
Grok 3 Fast
xai.grok-3-fast ▷ Standard Fast 131K Speed Optimized ● GA High-throughput enterprise tasks at Grok 3 quality level
xAI
Grok 3 Mini
xai.grok-3-mini ○ Lightweight Thinker 131K ✓ Traces exposed Standard ● GA Logic-based tasks not requiring deep domain knowledge; transparent thinking traces
xAI
Grok 3 Mini Fast
xai.grok-3-mini-fast ○ Lightweight Fast 131K ✓ Traces exposed Fastest ● GA Low-latency logic tasks at minimum cost
xAI
Grok Code Fast 1
xai.grok-code-fast-1 ◆ Coding Specialist 256K ✓ Summarized traces ✓ Specialized Very Fast ✓ Agentic Coding ● GA (Aug 2025) TypeScript, Python, Java, Rust, C++, Go; zero-to-one projects, bug fixes, agentic coding loops
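As the dual model IDs above show, Grok 4 Fast and Grok 4.1 Fast each ship a -reasoning and a -non-reasoning variant. A minimal sketch of resolving the right ID from the table's naming pattern; the helper name and flag are illustrative, not part of any official SDK:

```python
# Sketch: resolve an OCI model ID for the dual-variant Grok Fast models.
# Follows the naming pattern shown in the table above; illustrative only.
def grok_fast_model_id(family: str, reasoning: bool) -> str:
    """family: 'grok-4' or 'grok-4-1', matching the OCI ID prefixes."""
    suffix = "reasoning" if reasoning else "non-reasoning"
    return f"xai.{family}-fast-{suffix}"

# e.g. grok_fast_model_id("grok-4-1", reasoning=True)
# -> "xai.grok-4-1-fast-reasoning"
```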
Embedding Models
Model Name Model ID Generation Multimodal (Image) Language Scope Size Variant Use Case Status
Cohere
Embed 4
cohere.embed-v4.0 Gen 4 · Latest ✓ Text + Image (base64) Multilingual Full Latest multimodal embeddings; text & image semantic search ● Active
Cohere
Embed English Image 3
cohere.embed-english-image-v3.0 Gen 3 ✓ Text + Image English Full English-only text+image semantic search ● Active
Cohere
Embed English Light Image 3
cohere.embed-english-light-image-v3.0 Gen 3 ✓ Text + Image English Light Cost-efficient English text+image embeddings ● Active
Cohere
Embed Multilingual Image 3
cohere.embed-multilingual-image-v3.0 Gen 3 ✓ Text + Image Multilingual Full Global multilingual text+image semantic search ● Active
Cohere
Embed Multilingual Light Image 3
cohere.embed-multilingual-light-image-v3.0 Gen 3 ✓ Text + Image Multilingual Light Budget multilingual text+image embeddings ● Active
Cohere
Embed English 3
cohere.embed-english-v3.0 Gen 3 ✗ Text only English Full Pure text English semantic search, classification, clustering ● Active
Cohere
Embed English Light 3
cohere.embed-english-light-v3.0 Gen 3 ✗ Text only English Light Cost-efficient English text embeddings at scale ● Active
Cohere
Embed Multilingual 3
cohere.embed-multilingual-v3.0 Gen 3 ✗ Text only Multilingual Full Global enterprise text semantic search in 100+ languages ● Active
Cohere
Embed Multilingual Light 3
cohere.embed-multilingual-light-v3.0 Gen 3 ✗ Text only Multilingual Light Affordable multilingual text embeddings at volume ● Active
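The Gen 3 embedding IDs above follow a regular pattern along the table's three axes (language scope, size variant, image support). A small sketch that assembles an ID from those axes; the helper is illustrative and covers only the Gen 3 family:

```python
# Sketch: build a Cohere Gen 3 embedding model ID from the three table
# axes (language scope, size variant, image support). Illustrative only;
# Embed 4 (cohere.embed-v4.0) does not follow this naming pattern.
def cohere_embed_v3_id(multilingual: bool, light: bool, image: bool) -> str:
    parts = ["cohere.embed"]
    parts.append("multilingual" if multilingual else "english")
    if light:
        parts.append("light")
    if image:
        parts.append("image")
    parts.append("v3.0")
    return "-".join(parts)

# e.g. cohere_embed_v3_id(multilingual=True, light=False, image=True)
# -> "cohere.embed-multilingual-image-v3.0"
```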
Rerank Model
Model Name Model ID Input Output Use Case Status
Cohere
Rerank 3.5
cohere.rerank.v3-5 Query + List of texts Ordered array with relevance scores RAG pipelines, document ranking, search result reordering, precision improvement ● Active
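The rerank I/O contract above (a query plus a list of texts in, an array ordered by relevance score out) can be mimicked locally. The toy token-overlap scorer below only illustrates the response shape; it is not Cohere Rerank 3.5 and its scores mean nothing beyond this sketch:

```python
# Toy illustration of the rerank contract: query + texts -> ordered array
# with relevance scores. The scoring is a crude token-overlap heuristic,
# NOT the Rerank 3.5 model; it only mimics the shape of the output.
def toy_rerank(query, documents):
    q_tokens = set(query.lower().split())
    results = []
    for idx, doc in enumerate(documents):
        overlap = len(q_tokens & set(doc.lower().split()))
        score = overlap / max(len(q_tokens), 1)  # crude relevance in [0, 1]
        results.append({"index": idx, "relevance_score": score})
    # Highest-scoring document first, as a rerank endpoint returns
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return results

docs = [
    "OCI dedicated AI clusters for fine-tuning",
    "Reranking reorders search results by relevance to the query",
    "Embedding models map text to vectors",
]
ranked = toy_rerank("rerank search results by relevance", docs)
# ranked[0] points at the second document, the only one sharing terms
# with the query
```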
Use-Case Selection Guide

🔍 RAG / Document Search

🥇
Cohere Command A Reasoning
256K context, built for RAG, advanced reasoning over documents
🥈
Cohere Command R 08-2024
RAG-optimized, 128K context, fine-tunable, cost-efficient
🥉
Google Gemini 2.5 Pro
1M+ context — handle entire documents in one pass

🤖 Agentic / Tool-Use Workflows

🥇
xAI Grok 4.1 Fast
2M context, parallel tool calls, 3x lower hallucinations
🥈
Cohere Command A
256K context, best throughput for Cohere agentic tasks
🥉
OpenAI gpt-oss-120b
Advanced tool use, reasoning, OpenAI-compatible API

💻 Code Generation

🥇
xAI Grok Code Fast 1
Specialized coding model — plan, write, test, debug loop
🥈
Meta Llama 4 Maverick
MoE, strong coding + tool-calling capabilities
🥉
Google Gemini 2.5 Pro
Top-tier code reasoning, debugging, complex architecture

🌍 Multimodal (Text + Image)

🥇
Google Gemini 2.5 Pro
Best multimodal — text, image, code, audio, video
🥈
Cohere Command A Vision
Enterprise-focused image, chart, document understanding
🥉
Meta Llama 4 Maverick
Open-weight multimodal with MoE efficiency

⚡ High-Volume / Low-Latency

🥇
Google Gemini 2.5 Flash-Lite
Fastest + cheapest in Gemini family; 1M context
🥈
xAI Grok 3 Mini Fast
Lightweight thinker, lowest latency xAI model
🥉
OpenAI gpt-oss-20b
Consumer-grade hardware optimized, fast reasoning

🏢 Enterprise Fine-Tuning

🥇
Cohere Command R 08-2024
T-Few + Vanilla fine-tuning on dedicated AI clusters
🥈
Meta Llama 3.3 70B
LoRA fine-tuning, best 70B text performance

🌐 Multilingual Applications

🥇
Cohere Command A
Native multilingual support, 256K context, high-throughput production
🥈
Meta Llama 4 Scout / Maverick
Open-weight multilingual models with strong cross-language performance
🔤
Cohere Embed Multilingual 3 / Embed 4
Semantic search in 100+ languages; Gen 4 adds multimodal

📄 Long-context Document Analysis

🥇
xAI Grok 4.1 Fast
2M tokens — process full codebases or entire document archives in one pass
🥈
Google Gemini 2.5 Pro
1M+ context with multimodal; ideal for large PDFs, reports, mixed media
🥉
xAI Grok 4 Fast
2M context at optimized cost; great for batch document workloads
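The podium picks above collapse into a simple lookup table. The category keys below are this sketch's own naming, the model IDs are copied from the tables in this document, and for Grok 4.1 Fast the -reasoning variant is chosen arbitrarily here:

```python
# Sketch: top pick per use case from the selection guide above.
# Keys are illustrative; model IDs come from this document's tables.
TOP_PICK = {
    "rag": "cohere.command-a-reasoning",
    "agentic": "xai.grok-4-1-fast-reasoning",  # -reasoning variant assumed
    "coding": "xai.grok-code-fast-1",
    "multimodal": "google.gemini-2.5-pro",
    "high_volume": "google.gemini-2.5-flash-lite",
    "fine_tuning": "cohere.command-r-08-2024",
    "multilingual": "cohere.command-a-03-2025",
    "long_context": "xai.grok-4-1-fast-reasoning",
}

def recommend(use_case: str) -> str:
    """Return the guide's first-choice model ID for a use case."""
    return TOP_PICK[use_case.lower()]
```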
LEGEND
★ Flagship / Best-in-class
◉ Balanced / Advanced
▷ Speed / Efficiency tier
○ Lightweight / Budget
◆ Specialized
2M / 1M+ = context window ≥ 1M tokens
256K = context tier covering 192K–512K tokens
128K = context window of 128K tokens
✓ = feature supported
✗ = not supported
MoE = Mixture of Experts (sparse activation)

¹ Parameter counts are shown only when officially disclosed by the provider. Proprietary models (Google Gemini, xAI Grok) do not publish parameter counts, so those cells are left blank ("—" means not publicly disclosed).

² Fine-tuning on OCI uses dedicated AI clusters (GPU resources belonging exclusively to your tenancy). Cohere supports T-Few & Vanilla strategies; Meta Llama supports LoRA.

³ Retired/deprecated models (Command R legacy, Llama 3 70B, Llama 3.1 70B) are omitted from the main tables.

⁴ Model Import feature (GA 2025) lets you bring your own LLMs from Hugging Face or OCI Object Storage.

⁵ Grok 4 Fast and Grok 4.1 Fast each expose two OCI model IDs: a -reasoning variant (thinking tokens, chain-of-thought) and a -non-reasoning variant (instant responses, no thinking tokens).

⁶ Data source: OCI Official Documentation, docs.oracle.com/en-us/iaas/Content/generative-ai/, March 2026.

GitHub Repository: @enricopesce