OCI · GENERATIVE AI

OCI GenAI Catalog

Source: OCI Official Documentation  |  Updated 17 Apr 2026  |  30+ models

5 Model Providers · 26 Chat Models (Active) · 9 Embedding Models · 1 Rerank Model · 79 Imported Models
Chat Models — Cohere Family
Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Cohere
Command A Reasoning
cohere.command-a-reasoning ★ Flagship Reasoning 111B 256K ✓ Advanced ● GA (Aug 2025) US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Complex Q&A, multi-step reasoning, document analysis, structured arguments
Cohere
Command A Vision
cohere.command-a-vision ★ Flagship Multimodal 112B 128K ✓ Images, Charts, Docs ● GA (Jul 2025) US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Enterprise document understanding with charts & images
Cohere
Command A
cohere.command-a-03-2025 ◉ Flagship Chat 111B 256K ✓ Advanced ● GA (Mar 2025) US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Agentic enterprise tasks, RAG, multilingual, high-throughput production
Cohere
Command R+ 08-2024
cohere.command-r-plus-08-2024 ◉ Advanced 104B 128K ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-OSA Complex specialized tasks, Q&A, sentiment, multilingual RAG
Cohere
Command R 08-2024
cohere.command-r-08-2024 ▷ Standard 35B 128K ✓ T-Few / Vanilla ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH AP-OSA RAG pipelines, info retrieval, cost-efficient enterprise chat
Chat Models — Google Gemini Family
Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Google
Gemini 2.5 Pro
google.gemini-2.5-pro ★ Flagship 1M ✓ Text, Image, Code, Audio, Video ✓ Advanced Reasoning ● GA US-ASH US-CHI US-PHX EU-FRA AP-OSA Most complex multimodal problems, large dataset analysis, SOTA reasoning tasks
Google
Gemini 2.5 Flash
google.gemini-2.5-flash ◉ Balanced 1M ✓ Text, Image, Code, Audio, Video ✓ Thinking features ● GA US-ASH US-CHI US-PHX EU-FRA AP-HYD AP-OSA Balanced workloads needing speed + intelligence, complex applications
Google
Gemini 2.5 Flash-Lite
google.gemini-2.5-flash-lite ○ Budget / Fast 1M ✓ Text, Image, Code, Audio, Video ● GA US-ASH US-CHI US-PHX EU-FRA High-volume, simpler tasks; cost-sensitive production workloads
Chat Models — Meta Llama Family
Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Meta
Llama 4 Maverick
meta.llama-4-maverick-17b-128e-instruct-fp8 ★ Flagship MoE ~400B 512K ✓ Text + Image ● GA (2025) US-CHI SA-SAO UK-LON ME-RUH AP-HYD AP-OSA Multimodal understanding, multilingual, coding, agentic systems, large-scale inference
Meta
Llama 4 Scout
meta.llama-4-scout-17b-16e-instruct ◉ Efficient MoE ~109B 192K ✓ Text + Image ● GA (2025) US-CHI SA-SAO UK-LON ME-RUH AP-HYD AP-OSA Smaller GPU deployments, efficient multimodal, multilingual, coding
Meta
Llama 3.3 70B
meta.llama-3.3-70b-instruct ◉ Best Text 70B 128K ✓ LoRA ● GA US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Best text-only 70B tasks; outperforms 3.1 70B and 3.2 90B on text benchmarks
Meta
Llama 3.2 90B Vision
meta.llama-3.2-90b-vision-instruct ◉ Vision Flagship 90B 128K ✓ Text + Image ● Active US-CHI SA-SAO UK-LON ME-RUH AP-OSA Multimodal understanding with large model capacity, image reasoning
Meta
Llama 3.2 11B Vision
meta.llama-3.2-11b-vision-instruct ▷ Compact Vision 11B 128K ✓ Text + Image ● Active (Dedicated only) US-CHI SA-SAO UK-LON AP-OSA Cost-efficient multimodal; resource-constrained deployments
Meta
Llama 3.1 405B
meta.llama-3.1-405b-instruct ★ Largest Open 405B 128K ● Active US-CHI SA-SAO EU-FRA UK-LON AP-OSA Highest text quality open model; complex reasoning, advanced generation
Chat Models — OpenAI gpt-oss Family
Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
OpenAI
gpt-oss-120b
openai.gpt-oss-120b ★ Flagship OSS 120B 128K ✓ Advanced Reasoning + Agentic ✓ Advanced Tool Use ● GA US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Reasoning, agentic tasks; outperforms similar-size open models; OpenAI-compatible API
OpenAI
gpt-oss-20b
openai.gpt-oss-20b ▷ Efficient OSS 20B 128K ✓ Reasoning + Agentic ● GA US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Efficient consumer-hardware-optimized reasoning; agentic tasks at lower cost
Chat Models — xAI Grok Family
Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
xAI
Grok 4
xai.grok-4 ★ Flagship 128K ✓ Text + Image ✓ Advanced ● GA US-ASH US-CHI US-PHX Advanced multimodal reasoning, enterprise data extraction, coding, summarization
xAI
Grok 4 Fast
xai.grok-4-fast-reasoning
xai.grok-4-fast-non-reasoning
▷ Fast Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning modes ● GA US-ASH US-CHI US-PHX Same capability as Grok 4 with 2M context; cost-speed-optimized production
xAI
Grok 4.1 Fast
xai.grok-4-1-fast-reasoning
xai.grok-4-1-fast-non-reasoning
★ Agentic Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning modes ✓ Parallel Tool Calling ● GA US-ASH US-CHI US-PHX Complex agentic systems, customer support, research with 2M multimodal context
xAI
Grok 4.20
xai.grok-4.20-reasoning
xai.grok-4.20-non-reasoning
xai.grok-4.20-0309-reasoning
xai.grok-4.20-0309-non-reasoning
★ Latest Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning Variants ✓ Advanced Agentic ● GA (Mar 2026) US-ASH US-CHI US-PHX Latest-gen multimodal agentic reasoning with 2M context and dual reasoning modes
xAI
Grok 4.20 Multi-Agent
xai.grok-4.20-multi-agent
xai.grok-4.20-multi-agent-0309
◆ Multi-Agent Research 2M ✓ Text + Image ✓ Orchestrated multi-agent reasoning ✓ Multi-Agent Orchestration ● GA (Mar 2026) US-ASH US-CHI US-PHX Real-time multi-agent research — parallel web search, data analysis & synthesis by specialized sub-agents
xAI
Grok 3
xai.grok-3 ◉ Standard 131K ● GA US-ASH US-CHI US-PHX General enterprise tasks, data extraction, text summarization
xAI
Grok 3 Fast
xai.grok-3-fast ▷ Standard Fast 131K ● GA US-ASH US-CHI US-PHX High-throughput enterprise tasks at Grok 3 quality level
xAI
Grok 3 Mini
xai.grok-3-mini ○ Lightweight Thinker 131K ✓ Traces exposed ● GA US-ASH US-CHI US-PHX Logic-based tasks not requiring deep domain knowledge; transparent thinking traces
xAI
Grok 3 Mini Fast
xai.grok-3-mini-fast ○ Lightweight Fast 131K ✓ Traces exposed ● GA US-ASH US-CHI US-PHX Low-latency logic tasks at minimum cost
xAI
Grok Code Fast 1
xai.grok-code-fast-1 ◆ Coding Specialist 256K ✓ Summarized traces ✓ Agentic Coding ● GA (Aug 2025) US-ASH US-CHI US-PHX TypeScript, Python, Java, Rust, C++, Go; zero-to-one projects, bug fixes, agentic coding loops
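Every hosted model above is addressed by a provider-qualified ID of the form `provider.model-name` (e.g. `cohere.command-a-reasoning`, `meta.llama-3.3-70b-instruct`). A minimal sketch of splitting such IDs — `parse_model_id` is a hypothetical helper for illustration, not part of any OCI SDK:

```python
def parse_model_id(model_id: str) -> dict:
    """Split an OCI GenAI model ID into its provider prefix and model name.

    Only the first dot separates provider from name, so IDs with dotted
    versions (e.g. "xai.grok-4.20-reasoning") keep the dots in the name.
    """
    provider, _, name = model_id.partition(".")
    if not name:
        raise ValueError(f"not a provider-qualified model ID: {model_id!r}")
    return {"provider": provider, "name": name}

print(parse_model_id("cohere.command-a-reasoning"))
# → {'provider': 'cohere', 'name': 'command-a-reasoning'}
```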
Embedding Models
Model Name Model ID Generation Multimodal (Image) Language Scope Size Variant Use Case Status Regions
Cohere
Embed 4
cohere.embed-v4.0 Gen 4 · Latest ✓ Text + Image (base64) Multilingual Full Latest multimodal embeddings; text & image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA
Cohere
Embed English Image 3
cohere.embed-english-image-v3.0 Gen 3 ✓ Text + Image English Full English-only text+image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed English Light Image 3
cohere.embed-english-light-image-v3.0 Gen 3 ✓ Text + Image English Light Cost-efficient English text+image embeddings ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed Multilingual Image 3
cohere.embed-multilingual-image-v3.0 Gen 3 ✓ Text + Image Multilingual Full Global multilingual text+image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-HYD AP-OSA
Cohere
Embed Multilingual Light Image 3
cohere.embed-multilingual-light-image-v3.0 Gen 3 ✓ Text + Image Multilingual Light Budget multilingual text+image embeddings ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed English 3
cohere.embed-english-v3.0 Gen 3 ✗ Text only English Full Pure text English semantic search, classification, clustering ● Active US-CHI SA-SAO EU-FRA UK-LON
Cohere
Embed English Light 3
cohere.embed-english-light-v3.0 Gen 3 ✗ Text only English Light Cost-efficient English text embeddings at scale ● Active US-CHI SA-SAO
Cohere
Embed Multilingual 3
cohere.embed-multilingual-v3.0 Gen 3 ✗ Text only Multilingual Full Global enterprise text semantic search in 100+ languages ● Active US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON AP-HYD AP-OSA
Cohere
Embed Multilingual Light 3
cohere.embed-multilingual-light-v3.0 Gen 3 ✗ Text only Multilingual Light Affordable multilingual text embeddings at volume ● Active US-CHI SA-SAO
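The embedding models above all serve the same downstream pattern: embed a corpus and a query, then rank by vector similarity. The sketch below shows that pattern with made-up 4-dimensional placeholder vectors; a real call to a model such as cohere.embed-v4.0 returns much higher-dimensional vectors per input text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Placeholder "embeddings" standing in for real model output.
corpus = {
    "refund policy": [0.9, 0.1, 0.0, 0.2],
    "shipping times": [0.1, 0.8, 0.3, 0.0],
    "password reset": [0.0, 0.2, 0.9, 0.1],
}
query_vec = [0.85, 0.15, 0.05, 0.1]  # pretend embedding of "how do I get my money back"

ranked = sorted(corpus, key=lambda doc: cosine_similarity(query_vec, corpus[doc]), reverse=True)
print(ranked[0])  # → refund policy
```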
Rerank Model
Model Name Model ID Input Output Use Case Status Regions
Cohere
Rerank 3.5
cohere.rerank.v3-5 Query + List of texts Ordered array with relevance scores RAG pipelines, document ranking, search result reordering, precision improvement ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH AP-OSA
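The rerank contract described above — a query plus a list of texts in, an array ordered by relevance score out — can be illustrated locally. The keyword-overlap scorer here is a toy stand-in for Rerank 3.5's learned relevance function, and `rerank` is a hypothetical helper, not the SDK call; only the input/output shape is meant to match.

```python
def rerank(query: str, documents: list[str]) -> list[dict]:
    """Return documents ordered by a toy relevance score (fraction of query terms matched)."""
    q_terms = set(query.lower().split())
    scored = [
        {"index": i, "document": doc,
         "relevance_score": len(q_terms & set(doc.lower().split())) / len(q_terms)}
        for i, doc in enumerate(documents)
    ]
    return sorted(scored, key=lambda r: r["relevance_score"], reverse=True)

docs = [
    "reset your password via email",
    "refund requests take 5 days",
    "password rules and reset steps",
]
top = rerank("password reset steps", docs)[0]
print(top["index"], top["relevance_score"])  # → 2 1.0
```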
Imported Models

Available via OCI Generative AI Model Import — import open-weights models from HuggingFace into your own dedicated GPU cluster endpoint. 8 provider families · 79 models · supports fine-tuned variants within ±10% parameter count.
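The ±10% rule quoted above amounts to a one-line check. `within_import_limit` is a hypothetical helper shown only to make the arithmetic concrete; OCI performs its own validation at import time.

```python
def within_import_limit(base_params: float, variant_params: float, tolerance: float = 0.10) -> bool:
    """True if a fine-tuned variant's parameter count is within ±tolerance of the base model's."""
    return abs(variant_params - base_params) <= tolerance * base_params

print(within_import_limit(32e9, 34e9))  # 34B variant of a 32B base: 6.25% over → True
print(within_import_limit(32e9, 36e9))  # 36B variant: 12.5% over → False
```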

Alibaba Qwen Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
Alibaba
QwQ-32B
Qwen/QwQ-32B Reasoning 32B 128K A100_80G_X2
Alibaba
Qwen Image
Qwen/Qwen-Image Image Gen A100_80G_X1
Alibaba
Qwen Image Edit
Qwen/Qwen-Image-Edit Image Gen A100_80G_X1
Alibaba
Qwen Image 2512
Qwen/Qwen-Image-2512 Image Gen A100_80G_X1
Alibaba
Qwen Image Edit 2511
Qwen/Qwen-Image-Edit-2511 Image Gen A100_80G_X1
Alibaba
Qwen Image Edit 2509
Qwen/Qwen-Image-Edit-2509 Image Gen A100_80G_X1
Alibaba
Qwen3-Embedding-0.6B
Qwen/Qwen3-Embedding-0.6B Embed 0.6B 32K A10_X1
Alibaba
Qwen3-Embedding-4B
Qwen/Qwen3-Embedding-4B Embed 4B 32K A10_X2
Alibaba
Qwen3-Embedding-8B
Qwen/Qwen3-Embedding-8B Embed 8B 32K A100_80G_X1
Alibaba
Qwen3-0.6B
Qwen/Qwen3-0.6B Chat 0.6B 32K A100_80G_X1
Alibaba
Qwen3-1.7B
Qwen/Qwen3-1.7B Chat 1.7B 32K A100_80G_X1
Alibaba
Qwen3-4B
Qwen/Qwen3-4B Chat 4B 32K A100_80G_X1
Alibaba
Qwen3-8B
Qwen/Qwen3-8B Chat 8B 32K A100_80G_X1
Alibaba
Qwen3-14B
Qwen/Qwen3-14B Chat 14B 32K A100_80G_X1
Alibaba
Qwen3-32B
Qwen/Qwen3-32B Chat 32B 32K A100_80G_X2
Alibaba
Qwen3-4B-Instruct-2507
Qwen/Qwen3-4B-Instruct-2507 Chat 4B 32K A100_80G_X1
Alibaba
Qwen3-30B-A3B-Instruct-2507
Qwen/Qwen3-30B-A3B-Instruct-2507 Chat 30B (3B active) 32K A100_80G_X2
Alibaba
Qwen3-235B-A22B-Instruct-2507
Qwen/Qwen3-235B-A22B-Instruct-2507 Chat 235B (22B active) 32K H100_X8
Alibaba
Qwen3-VL-30B-A3B-Instruct
Qwen/Qwen3-VL-30B-A3B-Instruct Vision 30B (3B active) H100_X2
Alibaba
Qwen3-VL-235B-A22B-Instruct
Qwen/Qwen3-VL-235B-A22B-Instruct Vision 235B (22B active) H100_X8
Alibaba
Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct Coder 32B 128K A100_80G_X2
Alibaba
Qwen2.5-0.5B-Instruct
Qwen/Qwen2.5-0.5B-Instruct Chat 0.5B 128K A100_80G_X1
Alibaba
Qwen2.5-1.5B-Instruct
Qwen/Qwen2.5-1.5B-Instruct Chat 1.5B 128K A100_80G_X1
Alibaba
Qwen2.5-3B-Instruct
Qwen/Qwen2.5-3B-Instruct Chat 3B 128K A100_80G_X1
Alibaba
Qwen2.5-7B-Instruct
Qwen/Qwen2.5-7B-Instruct Chat 7B 128K A100_80G_X1
Alibaba
Qwen2.5-14B-Instruct
Qwen/Qwen2.5-14B-Instruct Chat 14B 128K A100_80G_X1
Alibaba
Qwen2.5-32B-Instruct
Qwen/Qwen2.5-32B-Instruct Chat 32B 128K A100_80G_X2
Alibaba
Qwen2.5-72B-Instruct
Qwen/Qwen2.5-72B-Instruct Chat 72B 128K A100_80G_X4
Alibaba
Qwen2.5-VL-3B-Instruct
Qwen/Qwen2.5-VL-3B-Instruct Vision 3B A100_80G_X1
Alibaba
Qwen2.5-VL-7B-Instruct
Qwen/Qwen2.5-VL-7B-Instruct Vision 7B A100_80G_X1
Alibaba
Qwen2.5-VL-32B-Instruct
Qwen/Qwen2.5-VL-32B-Instruct Vision 32B A100_80G_X2
Alibaba
Qwen2.5-VL-72B-Instruct
Qwen/Qwen2.5-VL-72B-Instruct Vision 72B A100_80G_X4
Alibaba
Qwen2-0.5B-Instruct
Qwen/Qwen2-0.5B-Instruct Chat 0.5B 32K A100_80G_X1
Alibaba
Qwen2-1.5B-Instruct
Qwen/Qwen2-1.5B-Instruct Chat 1.5B 32K A100_80G_X1
Alibaba
Qwen2-7B-Instruct
Qwen/Qwen2-7B-Instruct Chat 7B 128K A100_80G_X1
Alibaba
Qwen2-72B-Instruct
Qwen/Qwen2-72B-Instruct Chat 72B 128K A100_80G_X4
Alibaba
Qwen2-VL-2B-Instruct
Qwen/Qwen2-VL-2B-Instruct Vision 2B A100_80G_X1
Alibaba
Qwen2-VL-7B-Instruct
Qwen/Qwen2-VL-7B-Instruct Vision 7B A100_80G_X1
Alibaba
Qwen2-VL-72B-Instruct
Qwen/Qwen2-VL-72B-Instruct Vision 72B A100_80G_X4
DeepSeek Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
DeepSeek
DeepSeek-R1-Distill-Qwen-32B
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Reasoning 32B 128K A100_80G_X2
Google Gemma Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
Gemma
Gemma 3 270M
google/gemma-3-270m-it Chat 270M 128K A100_80G_X1
Gemma
Gemma 3 1B
google/gemma-3-1b-it Chat 1B 128K A100_80G_X1
Gemma
Gemma 3 4B
google/gemma-3-4b-it Vision 4B 128K A100_80G_X1
Gemma
Gemma 3 12B
google/gemma-3-12b-it Vision 12B 128K A100_80G_X1
Gemma
Gemma 3 27B
google/gemma-3-27b-it Vision 27B 128K A100_80G_X2
Gemma
Gemma 2 2B
google/gemma-2-2b-it Chat 2B 8K A100_80G_X1
Gemma
Gemma 2 9B
google/gemma-2-9b-it Chat 9B 8K A100_80G_X1
Gemma
Gemma 2 27B
google/gemma-2-27b-it Chat 27B 8K A100_80G_X2
Meta Llama Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
Meta
Llama 4 Maverick 17B
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 Vision 17B×128E 1M H100_X8
Meta
Llama 4 Scout 17B
meta-llama/Llama-4-Scout-17B-16E-Instruct Vision 17B×16E 1M H100_X4
Meta
Llama 3.3 70B Instruct
meta-llama/Llama-3.3-70B-Instruct Chat 70B 128K A100_80G_X4
Meta
Llama 3.2 3B Instruct
meta-llama/Llama-3.2-3B-Instruct Chat 3B 128K A100_80G_X1
Meta
Llama 3.2 1B Instruct
meta-llama/Llama-3.2-1B-Instruct Chat 1B 128K A100_80G_X1
Meta
Llama 3.1 8B Instruct
meta-llama/Llama-3.1-8B-Instruct Chat 8B 128K A100_80G_X1
Meta
Llama 3 8B Instruct
meta-llama/Meta-Llama-3-8B-Instruct Chat 8B 8K A100_80G_X1
Meta
Llama 3 70B Instruct
meta-llama/Meta-Llama-3-70B-Instruct Chat 70B 8K A100_80G_X4
Meta
Llama 2 70B Chat
meta-llama/Llama-2-70b-chat-hf Chat 70B 4K A100_80G_X4
Meta
Llama 2 13B Chat
meta-llama/Llama-2-13b-chat-hf Chat 13B 4K A100_80G_X1
Meta
Llama 2 7B Chat
meta-llama/Llama-2-7b-chat-hf Chat 7B 4K A100_80G_X1
Microsoft Phi Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
Microsoft
Phi-4
microsoft/phi-4 Chat 14B 16K A100_80G_X1
Microsoft
Phi-3 Vision 128K
microsoft/Phi-3-vision-128k-instruct Vision 4.2B 128K H100_X1
Microsoft
Phi-3 Medium 128K
microsoft/Phi-3-medium-128k-instruct Chat 14B 128K A100_80G_X1
Microsoft
Phi-3 Medium 4K
microsoft/Phi-3-medium-4k-instruct Chat 14B 4K A100_80G_X1
Microsoft
Phi-3 Small 128K
microsoft/Phi-3-small-128k-instruct Chat 7B 128K A100_80G_X1
Microsoft
Phi-3 Small 8K
microsoft/Phi-3-small-8k-instruct Chat 7B 8K A100_80G_X1
Microsoft
Phi-3 Mini 128K
microsoft/Phi-3-mini-128k-instruct Chat 3.8B 128K A100_80G_X1
Microsoft
Phi-3 Mini 4K
microsoft/Phi-3-mini-4k-instruct Chat 3.8B 4K A100_80G_X1
Mistral Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
Mistral
Mixtral 8x7B Instruct v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1 Chat 8×7B MoE 32K A100_80G_X2
Mistral
Mistral Nemo Instruct 2407
mistralai/Mistral-Nemo-Instruct-2407 Chat 12B 128K A100_80G_X1
Mistral
Mistral 7B Instruct v0.3
mistralai/Mistral-7B-Instruct-v0.3 Chat 7B 32K A100_80G_X1
Mistral
Mistral 7B Instruct v0.2
mistralai/Mistral-7B-Instruct-v0.2 Chat 7B 32K A100_80G_X1
Mistral
Mistral 7B Instruct v0.1
mistralai/Mistral-7B-Instruct-v0.1 Chat 7B 8K A100_80G_X1
Mistral
E5 Mistral 7B Instruct
intfloat/e5-mistral-7b-instruct Embed 7B 32K A10_X1
NVIDIA Nemotron Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
NVIDIA
Nemotron 3 Super 120B
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Chat 120B (12B active) 1M H100_X8
NVIDIA
Nemotron 3 Nano 30B (FP8)
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 Chat 30B (3B active) 1M H100_X4
NVIDIA
Nemotron 3 Nano 30B (BF16)
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Chat 30B (3B active) 1M A100_80G_X1
NVIDIA
Llama 3.1 Nemotron 70B Instruct
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Chat 70B 128K A100_80G_X4
OpenAI GptOss Family
Model Name · HuggingFace Model ID · Type · Params · Context · Cluster Shape
OpenAI
GptOss 20B
openai/gpt-oss-20b Chat 20B 128K H100_X1
OpenAI
GptOss 120B
openai/gpt-oss-120b Chat 120B 128K H100_X2
Use-Case Selection Guide

🔍 RAG / Document Search

🥇
Cohere Command A Reasoning
256K context, built for RAG, advanced reasoning over documents
Dedicated only
🥈
Cohere Command R 08-2024
RAG-optimized, 128K context, fine-tunable, cost-efficient
On-demand + Ded.
🥉
Google Gemini 2.5 Pro
1M context — handle entire documents in one pass
On-demand (ext.)

🤖 Agentic / Tool-Use Workflows

🥇
xAI Grok 4.20 Multi-Agent
Real-time multi-agent research: parallel specialist orchestration for web search, analysis & synthesis
On-demand (ext.)
🥈
xAI Grok 4.1 Fast
2M context, parallel tool calls, vendor-reported 3× reduction in hallucinations
On-demand (ext.)
🥉
Cohere Command A
256K context, best throughput for Cohere agentic tasks
On-demand + Ded.

💻 Code Generation

🥇
xAI Grok Code Fast 1
Specialized coding model — plan, write, test, debug loop
On-demand (ext.)
🥈
Meta Llama 4 Maverick
MoE, strong coding + tool-calling capabilities
On-demand + Ded.
🥉
Google Gemini 2.5 Pro
Top-tier code reasoning, debugging, complex architecture
On-demand (ext.)

🌍 Multimodal (Text + Image)

🥇
Google Gemini 2.5 Pro
Best multimodal — text, image, code, audio, video
On-demand (ext.)
🥈
Cohere Command A Vision
Enterprise-focused image, chart, document understanding
On-demand + Ded.
🥉
Meta Llama 4 Maverick
Open-weight multimodal with MoE efficiency
On-demand + Ded.

⚡ High-Volume / Low-Latency

🥇
Google Gemini 2.5 Flash-Lite
Fastest + cheapest in Gemini family; 1M context
On-demand (ext.)
🥈
xAI Grok 3 Mini Fast
Lightweight thinker, lowest latency xAI model
On-demand (ext.)
🥉
OpenAI gpt-oss-20b
Consumer-grade hardware optimized, fast reasoning
On-demand + Ded.

🏢 Enterprise Fine-Tuning

🥇
Cohere Command R 08-2024
T-Few + Vanilla fine-tuning on dedicated AI clusters
Dedicated only
🥈
Meta Llama 3.3 70B
LoRA fine-tuning, best 70B text performance
On-demand + Ded.

🌐 Multilingual Applications

🥇
Cohere Command A
Native multilingual support, 256K context, high-throughput production
On-demand + Ded.
🥈
Meta Llama 4 Scout / Maverick
Open-weight multilingual models with strong cross-language performance
On-demand + Ded.
🔤
Cohere Embed Multilingual 3 / Embed 4
Semantic search in 100+ languages; Gen 4 adds multimodal
On-demand + Ded.

📄 Long-context Document Analysis

🥇
xAI Grok 4.1 Fast
2M tokens — process full codebases or entire document archives in one pass
On-demand (ext.)
🥈
Google Gemini 2.5 Pro
1M context with multimodal; ideal for large PDFs, reports, mixed media
On-demand (ext.)
🥉
xAI Grok 4 Fast
2M context at optimized cost; great for batch document workloads
On-demand (ext.)

🇪🇺 EU Data Sovereignty (Frankfurt · London)

On-demand + Dedicated — EU-FRA & UK-LON
🥇
Cohere Command A
256K context, multilingual, RAG + agentic — widest EU access
EU-FRA + UK-LON
🥈
Meta Llama 3.3 70B
Open-weight, LoRA fine-tunable, strong text performance
EU-FRA + UK-LON
🥉
OpenAI gpt-oss-120b / 20b
OpenAI-compatible API, advanced reasoning — Frankfurt on-demand + dedicated
EU-FRA + UK-LON
Dedicated Only — EU-FRA & UK-LON
🔒
Cohere Command A Reasoning
Advanced reasoning over documents — tenancy-exclusive GPUs in both EU regions
Dedicated · EU-FRA + UK-LON
🔒
Meta Llama 4 Maverick / Scout
Latest Llama 4 multimodal models — UK-LON dedicated only
Dedicated · UK-LON
🔤
Cohere Embed 3 / 4 + Rerank 3.5
Semantic search & reranking with EU data residency
Dedicated · EU-FRA + UK-LON
On-demand only (ext.) — EU-FRA
🌐
Google Gemini 2.5 Pro / Flash / Flash-Lite
External API routed through EU-FRA endpoint — data does not reside on OCI hardware
On-demand (ext.) · EU-FRA
⚠️
xAI Grok — No EU Presence
All xAI Grok models are US-only (Ashburn, Chicago, Phoenix) — not suitable for EU data residency requirements
Not available in EU

🏗️ Dedicated AI Clusters

🔒
Required for Fine-Tuning
Fine-tuning jobs run exclusively on dedicated GPU clusters — Cohere T-Few/Vanilla and Meta LoRA cannot run on-demand
Dedicated only · Cohere Command R · Llama 3.3 70B
🏢
Data Residency & Compliance
Tenancy-exclusive GPUs; your data never shares hardware — suited for regulated industries (GDPR, HIPAA, financial)
Dedicated · All Cohere · All Meta · OpenAI gpt-oss
⚠️
Not Available — Google & xAI
Google Gemini and xAI Grok route through external APIs — dedicated clusters are not supported for these providers
On-demand (ext.) only

⚡ On-demand (No Cluster Needed)

🚀
Google Gemini & xAI Grok — Always On-demand
External API call; no cluster provisioning, instant availability, pay-per-token billing
On-demand (ext.)
🌍
Cohere & OpenAI gpt-oss — On-demand in Select Regions
On-demand access available without a dedicated cluster — ideal for PoC and variable workloads
On-demand + Ded.
📝
Fine-Tuning Requires Dedicated
On-demand mode supports inference only — to fine-tune a model you must provision a dedicated AI cluster first
Dedicated required
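The inferencing-mode rules above (Google Gemini and xAI Grok served only via external on-demand APIs; fine-tuning only on dedicated clusters) can be condensed into a small routing helper. This is an illustrative summary of the catalog's statements, not an OCI API — all names are made up for the sketch.

```python
# Providers this catalog lists as routed through external APIs (no dedicated clusters).
EXTERNAL_ONLY = {"google", "xai"}

def serving_mode(provider: str, fine_tune: bool = False) -> str:
    """Pick a serving mode from the catalog's rules; raise if the combination is unsupported."""
    provider = provider.lower()
    if provider in EXTERNAL_ONLY:
        if fine_tune:
            raise ValueError(f"{provider}: dedicated clusters (and thus fine-tuning) are not supported")
        return "on-demand (external)"
    # Cohere, Meta, and OpenAI gpt-oss: on-demand inference, dedicated for fine-tuning.
    return "dedicated" if fine_tune else "on-demand or dedicated"

print(serving_mode("xai"))                     # → on-demand (external)
print(serving_mode("cohere", fine_tune=True))  # → dedicated
```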
LEGEND
★ Flagship / Best-in-class
◉ Balanced / Advanced
▷ Speed / Efficiency tier
○ Lightweight / Budget
◆ Specialized
2M/1M Context ≥ 1M tokens
256K Context = 192K–512K tokens
128K Context = 128K tokens
✓ Feature supported
✗ Not supported
MoE Mixture of Experts (sparse activation)
US-CHI Region: on-demand + dedicated
EU-FRA Region: dedicated AI clusters only
US-ASH Region: on-demand / external call only

¹ Parameter counts are shown only when officially disclosed by the provider. Proprietary models (Google Gemini, xAI Grok) do not publish parameter counts and are omitted. "—" means not publicly disclosed.

² Fine-tuning on OCI uses dedicated AI clusters (GPU resources belonging exclusively to your tenancy). Cohere supports T-Few & Vanilla strategies; Meta Llama supports LoRA.

³ Retired/deprecated models (Command R 16K, Command R+, Llama 3 70B, Llama 3.1 70B) are omitted from the main tables.

⁴ Model Import feature (GA 2025) lets you bring your own LLMs from Hugging Face or OCI Object Storage.

⁵ OCI documents Grok 4 at 128K context and Grok 4 Fast, Grok 4.1 Fast, Grok 4.20, and Grok 4.20 Multi-Agent at 2M context.

⁶ Data sources (OCI Official Documentation; catalog last updated 17 April 2026; OC1 commercial regions only): Pretrained Models · Models by Region · Inferencing Modes · Model Import

AI-generated content — not an official Oracle document. This page was assembled with AI assistance from OCI public documentation. Data may contain errors or be out of date. Always verify against docs.oracle.com before making production decisions.
GitHub Repository: models.json · imported-models.json · @enricopesce