OCI · GENERATIVE AI

OCI GenAI Catalog

Source: OCI Official Documentation  |  Updated 17 May 2026  |  30+ models

5 Model Providers
27 Chat Models (Active)
9 Embedding Models
3 Rerank Model
82 Imported Models

Chat Models — Cohere Family

Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Cohere
Command A Reasoning
cohere.command-a-reasoning ★ Flagship Reasoning 111B 256K ✓ Advanced ● GA (Aug 2025) US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Complex Q&A, multi-step reasoning, document analysis, structured arguments
Cohere
Command A Vision
cohere.command-a-vision ★ Flagship Multimodal 112B 128K ✓ Images, Charts, Docs ● GA (Jul 2025) US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Enterprise document understanding with charts & images
Cohere
Command A
cohere.command-a-03-2025 ◉ Flagship Chat 111B 256K ✓ Advanced ● GA (Mar 2025) US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Agentic enterprise tasks, RAG, multilingual, high-throughput production
Cohere
Command R+ 08-2024
cohere.command-r-plus-08-2024 ◉ Advanced 104B 128K ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-OSA Complex specialized tasks, Q&A, sentiment, multilingual RAG
Cohere
Command R 08-2024
cohere.command-r-08-2024 ▷ Standard 35B 128K ✓ T-Few / Vanilla ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH AP-OSA RAG pipelines, info retrieval, cost-efficient enterprise chat

Chat Models — Google Gemini Family

Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Google
Gemini 2.5 Pro
google.gemini-2.5-pro ★ Flagship 1M ✓ Text, Image, Code, Audio, Video ✓ Advanced Reasoning ● GA US-ASH US-CHI US-PHX EU-FRA AP-OSA Most complex multimodal problems, large dataset analysis, SOTA reasoning tasks
Google
Gemini 2.5 Flash
google.gemini-2.5-flash ◉ Balanced 1M ✓ Text, Image, Code, Audio, Video ✓ Thinking features ● GA US-ASH US-CHI US-PHX EU-FRA AP-HYD AP-OSA Balanced workloads needing speed + intelligence, complex applications
Google
Gemini 2.5 Flash-Lite
google.gemini-2.5-flash-lite ○ Budget / Fast 1M ✓ Text, Image, Code, Audio, Video ● GA US-ASH US-CHI US-PHX EU-FRA High-volume, simpler tasks; cost-sensitive production workloads

Chat Models — Meta Llama Family

Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
Meta
Llama 4 Maverick
meta.llama-4-maverick-17b-128e-instruct-fp8 ★ Flagship MoE ~400B 512K ✓ Text + Image ● GA (2025) US-CHI SA-SAO UK-LON ME-RUH AP-HYD AP-OSA Multimodal understanding, multilingual, coding, agentic systems, large-scale inference
Meta
Llama 4 Scout
meta.llama-4-scout-17b-16e-instruct ◉ Efficient MoE ~109B 192K ✓ Text + Image ● GA (2025) US-CHI SA-SAO UK-LON ME-RUH AP-HYD AP-OSA Smaller GPU deployments, efficient multimodal, multilingual, coding
Meta
Llama 3.3 70B
meta.llama-3.3-70b-instruct ◉ Best Text 70B 70B 128K ✓ LoRA ● GA US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Best text-only 70B tasks; outperforms 3.1 70B and 3.2 90B on text benchmarks
Meta
Llama 3.2 90B Vision
meta.llama-3.2-90b-vision-instruct ◉ Vision Flagship 90B 128K ✓ Text + Image ● Active US-CHI SA-SAO UK-LON ME-RUH AP-OSA Multimodal understanding with large model capacity, image reasoning
Meta
Llama 3.2 11B Vision
meta.llama-3.2-11b-vision-instruct ▷ Compact Vision 11B 128K ✓ Text + Image ● Active (Dedicated only) US-CHI SA-SAO UK-LON AP-OSA Cost-efficient multimodal; resource-constrained deployments
Meta
Llama 3.1 405B
meta.llama-3.1-405b-instruct ★ Largest Open 405B 128K ● Active US-CHI SA-SAO EU-FRA UK-LON AP-OSA Highest text quality open model; complex reasoning, advanced generation

Chat Models — OpenAI gpt-oss Family

Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
OpenAI
gpt-oss-120b
openai.gpt-oss-120b ★ Flagship OSS 120B 128K ✓ Advanced Reasoning + Agentic ✓ Advanced Tool Use ● GA US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Reasoning, agentic tasks; outperforms similar-size open models; OpenAI-compatible API
OpenAI
gpt-oss-20b
openai.gpt-oss-20b ▷ Efficient OSS 20B 128K ✓ Reasoning + Agentic ● GA US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA Efficient consumer-hardware-optimized reasoning; agentic tasks at lower cost

Chat Models — xAI Grok Family

Model Model ID Tier Parameters Context Multimodal Reasoning Tool Use Fine-tunable Status Regions Best For
xAI
Grok 4
xai.grok-4 ★ Flagship 128K ✓ Text + Image ✓ Advanced ● GA US-ASH US-CHI US-PHX Advanced multimodal reasoning, enterprise data extraction, coding, summarization
xAI
Grok 4 Fast
xai.grok-4-fast-reasoning
xai.grok-4-fast-non-reasoning
▷ Fast Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning modes ● GA US-ASH US-CHI US-PHX Same capability as Grok 4 with 2M context; cost-speed-optimized production
xAI
Grok 4.1 Fast
xai.grok-4-1-fast-reasoning
xai.grok-4-1-fast-non-reasoning
★ Agentic Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning modes ✓ Parallel Tool Calling ● GA US-ASH US-CHI US-PHX Complex agentic systems, customer support, research with 2M multimodal context
xAI
Grok 4.20
xai.grok-4.20-reasoning
xai.grok-4.20-non-reasoning
xai.grok-4.20-0309-reasoning
xai.grok-4.20-0309-non-reasoning
★ Latest Flagship 2M ✓ Text + Image ✓ Reasoning + Non-Reasoning Variants ✓ Advanced Agentic ● GA (Mar 2026) US-ASH US-CHI US-PHX Latest-gen multimodal agentic reasoning with 2M context and dual reasoning modes
xAI
Grok 4.20 Multi-Agent
xai.grok-4.20-multi-agent
xai.grok-4.20-multi-agent-0309
◆ Multi-Agent Research 2M ✓ Text + Image ✓ Orchestrated multi-agent reasoning ✓ Multi-Agent Orchestration ● GA (Mar 2026) US-ASH US-CHI US-PHX Real-time multi-agent research — parallel web search, data analysis & synthesis by specialized sub-agents
xAI
Grok 3
xai.grok-3 ◉ Standard 131K ● GA US-ASH US-CHI US-PHX General enterprise tasks, data extraction, text summarization
xAI
Grok 3 Fast
xai.grok-3-fast ▷ Standard Fast 131K ● GA US-ASH US-CHI US-PHX High-throughput enterprise tasks at Grok 3 quality level
xAI
Grok 3 Mini
xai.grok-3-mini ○ Lightweight Thinker 131K ✓ Traces exposed ● GA US-ASH US-CHI US-PHX Logic-based tasks not requiring deep domain knowledge; transparent thinking traces
xAI
Grok 3 Mini Fast
xai.grok-3-mini-fast ○ Lightweight Fast 131K ✓ Traces exposed ● GA US-ASH US-CHI US-PHX Low-latency logic tasks at minimum cost
xAI
Grok Code Fast 1
xai.grok-code-fast-1 ◆ Coding Specialist 256K ✓ Summarized traces ✓ Agentic Coding ● GA (Aug 2025) US-ASH US-CHI US-PHX TypeScript, Python, Java, Rust, C++, Go; zero-to-one projects, bug fixes, agentic coding loops

Embedding Models

Model Name Model ID Generation Multimodal (Image) Language Scope Size Variant Use Case Status Regions
Cohere
Embed 4
cohere.embed-v4.0 Gen 4 · Latest ✓ Text + Image (base64) Multilingual Full Latest multimodal embeddings; text & image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH ME-DXB AP-HYD AP-OSA
Cohere
Embed English Image 3
cohere.embed-english-image-v3.0 Gen 3 ✓ Text + Image English Full English-only text+image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed English Light Image 3
cohere.embed-english-light-image-v3.0 Gen 3 ✓ Text + Image English Light Cost-efficient English text+image embeddings ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed Multilingual Image 3
cohere.embed-multilingual-image-v3.0 Gen 3 ✓ Text + Image Multilingual Full Global multilingual text+image semantic search ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-HYD AP-OSA
Cohere
Embed Multilingual Light Image 3
cohere.embed-multilingual-light-image-v3.0 Gen 3 ✓ Text + Image Multilingual Light Budget multilingual text+image embeddings ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-DXB AP-OSA
Cohere
Embed English 3
cohere.embed-english-v3.0 Gen 3 ✗ Text only English Full Pure text English semantic search, classification, clustering ● Active US-CHI SA-SAO EU-FRA UK-LON
Cohere
Embed English Light 3
cohere.embed-english-light-v3.0 Gen 3 ✗ Text only English Light Cost-efficient English text embeddings at scale ● Active US-CHI SA-SAO
Cohere
Embed Multilingual 3
cohere.embed-multilingual-v3.0 Gen 3 ✗ Text only Multilingual Full Global enterprise text semantic search in 100+ languages ● Active US-ASH US-CHI US-PHX SA-SAO EU-FRA UK-LON AP-HYD AP-OSA
Cohere
Embed Multilingual Light 3
cohere.embed-multilingual-light-v3.0 Gen 3 ✗ Text only Multilingual Light Affordable multilingual text embeddings at volume ● Active US-CHI SA-SAO

Rerank Model

Model Name Model ID Input Output Use Case Status Regions
Cohere
Rerank 3.5
cohere.rerank.v3-5 Query + List of texts Ordered array with relevance scores RAG pipelines, document ranking, search result reordering, precision improvement ● Active US-ASH US-CHI SA-SAO EU-FRA UK-LON ME-RUH AP-OSA

Imported Models

Available via OCI Generative AI Model Import — import open-weights models from HuggingFace into your own dedicated GPU cluster endpoint. 8 provider families · 82 models · supports fine-tuned variants within ±10% parameter count.
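
The ±10% parameter-count rule above can be illustrated with a small check. This is a minimal sketch only: the function name `within_import_tolerance` is hypothetical, and it assumes the tolerance is measured relative to the listed base model's parameter count and is inclusive at the boundary, neither of which this page confirms.

```python
def within_import_tolerance(base_params: float,
                            variant_params: float,
                            tolerance: float = 0.10) -> bool:
    """Return True if variant_params lies within +/- tolerance of base_params.

    Assumption: the tolerance is relative to the base model's parameter count.
    """
    if base_params <= 0:
        raise ValueError("base_params must be positive")
    return abs(variant_params - base_params) <= tolerance * base_params


# A 33.5B fine-tune of a 32B base differs by about 4.7%, so it passes.
print(within_import_tolerance(32e9, 33.5e9))
# A 36B variant of a 32B base differs by 12.5%, so it fails.
print(within_import_tolerance(32e9, 36e9))
```

Whether OCI counts embedding or adapter parameters toward the total is not stated here, so treat the result as a rough pre-check before attempting an import.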

Alibaba Qwen Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
Alibaba
QwQ-32B
Qwen/QwQ-32B Reasoning 32B 128K A100_80G_X2
Alibaba
Qwen Image
Qwen/Qwen-Image Image Gen A100_80G_X1
Alibaba
Qwen Image Edit
Qwen/Qwen-Image-Edit Image Gen A100_80G_X1
Alibaba
Qwen Image 2512
Qwen/Qwen-Image-2512 Image Gen A100_80G_X1
Alibaba
Qwen Image Edit 2511
Qwen/Qwen-Image-Edit-2511 Image Gen A100_80G_X1
Alibaba
Qwen Image Edit 2509
Qwen/Qwen-Image-Edit-2509 Image Gen A100_80G_X1
Alibaba
Qwen3-Embedding-0.6B
Qwen/Qwen3-Embedding-0.6B Embed 0.6B 32K A10_X1
Alibaba
Qwen3-Embedding-4B
Qwen/Qwen3-Embedding-4B Embed 4B 32K A10_X2
Alibaba
Qwen3-Embedding-8B
Qwen/Qwen3-Embedding-8B Embed 8B 32K A100_80G_X1
Alibaba
Qwen3-0.6B
Qwen/Qwen3-0.6B Chat 0.6B 32K A100_80G_X1
Alibaba
Qwen3-1.7B
Qwen/Qwen3-1.7B Chat 1.7B 32K A100_80G_X1
Alibaba
Qwen3-4B
Qwen/Qwen3-4B Chat 4B 32K A100_80G_X1
Alibaba
Qwen3-8B
Qwen/Qwen3-8B Chat 8B 32K A100_80G_X1
Alibaba
Qwen3-14B
Qwen/Qwen3-14B Chat 14B 32K A100_80G_X1
Alibaba
Qwen3-32B
Qwen/Qwen3-32B Chat 32B 32K A100_80G_X2
Alibaba
Qwen3-4B-Instruct-2507
Qwen/Qwen3-4B-Instruct-2507 Chat 4B 32K A100_80G_X1
Alibaba
Qwen3-30B-A3B-Instruct-2507
Qwen/Qwen3-30B-A3B-Instruct-2507 Chat 30B 3B active 32K A100_80G_X2
Alibaba
Qwen3-235B-A22B-Instruct-2507
Qwen/Qwen3-235B-A22B-Instruct-2507 Chat 235B 22B active 32K H100_X8
Alibaba
Qwen3-VL-30B-A3B-Instruct
Qwen/Qwen3-VL-30B-A3B-Instruct Vision 30B 3B active H100_X2
Alibaba
Qwen3-VL-235B-A22B-Instruct
Qwen/Qwen3-VL-235B-A22B-Instruct Vision 235B 22B active H100_X8
Alibaba
Qwen2.5-Coder-32B-Instruct
Qwen/Qwen2.5-Coder-32B-Instruct Coder 32B 128K A100_80G_X2
Alibaba
Qwen2.5-0.5B-Instruct
Qwen/Qwen2.5-0.5B-Instruct Chat 0.5B 128K A100_80G_X1
Alibaba
Qwen2.5-1.5B-Instruct
Qwen/Qwen2.5-1.5B-Instruct Chat 1.5B 128K A100_80G_X1
Alibaba
Qwen2.5-3B-Instruct
Qwen/Qwen2.5-3B-Instruct Chat 3B 128K A100_80G_X1
Alibaba
Qwen2.5-7B-Instruct
Qwen/Qwen2.5-7B-Instruct Chat 7B 128K A100_80G_X1
Alibaba
Qwen2.5-14B-Instruct
Qwen/Qwen2.5-14B-Instruct Chat 14B 128K A100_80G_X1
Alibaba
Qwen2.5-32B-Instruct
Qwen/Qwen2.5-32B-Instruct Chat 32B 128K A100_80G_X2
Alibaba
Qwen2.5-72B-Instruct
Qwen/Qwen2.5-72B-Instruct Chat 72B 128K A100_80G_X4
Alibaba
Qwen2.5-VL-3B-Instruct
Qwen/Qwen2.5-VL-3B-Instruct Vision 3B A100_80G_X1
Alibaba
Qwen2.5-VL-7B-Instruct
Qwen/Qwen2.5-VL-7B-Instruct Vision 7B A100_80G_X1
Alibaba
Qwen2.5-VL-32B-Instruct
Qwen/Qwen2.5-VL-32B-Instruct Vision 32B A100_80G_X2
Alibaba
Qwen2.5-VL-72B-Instruct
Qwen/Qwen2.5-VL-72B-Instruct Vision 72B A100_80G_X4
Alibaba
Qwen2-0.5B-Instruct
Qwen/Qwen2-0.5B-Instruct Chat 0.5B 32K A100_80G_X1
Alibaba
Qwen2-1.5B-Instruct
Qwen/Qwen2-1.5B-Instruct Chat 1.5B 32K A100_80G_X1
Alibaba
Qwen2-7B-Instruct
Qwen/Qwen2-7B-Instruct Chat 7B 128K A100_80G_X1
Alibaba
Qwen2-72B-Instruct
Qwen/Qwen2-72B-Instruct Chat 72B 128K A100_80G_X4
Alibaba
Qwen2-VL-2B-Instruct
Qwen/Qwen2-VL-2B-Instruct Vision 2B A100_80G_X1
Alibaba
Qwen2-VL-7B-Instruct
Qwen/Qwen2-VL-7B-Instruct Vision 7B A100_80G_X1
Alibaba
Qwen2-VL-72B-Instruct
Qwen/Qwen2-VL-72B-Instruct Vision 72B A100_80G_X4

DeepSeek Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
DeepSeek
DeepSeek-R1-Distill-Qwen-32B
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B Reasoning 32B 128K A100_80G_X2

Google Gemma Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
Gemma
Gemma 3 270M
google/gemma-3-270m-it Chat 270M 128K A100_80G_X1
Gemma
Gemma 3 1B
google/gemma-3-1b-it Chat 1B 128K A100_80G_X1
Gemma
Gemma 3 4B
google/gemma-3-4b-it Vision 4B 128K A100_80G_X1
Gemma
Gemma 3 12B
google/gemma-3-12b-it Vision 12B 128K A100_80G_X1
Gemma
Gemma 3 27B
google/gemma-3-27b-it Vision 27B 128K A100_80G_X2
Gemma
Gemma 2 2B
google/gemma-2-2b-it Chat 2B 8K A100_80G_X1
Gemma
Gemma 2 9B
google/gemma-2-9b-it Chat 9B 8K A100_80G_X1
Gemma
Gemma 2 27B
google/gemma-2-27b-it Chat 27B 8K A100_80G_X2

Meta Llama Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
Meta
Llama 4 Maverick 17B
meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 Vision 17B×128E 1M H100_X8
Meta
Llama 4 Scout 17B
meta-llama/Llama-4-Scout-17B-16E-Instruct Vision 17B×16E 1M H100_X4
Meta
Llama 3.3 70B Instruct
meta-llama/Llama-3.3-70B-Instruct Chat 70B 128K A100_80G_X4
Meta
Llama 3.2 3B Instruct
meta-llama/Llama-3.2-3B-Instruct Chat 3B 128K A100_80G_X1
Meta
Llama 3.2 1B Instruct
meta-llama/Llama-3.2-1B-Instruct Chat 1B 128K A100_80G_X1
Meta
Llama 3.1 8B Instruct
meta-llama/Llama-3.1-8B-Instruct Chat 8B 128K A100_80G_X1
Meta
Llama 3 8B Instruct
meta-llama/Meta-Llama-3-8B-Instruct Chat 8B 8K A100_80G_X1
Meta
Llama 3 70B Instruct
meta-llama/Meta-Llama-3-70B-Instruct Chat 70B 8K A100_80G_X4
Meta
Llama 2 70B Chat
meta-llama/Llama-2-70b-chat-hf Chat 70B 4K A100_80G_X4
Meta
Llama 2 13B Chat
meta-llama/Llama-2-13b-chat-hf Chat 13B 4K A100_80G_X1
Meta
Llama 2 7B Chat
meta-llama/Llama-2-7b-chat-hf Chat 7B 4K A100_80G_X1

Microsoft Phi Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
Microsoft
Phi-4
microsoft/phi-4 Chat 14B 16K A100_80G_X1
Microsoft
Phi-3 Vision 128K
microsoft/Phi-3-vision-128k-instruct Vision 4.2B 128K H100_X1
Microsoft
Phi-3 Medium 128K
microsoft/Phi-3-medium-128k-instruct Chat 14B 128K A100_80G_X1
Microsoft
Phi-3 Medium 4K
microsoft/Phi-3-medium-4k-instruct Chat 14B 4K A100_80G_X1
Microsoft
Phi-3 Small 128K
microsoft/Phi-3-small-128k-instruct Chat 7B 128K A100_80G_X1
Microsoft
Phi-3 Small 8K
microsoft/Phi-3-small-8k-instruct Chat 7B 8K A100_80G_X1
Microsoft
Phi-3 Mini 128K
microsoft/Phi-3-mini-128k-instruct Chat 3.8B 128K A100_80G_X1
Microsoft
Phi-3 Mini 4K
microsoft/Phi-3-mini-4k-instruct Chat 3.8B 4K A100_80G_X1

Mistral Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
Mistral
Mixtral 8x7B Instruct v0.1
mistralai/Mixtral-8x7B-Instruct-v0.1 Chat 8×7B MoE 32K A100_80G_X2
Mistral
Mistral Nemo Instruct 2407
mistralai/Mistral-Nemo-Instruct-2407 Chat 12B 128K A100_80G_X1
Mistral
Mistral 7B Instruct v0.3
mistralai/Mistral-7B-Instruct-v0.3 Chat 7B 32K A100_80G_X1
Mistral
Mistral 7B Instruct v0.2
mistralai/Mistral-7B-Instruct-v0.2 Chat 7B 32K A100_80G_X1
Mistral
Mistral 7B Instruct v0.1
mistralai/Mistral-7B-Instruct-v0.1 Chat 7B 8K A100_80G_X1
Mistral
E5 Mistral 7B Instruct
intfloat/e5-mistral-7b-instruct Embed 7B 32K A10_X1

NVIDIA Nemotron Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
NVIDIA
Nemotron 3 Super 120B
nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16 Chat 120B 12B active 1M H100_X8
NVIDIA
Nemotron 3 Nano 30B (FP8)
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 Chat 30B 3B active 1M H100_X4
NVIDIA
Nemotron 3 Nano 30B (BF16)
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 Chat 30B 3B active 1M A100_80G_X1
NVIDIA
Llama 3.1 Nemotron 70B Instruct
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Chat 70B 128K A100_80G_X4

OpenAI GptOss Family

Model Name HuggingFace Model ID Type Params Context Cluster Shape
OpenAI
GptOss 20B
openai/gpt-oss-20b Chat 20B 128K H100_X1
OpenAI
GptOss 120B
openai/gpt-oss-120b Chat 120B 128K H100_X2
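
The Cluster Shape values in the tables above (for example A100_80G_X2 or H100_X8) follow a regular pattern, so they can be split into a GPU type and a unit count when sizing a deployment. This is a rough sketch: the helper `parse_cluster_shape` is hypothetical, and reading the trailing `X<n>` as the number of GPU units is an assumption drawn from the naming, not something this page states.

```python
def parse_cluster_shape(shape: str) -> tuple[str, int]:
    """Split a shape such as 'A100_80G_X2' into ('A100_80G', 2).

    Assumption: the suffix '_X<n>' encodes the unit count.
    """
    gpu, sep, count = shape.rpartition("_X")
    if not sep or not gpu or not count.isdigit():
        raise ValueError(f"unrecognized cluster shape: {shape!r}")
    return gpu, int(count)


for shape in ("A10_X1", "A100_80G_X2", "H100_X8"):
    print(shape, "->", parse_cluster_shape(shape))
```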

Use-Case Selection Guide

🔍 RAG / Document Search

🥇
Cohere Command A Reasoning
256K context, built for RAG, advanced reasoning over documents
Dedicated only
🥈
Cohere Command R 08-2024
RAG-optimized, 128K context, fine-tunable, cost-efficient
On-demand + Ded.
🥉
Google Gemini 2.5 Pro
1M context — handle entire documents in one pass
On-demand (ext.)

🤖 Agentic / Tool-Use Workflows

🥇
xAI Grok 4.20 Multi-Agent
Real-time multi-agent research: parallel specialist orchestration for web search, analysis & synthesis
On-demand (ext.)
🥈
xAI Grok 4.1 Fast
2M context, parallel tool calls, 3× fewer hallucinations
On-demand (ext.)
🥉
Cohere Command A
256K context, best throughput for Cohere agentic tasks
On-demand + Ded.

💻 Code Generation

🥇
xAI Grok Code Fast 1
Specialized coding model — plan, write, test, debug loop
On-demand (ext.)
🥈
Meta Llama 4 Maverick
MoE, strong coding + tool-calling capabilities
On-demand + Ded.
🥉
Google Gemini 2.5 Pro
Top-tier code reasoning, debugging, complex architecture
On-demand (ext.)

🌍 Multimodal (Text + Image)

🥇
Google Gemini 2.5 Pro
Best multimodal — text, image, code, audio, video
On-demand (ext.)
🥈
Cohere Command A Vision
Enterprise-focused image, chart, document understanding
On-demand + Ded.
🥉
Meta Llama 4 Maverick
Open-weight multimodal with MoE efficiency
On-demand + Ded.

⚡ High-Volume / Low-Latency

🥇
Google Gemini 2.5 Flash-Lite
Fastest + cheapest in Gemini family; 1M context
On-demand (ext.)
🥈
xAI Grok 3 Mini Fast
Lightweight thinker, lowest latency xAI model
On-demand (ext.)
🥉
OpenAI gpt-oss-20b
Consumer-grade hardware optimized, fast reasoning
On-demand + Ded.

🏢 Enterprise Fine-Tuning

🥇
Cohere Command R 08-2024
T-Few + Vanilla fine-tuning on dedicated AI clusters
Dedicated only
🥈
Meta Llama 3.3 70B
LoRA fine-tuning, best 70B text performance
On-demand + Ded.

🌐 Multilingual Applications

🥇
Cohere Command A
Native multilingual support, 256K context, high-throughput production
On-demand + Ded.
🥈
Meta Llama 4 Scout / Maverick
Open-weight multilingual models with strong cross-language performance
On-demand + Ded.
🔤
Cohere Embed Multilingual 3 / Embed 4
Semantic search in 100+ languages; Gen 4 adds multimodal
On-demand + Ded.

📄 Long-context Document Analysis

🥇
xAI Grok 4.1 Fast
2M tokens — process full codebases or entire document archives in one pass
On-demand (ext.)
🥈
Google Gemini 2.5 Pro
1M context with multimodal; ideal for large PDFs, reports, mixed media
On-demand (ext.)
🥉
xAI Grok 4 Fast
2M context at optimized cost; great for batch document workloads
On-demand (ext.)

🇪🇺 EU Data Sovereignty (Frankfurt · London)

On-demand + Dedicated — EU-FRA & UK-LON
🥇
Cohere Command A
256K context, multilingual, RAG + agentic — widest EU access
EU-FRA + UK-LON
🥈
Meta Llama 3.3 70B
Open-weight, LoRA fine-tunable, strong text performance
EU-FRA + UK-LON
🥉
OpenAI gpt-oss-120b / 20b
OpenAI-compatible API, advanced reasoning — Frankfurt on-demand + dedicated
EU-FRA + UK-LON
Dedicated Only — EU-FRA & UK-LON
🔒
Cohere Command A Reasoning
Advanced reasoning over documents — tenancy-exclusive GPUs in both EU regions
Dedicated · EU-FRA + UK-LON
🔒
Meta Llama 4 Maverick / Scout
Latest Llama 4 multimodal models — UK-LON dedicated only
Dedicated · UK-LON
🔤
Cohere Embed 3 / 4 + Rerank 3.5
Semantic search & reranking with EU data residency
Dedicated · EU-FRA + UK-LON
On-demand only (ext.) — EU-FRA
🌐
Google Gemini 2.5 Pro / Flash / Flash-Lite
External API routed through EU-FRA endpoint — data does not reside on OCI hardware
On-demand (ext.) · EU-FRA
⚠️
xAI Grok — No EU Presence
All xAI Grok models are US-only (Ashburn, Chicago, Phoenix) — not suitable for EU data residency requirements
Not available in EU
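
The region check behind this section can be sketched as a simple set filter. The region lists below are copied from the chat model tables earlier on this page for a handful of models; the function name `models_in_all_regions` and the dictionary layout are illustrative assumptions. Note that a region listing alone does not say whether a model is on-demand or dedicated-only there.

```python
# Region lists copied from the chat model tables on this page (subset only).
CATALOG = {
    "cohere.command-a-03-2025": {"US-ASH", "US-CHI", "SA-SAO", "EU-FRA", "UK-LON",
                                 "ME-RUH", "ME-DXB", "AP-HYD", "AP-OSA"},
    "meta.llama-3.3-70b-instruct": {"US-CHI", "US-PHX", "SA-SAO", "EU-FRA", "UK-LON",
                                    "ME-RUH", "ME-DXB", "AP-HYD", "AP-OSA"},
    "meta.llama-4-maverick-17b-128e-instruct-fp8": {"US-CHI", "SA-SAO", "UK-LON",
                                                    "ME-RUH", "AP-HYD", "AP-OSA"},
    "google.gemini-2.5-pro": {"US-ASH", "US-CHI", "US-PHX", "EU-FRA", "AP-OSA"},
    "xai.grok-4": {"US-ASH", "US-CHI", "US-PHX"},
}


def models_in_all_regions(catalog: dict[str, set[str]], required: set[str]) -> list[str]:
    """Return model IDs (sorted) that are listed in every required region."""
    return sorted(model for model, regions in catalog.items() if required <= regions)


# Models listed in both Frankfurt and London.
print(models_in_all_regions(CATALOG, {"EU-FRA", "UK-LON"}))
```

With this subset, only Command A and Llama 3.3 70B are listed in both EU-FRA and UK-LON, while Grok 4 appears in neither, which matches the guidance above.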

🏗️ Dedicated AI Clusters

🔒
Required for Fine-Tuning
Fine-tuning jobs run exclusively on dedicated GPU clusters — Cohere T-Few/Vanilla and Meta LoRA cannot run on-demand
Dedicated only · Cohere Command R · Llama 3.3 70B
🏢
Data Residency & Compliance
Tenancy-exclusive GPUs; your data never shares hardware — suited for regulated industries (GDPR, HIPAA, financial)
Dedicated · All Cohere · All Meta · OpenAI gpt-oss
⚠️
Not Available — Google & xAI
Google Gemini and xAI Grok route through external APIs — dedicated clusters are not supported for these providers
On-demand (ext.) only

⚡ On-demand (No Cluster Needed)

🚀
Google Gemini & xAI Grok — Always On-demand
External API call; no cluster provisioning, instant availability, pay-per-token billing
On-demand (ext.)
🌍
Cohere & OpenAI gpt-oss — On-demand in Select Regions
On-demand access available without a dedicated cluster — ideal for PoC and variable workloads
On-demand + Ded.
📝
Fine-Tuning Requires Dedicated
On-demand mode supports inference only — to fine-tune a model you must provision a dedicated AI cluster first
Dedicated required
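
The serving-mode rules in the two sections above can be condensed into a small validation helper. This is a sketch under stated assumptions: `check_plan`, the lowercase provider keys, and the mode strings are all hypothetical names, and the helper only encodes the provider-level rules described here (it does not check which individual models are fine-tunable or which regions offer on-demand access).

```python
# Provider groupings as described in the sections above (keys are illustrative).
EXTERNAL_ONLY = {"google", "xai"}                 # on-demand (ext.) only
DEDICATED_CAPABLE = {"cohere", "meta", "openai"}  # dedicated AI clusters supported


def check_plan(provider: str, mode: str, fine_tune: bool = False) -> list[str]:
    """Return the rules a deployment plan violates; an empty list means none."""
    provider = provider.lower()
    if provider not in EXTERNAL_ONLY | DEDICATED_CAPABLE:
        raise ValueError(f"unknown provider: {provider}")
    if mode not in {"on-demand", "dedicated"}:
        raise ValueError(f"unknown mode: {mode}")

    problems = []
    if mode == "dedicated" and provider in EXTERNAL_ONLY:
        problems.append("dedicated clusters are not supported for this provider")
    if fine_tune and mode != "dedicated":
        problems.append("fine-tuning requires a dedicated AI cluster")
    return problems


print(check_plan("google", "dedicated"))
print(check_plan("cohere", "on-demand", fine_tune=True))
print(check_plan("meta", "dedicated", fine_tune=True))
```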
LEGEND
★ Flagship / Best-in-class
◉ Balanced / Advanced
▷ Speed / Efficiency tier
○ Lightweight / Budget
◆ Specialized
2M/1M Context ≥ 1M tokens
256K Context = 192K–512K tokens
128K Context = 128K tokens
✓ Feature supported
✗ Not supported
MoE Mixture of Experts (sparse activation)
US-CHI Region: on-demand + dedicated
EU-FRA Region: dedicated AI clusters only
US-ASH Region: on-demand / external call only

¹ Parameter counts are shown only when officially disclosed by the provider. Google and xAI do not publish parameter counts for Gemini and Grok, so those values are omitted. "—" means not publicly disclosed.

² Fine-tuning on OCI uses dedicated AI clusters (GPU resources belonging exclusively to your tenancy). Cohere supports T-Few & Vanilla strategies; Meta Llama supports LoRA.

³ Retired models no longer listed in the current OCI pretrained-model catalog are omitted from the main tables; deprecated models still listed by OCI are shown with deprecated status.

⁴ The Model Import feature (GA 2025) lets you bring your own LLMs from Hugging Face or OCI Object Storage.

⁵ Data sources (OCI Official Documentation; catalog last updated 17 May 2026; OC1 commercial regions only): Pretrained Models · Models by Region · Inferencing Modes · Model Import

⁶ OCI documents Grok 4 at 128K context, Grok 4.3 at 1M context, and Grok 4 Fast, Grok 4.1 Fast, Grok 4.20, and Grok 4.20 Multi-Agent at 2M context.

Unofficial reference — not an official Oracle document. This page was assembled from OCI public documentation. Data may contain errors or be out of date. Always verify against docs.oracle.com before making production decisions.
GitHub Repository · models.json · imported-models.json · Enrico Pesce