OCI · GENERATIVE AI

OCI GenAI Catalog

Source: OCI Official Documentation  |  Updated 13 Mar 2026  |  30+ models

Provider Context
5 Model Providers · 24 Chat Models (Active) · 9 Embedding Models · 1 Rerank Model
Chat Models — Cohere Family
Columns: Model Name · Model ID · Tier · Parameters · Context Window · Multimodal · Fine-tunable · Tool Use / Agents · Reasoning · RAG Optimized · Status · Regions · Best For
Chat Models — Google Gemini Family
Columns: Model Name · Model ID · Tier · Context Window · Multimodal · Thinking / Reasoning · Speed Profile · Fine-tunable on OCI · Status · Regions · Best For
Chat Models — Meta Llama Family
Columns: Model Name · Model ID · Tier · Architecture · Total Params · Active Params (MoE) · Context Window · Multimodal · Fine-tunable (LoRA) · Tool Use · Status · Regions · Best For
Chat Models — OpenAI gpt-oss Family
Columns: Model Name · Model ID · Tier · Parameters · Context Window · Reasoning / Agentic · Tool Use · Open Source · Fine-tunable on OCI · Status · Regions · Best For
Chat Models — xAI Grok Family
Columns: Model Name · Model ID · Tier · Context Window · Multimodal · Thinking / Chain-of-Thought · Coding Focus · Speed Profile · Domain Knowledge · Agentic · Status · Regions · Best For
Embedding Models
Columns: Model Name · Model ID · Generation · Multimodal (Image) · Language Scope · Size Variant · Use Case · Status · Regions
Rerank Model
Columns: Model Name · Model ID · Input · Output · Use Case · Status · Regions
Use-Case Selection Guide

🔍 RAG / Document Search

🥇 Cohere Command A Reasoning · 256K context, built for RAG, advanced reasoning over documents (Dedicated only)
🥈 Cohere Command R 08-2024 · RAG-optimized, 128K context, fine-tunable, cost-efficient (On-demand + Dedicated)
🥉 Google Gemini 2.5 Pro · 1M+ context; handle entire documents in one pass (On-demand, ext.)

🤖 Agentic / Tool-Use Workflows

🥇 xAI Grok 4.1 Fast · 2M context, parallel tool calls, 3x lower hallucinations (On-demand, ext.)
🥈 Cohere Command A · 256K context, best throughput for Cohere agentic tasks (On-demand + Dedicated)
🥉 OpenAI gpt-oss-120b · Advanced tool use, reasoning, OpenAI-compatible API (On-demand + Dedicated)

💻 Code Generation

🥇 xAI Grok Code Fast 1 · Specialized coding model for the plan, write, test, debug loop (On-demand, ext.)
🥈 Meta Llama 4 Maverick · MoE architecture with strong coding + tool-calling capabilities (On-demand + Dedicated)
🥉 Google Gemini 2.5 Pro · Top-tier code reasoning, debugging, complex architecture (On-demand, ext.)

🌍 Multimodal (Text + Image)

🥇 Google Gemini 2.5 Pro · Best multimodal: text, image, code, audio, video (On-demand, ext.)
🥈 Cohere Command A Vision · Enterprise-focused image, chart, and document understanding (On-demand + Dedicated)
🥉 Meta Llama 4 Maverick · Open-weight multimodal with MoE efficiency (On-demand + Dedicated)

⚡ High-Volume / Low-Latency

🥇 Google Gemini 2.5 Flash-Lite · Fastest and cheapest in the Gemini family; 1M context (On-demand, ext.)
🥈 xAI Grok 3 Mini Fast · Lightweight thinking model, lowest-latency xAI option (On-demand, ext.)
🥉 OpenAI gpt-oss-20b · Optimized for consumer-grade hardware, fast reasoning (On-demand + Dedicated)

🏢 Enterprise Fine-Tuning

🥇 Cohere Command R 08-2024 · T-Few + Vanilla fine-tuning on dedicated AI clusters (Dedicated only)
🥈 Meta Llama 3.3 70B · LoRA fine-tuning, best 70B-class text performance (On-demand + Dedicated)

🌐 Multilingual Applications

🥇 Cohere Command A · Native multilingual support, 256K context, high-throughput production (On-demand + Dedicated)
🥈 Meta Llama 4 Scout / Maverick · Open-weight multilingual models with strong cross-language performance (On-demand + Dedicated)
🔤 Cohere Embed Multilingual 3 / Embed 4 · Semantic search in 100+ languages; Gen 4 adds multimodal (On-demand + Dedicated)
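The catalog pairs embedding models (Cohere Embed) with a reranker (Rerank 3.5) for two-stage retrieval: an embedding model shortlists documents by vector similarity, then the reranker reorders the shortlist. A minimal, provider-agnostic sketch of the similarity stage in plain Python; the vectors and helper names are illustrative, not OCI API calls:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, doc_vecs, k=2):
    # Rank documents by similarity to the query; a reranker
    # (e.g. Rerank 3.5) would then reorder this shortlist.
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

# Toy 3-dimensional vectors standing in for real embeddings.
docs = {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0], "c": [0.0, 1.0, 0.0]}
print(top_k([1.0, 0.05, 0.0], docs))  # → ['a', 'b']
```

In production, the vectors come from an embedding call (e.g. Embed 4 at 100+ languages) and the shortlist, not the full corpus, goes to the reranker, which is why the two models are sold as a pair.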

📄 Long-context Document Analysis

🥇 xAI Grok 4.1 Fast · 2M tokens; process full codebases or entire document archives in one pass (On-demand, ext.)
🥈 Google Gemini 2.5 Pro · 1M+ context with multimodal; ideal for large PDFs, reports, mixed media (On-demand, ext.)
🥉 xAI Grok 4 Fast · 2M context⁶ at optimized cost; great for batch document workloads (On-demand, ext.)

🇪🇺 EU Data Sovereignty (Frankfurt · London)

On-demand + Dedicated — EU-FRA & EU-LON
🥇 Cohere Command A · 256K context, multilingual, RAG + agentic; widest EU access (EU-FRA + EU-LON)
🥈 Meta Llama 3.3 70B · Open-weight, LoRA fine-tunable, strong text performance (EU-FRA + EU-LON)
🥉 OpenAI gpt-oss-120b / 20b · OpenAI-compatible API, advanced reasoning; Frankfurt on-demand + dedicated (EU-FRA + EU-LON)
Dedicated Only — EU-FRA & EU-LON
🔒 Cohere Command A Reasoning · Advanced reasoning over documents; tenancy-exclusive GPUs in both EU regions (Dedicated, EU-FRA + EU-LON)
🔒 Meta Llama 4 Maverick / Scout · Latest Llama 4 multimodal models (Dedicated, EU-LON only)
🔤 Cohere Embed 3 / 4 + Rerank 3.5 · Semantic search and reranking with EU data residency (Dedicated, EU-FRA + EU-LON)
On-demand only (ext.) — EU-FRA
🌐 Google Gemini 2.5 Pro / Flash / Flash-Lite · External API routed through the EU-FRA endpoint; data does not reside on OCI hardware (On-demand ext., EU-FRA)
⚠️ xAI Grok · No EU presence: all Grok models are US-only (Ashburn, Chicago, Phoenix) and unsuitable for EU data residency requirements (Not available in EU)
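Region availability like the above can be checked programmatically rather than by eye. A hypothetical sketch filtering a catalog file such as the repository's models.json for EU regions; the entries are condensed from this page, and the field names (`regions`, `modes`) are assumptions about the schema, not confirmed:

```python
# Hypothetical catalog entries; field names are illustrative,
# not the confirmed schema of models.json.
CATALOG = [
    {"name": "Cohere Command A", "regions": ["EU-FRA", "EU-LON", "US-CHI"],
     "modes": ["on-demand", "dedicated"]},
    {"name": "xAI Grok 4.1 Fast", "regions": ["US-ASH", "US-CHI", "US-PHX"],
     "modes": ["on-demand"]},
    {"name": "Cohere Command A Reasoning", "regions": ["EU-FRA", "EU-LON"],
     "modes": ["dedicated"]},
]

def eu_eligible(catalog, dedicated_only=False):
    """Models available in an EU region, optionally dedicated-capable."""
    eu = {"EU-FRA", "EU-LON"}
    return [m["name"] for m in catalog
            if eu & set(m["regions"])
            and (not dedicated_only or "dedicated" in m["modes"])]

print(eu_eligible(CATALOG))
# → ['Cohere Command A', 'Cohere Command A Reasoning']
```

The same filter with `dedicated_only=True` implements the stricter residency posture: only models whose weights run on tenancy-exclusive OCI GPUs, which excludes the externally routed Gemini endpoints.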

🏗️ Dedicated AI Clusters

🔒 Required for Fine-Tuning · Fine-tuning jobs run exclusively on dedicated GPU clusters; Cohere T-Few/Vanilla and Meta LoRA cannot run on-demand (Dedicated only: Cohere Command R, Llama 3.3 70B)
🏢 Data Residency & Compliance · Tenancy-exclusive GPUs; your data never shares hardware. Suited for regulated industries (GDPR, HIPAA, financial) (Dedicated: all Cohere, all Meta, OpenAI gpt-oss)
⚠️ Not Available for Google & xAI · Google Gemini and xAI Grok route through external APIs; dedicated clusters are not supported for these providers (On-demand ext. only)

⚡ On-demand (No Cluster Needed)

🚀 Google Gemini & xAI Grok: always on-demand · External API call; no cluster provisioning, instant availability, pay-per-token billing (On-demand, ext.)
🌍 Cohere & OpenAI gpt-oss: on-demand in select regions · Available without a dedicated cluster; ideal for PoC and variable workloads (On-demand + Dedicated)
📝 Fine-tuning requires Dedicated · On-demand mode supports inference only; to fine-tune a model, first provision a dedicated AI cluster (Dedicated required)
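The serving-mode rules above (fine-tuning always needs a dedicated cluster; Google and xAI are external on-demand only) are mechanical enough to encode. A sketch of that decision under those assumptions; provider names match this page, everything else is illustrative:

```python
# Providers routed via external APIs: on-demand only, no dedicated clusters.
EXTERNAL_PROVIDERS = {"google", "xai"}

def required_mode(provider: str, fine_tune: bool) -> str:
    """Pick a serving mode per the rules on this page (sketch, not an API)."""
    provider = provider.lower()
    if fine_tune:
        if provider in EXTERNAL_PROVIDERS:
            raise ValueError(f"{provider}: fine-tuning not supported on OCI")
        return "dedicated"          # fine-tuning runs only on dedicated clusters
    if provider in EXTERNAL_PROVIDERS:
        return "on-demand (ext.)"   # external call; no cluster can be provisioned
    return "on-demand"              # inference works without a cluster

print(required_mode("cohere", fine_tune=True))   # → dedicated
print(required_mode("xai", fine_tune=False))     # → on-demand (ext.)
```

A real deployment would also check region availability (see the EU section), since a model can support dedicated mode in one region and not another.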
LEGEND
Tiers: Flagship / Best-in-class · Balanced / Advanced · Speed / Efficiency · Lightweight / Budget · Specialized
Context badges: 2M / 1M+ = ≥ 1M tokens · 256K = 192K–512K tokens · 128K = 128K tokens
✓ = feature supported · ✗ = feature not supported
MoE = Mixture of Experts (sparse activation)
Region examples: US-CHI = on-demand + dedicated · EU-FRA = dedicated AI clusters only · US-ASH = on-demand / external call only

¹ Parameter counts are shown only when officially disclosed by the provider. Proprietary models (Google Gemini, xAI Grok) do not publish parameter counts and are omitted. "—" means not publicly disclosed.

² Fine-tuning on OCI uses dedicated AI clusters (GPU resources belonging exclusively to your tenancy). Cohere supports T-Few & Vanilla strategies; Meta Llama supports LoRA.

³ Retired/deprecated models (Command R legacy, Llama 3 70B, Llama 3.1 70B) are omitted from the main tables.

⁴ Model Import feature (GA 2025) lets you bring your own LLMs from Hugging Face or OCI Object Storage.

⁵ Data sources (OCI Official Documentation, last updated 13 March 2026): Pretrained Models · Models by Region · Inferencing Modes · Model Import

⁶ Grok 4.1 Fast context window (2M tokens) is confirmed by OCI documentation. Grok 4 and Grok 4 Fast context windows are not disclosed in OCI official docs.

AI-generated content — not an official Oracle document. This page was assembled with AI assistance from OCI public documentation. Data may contain errors or be out of date. Always verify against docs.oracle.com before making production decisions.
GitHub repository: models.json @ enricopesce