Generative AI Service Features


Models

Models available from Cohere and Meta for OCI Generative AI include:

  • Meta Llama models: The latest include the flagship foundation Llama 3.1 405B model, which allows for the widest range of use cases; Llama 3.2 multimodal models for use with images; and Llama 3.3 70B, with improved cost performance for text-only applications. API tool support is available for Llama models.
  • Cohere Command R: Part of a new category of scalable models, Command R aims to balance high efficiency with strong accuracy for retrieval-augmented generation (RAG) applications. Compared with original Cohere models, Command R offers higher throughput and lower latency, a larger context window, and strong performance across 10 languages.
  • Cohere Command R+: Command R+ builds on Command R with additional training for more specialized use cases. It has a deeper understanding of language and can generate more nuanced and contextually appropriate responses. Use Command R+ for tasks such as generating long-form content, summarization, question answering, and language generation for specific domains or industries.
  • Cohere Embed: These English and multilingual embedding models (v3) convert text into vector embeddings. "Light" versions of Embed are smaller and faster (English only).
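
To make the idea of vector embeddings concrete, here is a minimal sketch of how embedded texts are typically compared using cosine similarity. The vectors below are hand-written, hypothetical 4-dimensional values rather than real Embed output, and the helper names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a zero vector has no direction")
    return dot / (norm_a * norm_b)


# Hypothetical embeddings (assumption): a real embedding model returns
# vectors with far more dimensions than these toy values.
EMBEDDINGS = {
    "How do I reset my password?": [0.9, 0.1, 0.0, 0.2],
    "Steps to recover account access": [0.8, 0.2, 0.1, 0.3],
    "Quarterly revenue report": [0.0, 0.9, 0.7, 0.1],
}


def most_similar(query_vec, embeddings):
    """Return the stored text whose vector is closest to query_vec."""
    return max(
        embeddings,
        key=lambda text: cosine_similarity(query_vec, embeddings[text]),
    )
```

In a semantic search setting, the query would be embedded with the same model as the stored texts, and the highest-scoring texts would be returned as matches.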

Dedicated AI clusters

With dedicated AI clusters, you can host foundational models on dedicated GPUs that are private to you. These clusters provide stable, high-throughput performance that’s required for production use cases and can support hosting and fine-tuning workloads. OCI Generative AI enables you to scale out your cluster with zero downtime to handle changes in volume.

Chat API and Playground

The chat experience provides an out-of-the-box interface with Cohere and Meta models where users can ask questions and get conversational responses via the OCI console or API.

LangChain integration

OCI Generative AI is integrated with LangChain, an open source framework that can be used to develop new interfaces for generative AI applications based on language models. LangChain makes it easy to swap out abstractions and components that are necessary to work with language models.
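
The value of swappable components can be sketched without LangChain itself. In the example below, the application code depends only on a small interface, so one model can replace another without other changes. The `ChatModel` protocol and both stand-in classes are hypothetical and call no real service.

```python
from typing import Protocol


class ChatModel(Protocol):
    """Minimal interface the application relies on (assumption for this sketch)."""

    def invoke(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in model that repeats the prompt; for illustration only."""

    def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UpperModel:
    """Second stand-in model that upper-cases the prompt."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def summarize(model: ChatModel, text: str) -> str:
    # The application logic only touches invoke(), so any object
    # with that method can be passed in.
    return model.invoke(f"Summarize: {text}")
```

Swapping `EchoModel()` for `UpperModel()` in a call to `summarize` changes the backend without touching the application logic.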

LlamaIndex integration

Use LlamaIndex, an open source framework for building context-augmented applications, with OCI Generative AI to easily build RAG solutions or agents. Bring your solutions from prototype to production with custom data sources and flexible tooling.
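
As a rough illustration of the retrieval step in a RAG solution, the sketch below picks the most relevant passages for a question and assembles them into a prompt. It does not use LlamaIndex. The keyword-overlap scoring is a deliberately simple stand-in for embedding-based retrieval, and all function names and sample passages are assumptions.

```python
import re

# Sample passages (assumption): short statements paraphrased for the demo.
DOCS = [
    "Dedicated AI clusters host models on private GPUs.",
    "Cohere Embed converts text to vector embeddings.",
    "LangChain is an open source framework.",
]


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs, k=2):
    """Return the k passages sharing the most words with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_tokens & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, docs, k=2):
    """Combine the retrieved passages and the question into one prompt."""
    context = "\n".join(f"- {d}" for d in retrieve(question, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

The resulting prompt would then be sent to a chat model, which grounds its answer in the retrieved context rather than relying only on its training data.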

Generative AI operations

OCI Generative AI provides content moderation controls. Coming soon: endpoint model swap with zero downtime, and the ability to deactivate and activate endpoints. For each model endpoint, OCI Generative AI captures a series of analytics, including call statistics, tokens processed, and error counts.
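
To show what per-endpoint analytics of this kind might look like on the client side, here is a minimal sketch that tallies calls, tokens processed, and errors. The `EndpointStats` class and its fields are hypothetical and do not reflect the service's actual metrics schema.

```python
from dataclasses import dataclass


@dataclass
class EndpointStats:
    """Running totals for one model endpoint (hypothetical structure)."""

    calls: int = 0
    tokens_processed: int = 0
    errors: int = 0

    def record(self, tokens: int, ok: bool = True) -> None:
        """Register one call, its token count, and whether it succeeded."""
        self.calls += 1
        self.tokens_processed += tokens
        if not ok:
            self.errors += 1

    @property
    def error_rate(self) -> float:
        """Fraction of calls that failed; 0.0 when no calls were made."""
        return self.errors / self.calls if self.calls else 0.0
```

Tracking the error rate alongside raw counts makes it easier to spot an endpoint whose failures grow faster than its traffic.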

OCI Generative AI for Oracle Fusion Cloud Applications

By embedding features created with OCI Generative AI directly into Oracle Cloud Applications, we make it easy for customers to instantly access them without complex integrations.
