Generative AI Service Features


Models

Models available from Cohere and Meta for OCI Generative AI include:

  • Cohere Command R: Part of a new category of scalable models, Command R aims to balance high efficiency with strong accuracy for retrieval-augmented generation (RAG) applications. Compared with earlier Cohere models, Command R offers higher throughput and lower latency, a larger context window, and strong performance across 10 languages.
  • Cohere Command R+: Command R+ builds on Command R with additional training for more specialized use cases. It has a deeper understanding of language and can generate more nuanced and contextually appropriate responses. Use Command R+ for use cases such as long-form content generation, summarization, question answering, and language generation for specific domains or industries.
  • Cohere Embed: These English and multilingual embedding models (v3) convert text into vector embedding representations. “Light” versions of Embed are smaller and faster (English only).
  • Meta Llama 3.1: Llama 3.1 models are cutting edge and open source with improved performance and response diversity. Improved capabilities include a 128K context window and support for eight languages. OCI Generative AI offers the Llama 3.1 70B and 405B models with support for fine-tuning using the low-rank adaptation (LoRA) method.
  • Meta Llama 3.2: Multimodal support allows these models to address image-based use cases, such as summarizing charts and graphs and writing captions for images and figures. In addition, Llama 3.2 models offer multilingual support for eight languages for text-only queries. OCI Generative AI offers both the Llama 3.2 90B and 11B models.
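Embedding models such as Cohere Embed map text to fixed-length vectors, and downstream tasks (semantic search, RAG retrieval) compare those vectors numerically. The sketch below illustrates that comparison with cosine similarity, using made-up three-dimensional vectors; real Embed v3 output has far more dimensions:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real Embed output (illustrative values only).
query = [0.9, 0.1, 0.0]
doc_relevant = [0.8, 0.2, 0.1]
doc_unrelated = [0.0, 0.1, 0.9]

# The semantically closer document scores higher against the query.
assert cosine_similarity(query, doc_relevant) > cosine_similarity(query, doc_unrelated)
```

The same ranking step, scaled to millions of vectors, is what the Embed models feed in a RAG pipeline.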

Dedicated AI clusters

With dedicated AI clusters, you can host foundational models on dedicated GPUs that are private to you. These clusters provide stable, high-throughput performance that’s required for production use cases and can support hosting and fine-tuning workloads. OCI Generative AI enables you to scale out your cluster with zero downtime to handle changes in volume.

Chat API and Playground

The chat experience provides an out-of-the-box interface with Cohere and Meta models where users can ask questions and get conversational responses via the OCI console or API.
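Whether issued from the playground or programmatically, a chat call reduces to a model identifier, a message, and generation parameters. The dictionary below is a simplified sketch of that request shape; the field names are illustrative stand-ins, not the exact OCI API schema, and in practice you would build the request with the OCI SDK's Generative AI inference client:

```python
# Illustrative request shape for a chat call; field names are simplified
# stand-ins, not the exact OCI API schema.
def build_chat_request(model_id: str, message: str,
                       max_tokens: int = 600, temperature: float = 0.3) -> dict:
    """Assemble the parameters a chat endpoint typically needs."""
    return {
        "serving_mode": {"model_id": model_id},  # e.g., a Cohere or Meta model ID
        "chat_request": {
            "message": message,
            "max_tokens": max_tokens,
            "temperature": temperature,
        },
    }

request = build_chat_request("cohere.command-r-plus", "Summarize our Q3 results.")
```

The playground exposes the same knobs (model choice, token limit, temperature) through the console UI.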

LangChain integration

OCI Generative AI is integrated with LangChain, an open source framework that can be used to develop new interfaces for generative AI applications based on language models. LangChain makes it easy to swap out abstractions and components that are necessary to work with language models.
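The practical payoff of swappable abstractions is that application code targets a common chat-model interface, so an OCI Generative AI model can replace another backend without rewriting the chain. A minimal, framework-free sketch of that pattern, using a plain callable and a stub model in place of the real LangChain classes:

```python
from typing import Callable

# Any chat backend is modeled as "prompt in, completion out".
ChatModel = Callable[[str], str]

def summarize(llm: ChatModel, text: str) -> str:
    """A tiny 'chain': format a prompt, then call whichever model was injected."""
    prompt = f"Summarize in one sentence:\n{text}"
    return llm(prompt)

# Stub standing in for an OCI Generative AI chat model; the real LangChain
# integration exposes a chat-model class you would pass in the same way.
def fake_oci_model(prompt: str) -> str:
    return "A one-sentence summary."

result = summarize(fake_oci_model, "Quarterly revenue grew 12% year over year.")
```

Swapping providers then means changing only the object passed in, not the chain logic.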

LlamaIndex integration

Use LlamaIndex, an open source framework for building context-augmented applications, with OCI Generative AI to easily build RAG solutions or agents. Bring your solutions from prototype to production with custom data sources and flexible tooling.
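A RAG pipeline of the kind LlamaIndex builds follows one core loop: score the corpus against the query, retrieve the most relevant passages, and pack them into the prompt sent to the model. The sketch below shows that loop without any framework, scoring by word overlap purely for illustration (real systems score with embeddings):

```python
def score(query: str, passage: str) -> int:
    """Crude relevance score: count of shared words (real systems use embeddings)."""
    return len(set(query.lower().split()) & set(passage.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k passages most relevant to the query."""
    return sorted(corpus, key=lambda p: score(query, p), reverse=True)[:k]

def build_rag_prompt(query: str, corpus: list[str]) -> str:
    """Pack retrieved context ahead of the question, ready for a chat model."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus for illustration.
corpus = [
    "Dedicated AI clusters host models on private GPUs.",
    "The cafeteria opens at 8 a.m.",
    "Clusters support fine-tuning and hosting workloads.",
]
prompt = build_rag_prompt("Which workloads do dedicated clusters support", corpus)
```

LlamaIndex supplies production versions of each stage: data connectors for ingestion, vector indexes for retrieval, and prompt assembly for the final model call.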

Generative AI operations

OCI Generative AI provides content moderation controls, with endpoint model swaps with zero downtime and endpoint deactivation and activation capabilities coming soon. For each model endpoint, OCI Generative AI captures a series of analytics, including call statistics, tokens processed, and error counts.
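The per-endpoint analytics described above (call statistics, tokens processed, error counts) amount to aggregations over individual call records. A sketch of that bookkeeping, using a made-up record shape for illustration:

```python
from collections import defaultdict

# Hypothetical call records; the service captures equivalents per endpoint.
calls = [
    {"endpoint": "ep-1", "tokens": 120, "error": False},
    {"endpoint": "ep-1", "tokens": 0,   "error": True},
    {"endpoint": "ep-2", "tokens": 340, "error": False},
]

def aggregate(calls: list[dict]) -> dict:
    """Roll up call count, tokens processed, and error count per endpoint."""
    stats = defaultdict(lambda: {"calls": 0, "tokens": 0, "errors": 0})
    for c in calls:
        s = stats[c["endpoint"]]
        s["calls"] += 1
        s["tokens"] += c["tokens"]
        s["errors"] += int(c["error"])
    return dict(stats)

metrics = aggregate(calls)
```

Tracking these per endpoint lets you spot error spikes or token-usage growth on a specific hosted model rather than across the whole service.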

OCI Generative AI for Oracle Fusion Cloud Applications

By embedding features created with OCI Generative AI directly into Oracle Cloud Applications, we make it easy for customers to instantly access them without complex integrations.

Learn more