OCI Speech is an AI service that both transcribes speech to text and synthesizes speech from text. Get accurate, text-normalized, time-stamped transcriptions and synthesized voice via the OCI Console, OCI Data Science notebooks, REST APIs, CLIs, and SDKs.
Text-to-speech and real-time transcription features are now in limited availability. Discover how to create a synthesized voice from text and receive an accurate transcription instantaneously.
Learn how the components in a typical system interact to let OCI Speech transcribe natural language.
Build, test, and deploy applications on Oracle Cloud for free with a US$300 cloud credit.
OCI Speech uses automatic speech recognition, a deep learning process, to derive accurate transcription from natural conversations. Get started easily by using prebuilt acoustic and language models that don’t require existing data science experience.
Search, index, and decipher data buried in your audio files. Convert recorded audio conversations to textual data to analyze with AI services. For example, use OCI Language to retrieve the sentiment and OCI Speech’s anomaly detection capabilities to identify chances of customer churn.
Now in limited availability, OCI Speech’s real-time transcription feature lets you send audio streams and receive accurate transcriptions in seconds.
Now in limited availability, the text-to-speech feature in OCI Speech lets you synthesize human-like speech from text across applications. This feature enables customer conversations, multi-language voice translations, and improved accessibility. Choose from a variety of voices to enhance interactions.
OCI Speech ASR models support English, Spanish, Portuguese, German, French, Italian, and Hindi, allowing you to transcribe your audio files in your preferred language. OCI Speech also supports the OpenAI Whisper model, which supports 57+ languages out of the box. Find out more about OCI and the Whisper model.
OCI Speech supports diarization for organizing, analyzing, and extracting meaningful information from multiple speakers.
Eliminate reliance on third-party transcription offerings and exercise more control over your data with end-to-end security and compliance.
OCI Speech is a versatile service that can be called via REST APIs, SDKs, and the Oracle CLI. Developers can easily deploy a scalable speech service without having data science or ML expertise.
Oracle Cloud Infrastructure Speech protects our customers’ privacy. Prebuilt automatic speech recognition models transcribe your content, but do not store any data for training, debugging, or other purposes.
OCI Speech uses proprietary models and architecture that enable fast conversion of speech into text.
We added a word-level confidence score to help you identify words that might have been transcribed incorrectly. Use the word confidence score to determine where to focus when building an application.
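As a minimal sketch of how the word confidence score might be used, the Python snippet below flags words whose confidence falls under a chosen threshold so they can be reviewed first. The token layout (a list of dictionaries with "token" and "confidence" keys), the sample values, and the 0.8 threshold are assumptions made for illustration; they are not taken from the OCI Speech response schema, so check the actual output format before adapting this.

```python
from typing import Dict, List


def low_confidence_words(tokens: List[Dict], threshold: float = 0.8) -> List[Dict]:
    """Return the tokens whose confidence is strictly below the threshold."""
    return [t for t in tokens if float(t["confidence"]) < threshold]


# Hypothetical sample data: the field names and scores are illustrative only.
sample_tokens = [
    {"token": "please", "confidence": 0.98},
    {"token": "renew", "confidence": 0.61},
    {"token": "my", "confidence": 0.99},
    {"token": "subscription", "confidence": 0.74},
]

# Collect the words that probably deserve a manual check.
flagged = low_confidence_words(sample_tokens, threshold=0.8)
for t in flagged:
    print(f'{t["token"]}: {t["confidence"]:.2f}')
```

The threshold is a tuning choice: a higher value sends more words to review, a lower value sends fewer.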
We added prebuilt word filtering using a curated list of profanities. You can mask, remove, or tag profanities.
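To illustrate the difference between the three filtering modes, here is a small local Python sketch. It does not call OCI Speech and does not reproduce the service's output format: the function name filter_words, the asterisk masking, the bracket tagging, and the placeholder word list are all hypothetical choices made for this example.

```python
import re
from typing import Iterable

VALID_MODES = ("mask", "remove", "tag")


def filter_words(text: str, blocked: Iterable[str], mode: str = "mask") -> str:
    """Apply one of three illustrative filtering modes to blocked words."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    blocked_set = {w.lower() for w in blocked}
    out = []
    for word in text.split():
        # Compare on the lowercased word with punctuation stripped.
        core = re.sub(r"\W", "", word).lower()
        if core not in blocked_set:
            out.append(word)          # not blocked: keep unchanged
        elif mode == "mask":
            out.append("*" * len(word))  # replace every character
        elif mode == "tag":
            out.append(f"[{word}]")   # keep the word but mark it
        # mode == "remove": append nothing, which drops the word
    return " ".join(out)


# Placeholder list standing in for a curated profanity list.
blocked_words = ["darn"]
sentence = "well darn it"

masked = filter_words(sentence, blocked_words, mode="mask")
removed = filter_words(sentence, blocked_words, mode="remove")
tagged = filter_words(sentence, blocked_words, mode="tag")
print(masked, removed, tagged, sep="\n")
```

Masking keeps the sentence length intact, removal gives the cleanest text, and tagging preserves the original word for downstream review.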
Our real-time speech recognition feature helps ensure that your speech is accurately transcribed as you speak naturally, allowing for seamless and uninterrupted communication.
Automatically provide in-workflow closed captioning on the OCI platform for all content created and curated by digital media services. Index your content using OCI Speech for easy searching across your content.
Transcribe customer calls for easy searching and retrieval of information. Use OCI Language to detect sentiment and help identify customer churn and staff training opportunities.
Real-time transcription lets physicians and nurses capture patient notes on the go, helping increase efficiency and improve care and outcomes.
Neural text-to-speech produces a high-accuracy, human-like voice with natural intonation, giving you more options for accessibility features.