HeatWave is an in-memory, massively parallel, hybrid columnar data-processing engine. It implements state-of-the-art algorithms for distributed query processing that provide very high performance.
HeatWave massively partitions data across a cluster of nodes that operate in parallel, which provides excellent internode scalability. Each node within a cluster and each core within a node can process partitioned data in parallel. HeatWave has an intelligent query scheduler that overlaps computation with network communication tasks to achieve very high scalability across thousands of cores.
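As a rough illustration of the partition-then-process-in-parallel idea described above, the following Python sketch hash-partitions rows by key, aggregates each partition independently, and merges the partial results. It is a toy model only: the function names, the CRC32 partitioning scheme, and the use of threads to stand in for cluster nodes are all assumptions made for this example and do not describe HeatWave's actual implementation.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, num_nodes):
    """Assign each (key, value) row to a node by hashing its key.

    Hypothetical scheme: CRC32 of the key modulo the node count.
    """
    partitions = [[] for _ in range(num_nodes)]
    for key, value in rows:
        node = zlib.crc32(key.encode("utf-8")) % num_nodes
        partitions[node].append((key, value))
    return partitions


def partial_sum(partition):
    """Sum values per key within a single partition."""
    totals = defaultdict(int)
    for key, value in partition:
        totals[key] += value
    return dict(totals)


def parallel_group_sum(rows, num_nodes=4):
    """Aggregate every partition in parallel, then merge the results."""
    partitions = partition_rows(rows, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(partial_sum, partitions))
    merged = {}
    for partial in partials:
        # Rows are partitioned by key, so no key appears in two partials.
        merged.update(partial)
    return merged


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
result = parallel_group_sum(rows)
```

Because all rows with the same key land in the same partition, the merge step needs no cross-partition coordination, which is one reason key-based partitioning suits parallel aggregation.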
Query processing in HeatWave has been optimized for commodity servers in the cloud. The sizes of the partitions have been optimized to fit the cache of the underlying shapes. The overlap of computation with communication is optimized for the network bandwidth available. Various analytics processing primitives use the hardware instructions of the underlying virtual machines (VMs). HeatWave is also designed to be a scale-out data processing engine, optimized to query data in object storage.
Oracle HeatWave GenAI provides integrated, automated, and secure generative AI with in-database large language models (LLMs); an automated, in-database vector store; scale-out vector processing; and the ability to have contextual conversations in natural language—letting you take advantage of generative AI without AI expertise, data movement, or additional cost.
Use the built-in LLMs in all Oracle Cloud Infrastructure (OCI) regions, OCI Dedicated Region, Oracle Alloy, Amazon Web Services (AWS), and Microsoft Azure and obtain consistent results with predictable performance across deployments. Help reduce infrastructure costs by eliminating the need to provision GPUs.
Access pretrained foundation models from Cohere and Meta via the OCI Generative AI service when using HeatWave GenAI on OCI and via Amazon Bedrock when using HeatWave GenAI on AWS.
Perform retrieval-augmented generation (RAG) across LLMs and your proprietary documents in various formats housed in HeatWave Vector Store to get more accurate and contextually relevant answers—without moving data to a separate vector database.
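For readers unfamiliar with RAG, the retrieval step can be sketched in a few lines of Python. This is a generic toy example, not the HeatWave API: the word-count "embedding", the function names, and the prompt layout are assumptions chosen for illustration. A production vector store would use learned embeddings, and the assembled prompt would then be passed to an LLM.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


docs = [
    "refund policy allows returns within 30 days",
    "shipping takes five business days",
    "offices close on public holidays",
]
question = "what is the refund policy"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, docs, k=1)
```

The point of the pattern is that the model answers from retrieved passages rather than from its training data alone, which is what makes the answers more contextually relevant.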
Leverage the automated pipeline to help discover and ingest proprietary documents in HeatWave Vector Store, making it easier for developers and analysts without AI expertise to use the vector store.
Vector processing is parallelized across up to 512 HeatWave cluster nodes and executed at memory bandwidth, helping deliver fast results with a reduced likelihood of accuracy loss.
Have contextual conversations informed by your unstructured documents in object storage using natural language. Use the integrated Lakehouse Navigator to help guide LLMs to search through specific data sets, helping you reduce costs while getting more accurate results faster.
HeatWave MySQL is a fully managed database service, and the only cloud service built on MySQL Enterprise Edition, with advanced security features for encryption, data masking, authentication, and a database firewall. HeatWave improves MySQL query performance by orders of magnitude and enables you to get real-time analytics on your transactional data in MySQL—without the complexity, latency, risks, and cost of extract, transform, and load (ETL) duplication to a separate analytics database.
Analytics queries access the most current information as updates from transactions automatically replicate in real time to the HeatWave analytics cluster. There’s no need to index the data before running analytics queries. You can eliminate the complex, time-consuming, and costly ETL process and integration with a separate analytics database.
HeatWave MySQL is faster and delivers better price-performance, as demonstrated by multiple standard industry benchmarks, including TPC-H, TPC-DS, and CH-benCHmark.
Snowflake, Amazon Redshift, Google BigQuery, Azure Synapse, Amazon Aurora, and Amazon RDS are slower and more expensive
HeatWave Lakehouse lets users query half a petabyte of data in object storage—in a variety of file formats, such as CSV, Parquet, Avro, JSON, and export files from other databases. The query processing is done entirely in the HeatWave engine, enabling customers to take advantage of HeatWave for non-MySQL workloads in addition to MySQL-compatible workloads.
As demonstrated by a 500 TB TPC-H benchmark, the query performance of HeatWave Lakehouse is
The data load performance of HeatWave Lakehouse is
Customers can query data in various formats in object storage, transactional data in MySQL databases, or a combination of both using standard SQL commands. Querying the data in object storage is as fast as querying the databases, as demonstrated by a 10 TB TPC-H benchmark.
With HeatWave AutoML, customers can use data in object storage, the database, or both to automatically build, train, deploy, and explain ML models—without moving the data to a separate ML cloud service.
HeatWave’s massively partitioned architecture lets HeatWave Lakehouse scale out. Query processing and data management operations, such as loading and reloading data, scale with the size of the data. Customers can query up to half a petabyte of data in object storage with HeatWave Lakehouse without copying it to the MySQL database. The HeatWave cluster scales to 512 nodes.
HeatWave Autopilot capabilities, such as auto provisioning, auto query plan improvement, and auto parallel loading, have been enhanced for HeatWave Lakehouse, further reducing database administration overhead and improving performance. New HeatWave Autopilot capabilities are also available for HeatWave Lakehouse.
HeatWave AutoML includes everything users need to build, train, and explain machine learning models within HeatWave, at no additional cost.
With in-database machine learning in HeatWave, customers don’t need to move data to a separate machine learning service. They can easily and securely apply machine learning training, inference, and explanation to data stored both inside MySQL and in the object store with HeatWave Lakehouse. As a result, they can accelerate ML initiatives, increase security, and reduce costs.
HeatWave AutoML automates the machine learning lifecycle, including algorithm selection, intelligent data sampling for model training, feature selection, and hyperparameter optimization—saving data analysts and data scientists significant time and effort. Aspects of the machine learning pipeline can be customized, including algorithm selection, feature selection, and hyperparameter optimization. HeatWave AutoML supports anomaly detection, forecasting, classification, regression, and recommender system tasks, including on text columns. Users can provide feedback on the results of unsupervised anomaly detection and use this labeled data to help improve subsequent predictions.
By considering both implicit feedback (past purchases, browsing behavior, and so forth) and explicit feedback (ratings, likes, and so forth), the HeatWave AutoML recommender system can generate personalized recommendations. Analysts, for instance, can predict items that a user will like, users who will like a specific item, and ratings that items will receive. They can also, given a user, obtain a list of similar users and, given a specific item, obtain a list of similar items.
The interactive console lets business analysts build, train, run, and explain ML models using a visual interface—without using SQL commands or any coding. The console also makes it easy to explore what-if scenarios to evaluate business assumptions—for example, “How would investing 30% more in paid social media advertising affect both revenue and profit?”
Benchmarks demonstrate that, on average, HeatWave AutoML produces more accurate results than Amazon Redshift ML, trains models up to 25X faster at 1% of the cost, and scales as more nodes are added.
All the models trained by HeatWave AutoML are explainable. HeatWave AutoML delivers predictions with an explanation of the results, helping organizations with regulatory compliance, fairness, repeatability, causality, and trust.
Topic modeling helps users discover insights in large textual data sets by surfacing the key themes in documents, for example, to support sentiment analysis of social media data. Data drift detection helps analysts determine when to retrain models by detecting differences between the data used for training and new incoming data.
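To make the data drift idea concrete, here is a minimal Python sketch that flags drift when the mean of incoming data moves too far from the training mean, measured in training standard deviations. The metric, the function names, and the threshold of 1.0 are assumptions chosen for illustration; the source does not say which drift measure HeatWave AutoML uses.

```python
import statistics


def drift_score(train, incoming):
    """Shift of the incoming mean, in units of the training std deviation."""
    mu = statistics.fmean(train)
    sigma = statistics.pstdev(train)
    shift = abs(statistics.fmean(incoming) - mu)
    if sigma == 0:
        # Constant training data: any shift at all counts as unbounded drift.
        return 0.0 if shift == 0 else float("inf")
    return shift / sigma


def needs_retraining(train, incoming, threshold=1.0):
    """Hypothetical rule: retrain when the score exceeds the threshold."""
    return drift_score(train, incoming) > threshold


train = [10, 12, 11, 13, 9]
stable = [11, 10, 12]
shifted = [20, 21, 19]
```

With these sample values, `needs_retraining(train, stable)` is `False` and `needs_retraining(train, shifted)` is `True`. A mean-shift check like this only catches one kind of drift; changes in spread or distribution shape would need a different measure.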
Developers and data analysts can build machine learning models using familiar SQL commands; they don’t have to learn new tools and languages. Additionally, HeatWave AutoML is integrated with popular notebooks, such as Jupyter and Apache Zeppelin.
HeatWave Autopilot provides workload-aware, machine learning–powered automation. It improves performance and scalability without requiring database tuning expertise, increases the productivity of developers and DBAs, and helps eliminate human errors. HeatWave Autopilot automates many of the most important and often challenging aspects of achieving high query performance at scale—including provisioning, data loading, query execution, and failure handling. HeatWave Autopilot is available at no additional charge for HeatWave MySQL customers.
HeatWave Autopilot provides numerous capabilities for both HeatWave and OLTP, including
Real-time elasticity enables customers to increase or decrease the size of their HeatWave cluster by any number of nodes without incurring any downtime or read-only time.
The resizing operation takes only a few minutes, during which HeatWave remains online and available for all operations. Once the cluster is resized, data is downloaded from object storage, automatically rebalanced among all available cluster nodes, and made immediately available for queries. As a result, customers get consistently high performance, even at peak times, and can lower costs by downsizing their HeatWave cluster when appropriate.
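The rebalancing step can be pictured with a small Python sketch that spreads a fixed set of data partitions evenly over however many nodes the cluster has, and reports which partitions change owner after a resize. The round-robin placement and all names here are assumptions for illustration only; the source does not describe how HeatWave actually places partitions.

```python
def assign_partitions(num_partitions, num_nodes):
    """Spread partition IDs over nodes round-robin (hypothetical policy)."""
    if num_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    layout = {node: [] for node in range(num_nodes)}
    for partition in range(num_partitions):
        layout[partition % num_nodes].append(partition)
    return layout


def moved_partitions(old_layout, new_layout):
    """List the partitions whose owning node differs between two layouts."""
    old_owner = {p: node for node, parts in old_layout.items() for p in parts}
    new_owner = {p: node for node, parts in new_layout.items() for p in parts}
    return sorted(p for p in old_owner if old_owner[p] != new_owner.get(p))


before = assign_partitions(8, 2)   # two-node cluster
after = assign_partitions(8, 4)    # resized to four nodes
moved = moved_partitions(before, after)
```

In this example, growing from two to four nodes leaves each node holding two of the eight partitions, and half of the partitions change owner.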
With efficient data reloading from object storage, customers can also pause and resume their HeatWave cluster to reduce costs.
Customers can expand or reduce their HeatWave cluster to any number of nodes. They aren’t constrained to the overprovisioned, costly instances forced by the rigid sizing models of other cloud database providers. With HeatWave, customers pay only for the exact resources they use.
You can deploy HeatWave on OCI, AWS, or Azure. You can replicate data from on-premises OLTP applications to HeatWave to get near real-time analytics and process vector data in the cloud. You can also use HeatWave in your data center with OCI Dedicated Region.
HeatWave on AWS delivers a native experience for AWS customers. The console, control plane, and data plane reside in AWS.