The future of high performance computing in the cloud

The future of high performance computing (HPC) is here with HPC on Oracle Cloud Infrastructure (OCI). HPC solutions are helping to solve some of the world’s toughest problems, such as powering molecular modeling to combat diseases and simulating car crashes to help improve car safety and reduce motor vehicle fatalities. Running complex simulations and analyzing mathematical models in the cloud makes high performance computing accessible to more scientists, engineers, and analysts because it costs a fraction of what running HPC on on-premises servers does.

Upgrading to a second-generation cloud HPC system brings faster, more powerful processors. “We expect to offer 30% better per-core performance for crash simulations, computational fluid dynamics, and electronic design automation workloads,” said Clay Magouyrk, executive vice president of Oracle Cloud Infrastructure, at an Oracle Live event focused on breakthroughs in cloud infrastructure and HPC. Oracle’s new level of performance for such processing-intensive workloads is possible in part because of decade-long partnerships with top innovators in the microprocessor world.
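To put the 30% figure in practical terms, here is a minimal back-of-the-envelope sketch in Python. It assumes a fully compute-bound job running on the same number of cores, which real workloads only approximate. The function names and the 1,000 core-hour baseline are hypothetical, chosen for illustration only. Under those assumptions, a 30% per-core gain shortens the run to about 77% of its original time: a saving of roughly 23%, not 30%.

```python
def runtime_ratio(per_core_gain: float) -> float:
    """Fraction of the original runtime left after a per-core speedup.

    per_core_gain is a fractional improvement, e.g. 0.30 for 30%.
    Assumption: the job is fully compute-bound and the core count is unchanged.
    """
    if per_core_gain <= -1.0:
        raise ValueError("per_core_gain must be greater than -1")
    return 1.0 / (1.0 + per_core_gain)


def core_hours(baseline_core_hours: float, per_core_gain: float) -> float:
    """Core-hours needed to finish the same job after the speedup."""
    return baseline_core_hours * runtime_ratio(per_core_gain)


if __name__ == "__main__":
    ratio = runtime_ratio(0.30)
    print(f"runtime ratio: {ratio:.3f}")
    print(f"time saved: {(1 - ratio) * 100:.1f}%")
    # Hypothetical baseline of 1,000 core-hours, for illustration only.
    print(f"core-hours needed: {core_hours(1000, 0.30):.0f}")
```

Because cloud HPC is typically billed by time used, the same ratio gives a rough idea of how the cost of a fixed job could change, all else being equal.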

Fig. 1. The future of high performance computing in the cloud

HPC delivers faster analytics and more throughput

The launch of Oracle Exadata X8M with Intel Optane persistent memory gave “Oracle’s customers 10 times faster analytics and two and a half times more throughput” compared to Oracle’s first-generation HPC, said Intel CEO Bob Swan during the Oracle Live event.

Today customers turn to Oracle and Intel to help them run performance-intensive HPC jobs on demand instead of having to buy fixed, on-premises capacity. “Oracle’s bare metal offering combined with Intel’s optimized Xeon silicon allows Oracle applications to handle far more users, operate on larger active datasets, [achieve] faster data processing, and respond in real time to complex queries,” Swan said.

The accelerated computing revolution

As businesses create ever-larger machine learning models to solve their toughest problems, the need for more compute, memory, and networking resources has kick-started a revolution in faster and more powerful graphics processor designs.

NVIDIA’s new A100 GPUs with Mellanox Direct Connect are now available with Oracle’s new HPC offering, giving customers accelerated computing on extremely large networked clusters of up to 512 GPUs.

But accelerated computing requires people to have immediate access to huge libraries of domain-specific software, explained Jensen Huang, founder and CEO of NVIDIA, during the Oracle Live event. NVIDIA’s CUDA-X AI provides software to build deep learning apps for a range of use cases. Whether it’s crash simulations, genomics processing, or deep learning and analytics, each process requires its own domain-specific software stack.

“We’ve created this registry in the cloud that has all of these really complex software stacks, perfectly tuned, updated, and all containerized, so that all you have to do is pull it into the OCI instance and spool up your machine,” Huang said.

All of this technology has been used to help build Oracle’s “cluster-scale” high performance computing platform, enabling companies of all sizes to run massive workloads in the cloud. Whether those workloads involve simulating drug molecules or running machine learning algorithms for a human resource application’s chatbot, high performance computing is now ready for the masses.

“We’re going to put this technology into the hands of enterprise customers all over the world,” Huang said. “It’s a great next adventure for us, and it’s really great to do it with Oracle.”

Compute-intensive and hyperscale processing for Arm-based workloads

Through its partnership with Ampere, Oracle will offer the chipmaker’s newly released Altra server processors—80-core CPUs designed specifically for Arm-based workloads that involve compute-intensive, hyperscale processing.

“Our 80-core approach with a single core per instance allows for a level of isolation and security that people are excited about,” said Renée James, founder, chairman, and CEO of Ampere Computing, at the Oracle Live event. James added that these new CPUs also allow for a level of density and scalability that’s oriented around how people sell, use, and develop cloud instances today.

Oracle’s platform brings together bare metal servers, virtual machines, containers, block storage, and a cluster network of more than 20,000 cores, with ultra-low latency of just two microseconds.

Higher performance computing with AMD and Oracle

To help customers meet their data processing and system memory requirements, Oracle started offering AMD’s new Milan processors in 2021. The new CPU shape costs the same as Oracle’s current AMD EPYC E3 shape but delivers much higher performance.

The Milan processor “is designed for the cloud and will provide unmatched computing performance and business value to power the world’s most important cloud workloads and services,” said Lisa Su, president and CEO of AMD, during the Oracle Live webcast.