Historically, GPUs have been the go-to for computer vision training, providing excellent performance for training different model types. But, GPU-optimized computing is not your only option for running computer vision models. State-of-the-art custom-designed CPUs such as the Intel “Ice Lake” component are promising alternatives to GPU-optimized computing.
In this guide, we compare the Intel Ice Lake c6i and Intel Sapphire Rapids R7iz Amazon Web Services (AWS) instance types against three common AWS GPU instances. In the video below, we walk through our findings. These findings are documented in more depth later in the post, accompanied by more information about the Ice Lake and Sapphire Rapids processors.
What is the Intel c6i “Ice Lake” CPU?
Amazon EC2 C6i (“Ice Lake”) instances are powered by 3rd Generation Intel Xeon Scalable processors, proven to deliver up to 15% better price performance than C5 instances for a wide variety of workloads. The Intel C6i AWS instance type is a compute-optimized instance offering that is designed to provide an excellent balance of compute resources and cost.
There are many features of the C6i instance that make it a compelling alternative for computer vision applications.
First, C6i instances feature a 2:1 ratio of memory to vCPU, similar to C5 instances. However, C6i instances support up to 128 vCPUs per instance, 33% more than C5 instances. This can give you faster performance on many compute-intensive applications, from training computer vision models to processing data.
C6i instances feature twice the networking bandwidth of C5 instances, making them an ideal fit for compute-intensive workloads. This includes batch processing, distributed analytics, high performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.
They were released into general availability in October of 2021 and are available in the following 9 sizes:
All C6i instances offer:
- Memory capacity: New larger sizes with up to 128 vCPUs and 256 GiB of memory that you can use to consolidate workloads on fewer instances.
- High storage capacity: Up to 7.6 TB of local NVMe-based SSD block-level storage, which makes for a great instance type for handling large datasets.
- EBS storage: up to 80 Gbps of Amazon Elastic Block Store (EBS) bandwidth.
- High local storage throughput: up to 2.1 GB/s.
- High network throughput: up to 200 Gbps of network bandwidth, up to 2x higher than comparable C5n instances.
- Enhanced efficiency & security: C6i instances are built on the AWS Nitro System, a combination of dedicated hardware and lightweight hypervisor. AWS Nitro delivers almost all of the compute and memory resources of the host hardware to your instances for better overall performance and security.
Below, we will compare the c6i.2xlarge instance against several of the most commonly used GPU instances. Our goal is to demonstrate the performance of Intel hardware for computer vision inference on CPU compared to GPU.
What is the Intel R7iz “Sapphire Rapids” CPU?
R7iz instances are the first EC2 instances powered by 4th generation Intel Xeon Scalable processors (code named Sapphire Rapids) with an all core turbo frequency up to 3.9 GHz.
Sapphire Rapids is designed to provide high-performance computing capabilities and is ideal for artificial intelligence, cloud computing, high-performance computing (HPC), data analytics, simulations, and other workloads requiring a combination of high compute performance and a large memory footprint.
These instances have the highest performance per vCPU among x86-based EC2 instances, and they deliver up to 20% higher performance than the older z1d instances. The instances are built on the AWS Nitro System, a combination of dedicated hardware and lightweight hypervisor that delivers practically all of the compute and memory resources of the host hardware to your instances for better overall performance and security.
For increased memory and scalability, R7iz instances are available in various sizes, including two bare metal sizes, with up to 128 vCPUs and up to 1,024 GiB of memory. R7iz instances are the first x86-based EC2 instances to use DDR5 memory and deliver up to 2.4x higher memory bandwidth than comparable high-frequency instances. They also deliver up to 50 Gbps of networking speed and 40 Gbps of Amazon Elastic Block Store (EBS) bandwidth.
In the world of large language models, Sapphire Rapids is ideal for fine-tuning your pre-trained transformer models.
These instances are presently in preview, and you can request access here.
Below, we will compare the Ice Lake c6i.2xlarge instance and the Sapphire Rapids r7iz.2xlarge instance against several of the most commonly used GPU instances. Our goal is to demonstrate the performance of Intel hardware for computer vision inference on CPU compared to GPU.
Testing Process
To ensure that we make fair comparisons, we used the parameters and methods documented below across all of our benchmarking experiments.
Single Inference Tests
First, we performed single inference tests on a single image with the following characteristics:
- A width of 393px and a height of 487px.
- One annotation file containing data for a class named “helmet”.
- Inference was performed on a hosted Roboflow endpoint using the “ROBOFLOW 2.0 OBJECT DETECTION (FAST)” model.
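A single-inference timing like the one described above can be sketched as follows. This is a minimal illustration, not the harness used for the benchmarks in this post: `time_single_inference` and `fake_infer` are hypothetical names, and `fake_infer` sleeps briefly instead of calling a real model. In practice, `infer` would wrap whatever call sends the image to your model endpoint.

```python
import time
from typing import Any, Callable, Tuple


def time_single_inference(
    infer: Callable[[bytes], Any], image: bytes
) -> Tuple[Any, float]:
    """Run one inference and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = infer(image)
    elapsed = time.perf_counter() - start
    return result, elapsed


def fake_infer(image: bytes) -> dict:
    """Hypothetical stand-in for a real model call; sleeps to simulate work."""
    time.sleep(0.01)
    return {"predictions": [{"class": "helmet"}]}


if __name__ == "__main__":
    result, seconds = time_single_inference(fake_infer, b"placeholder image bytes")
    classes = [p["class"] for p in result["predictions"]]
    print(f"classes: {classes}, latency: {seconds * 1000:.1f} ms")
```

Passing the inference call in as a function keeps the timing logic identical across instance types, so only the hardware under test changes between runs.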
Multiple Inference Tests
We then conducted multiple inference tests with the same 100 images across each instance. The testing dataset has the following characteristics:
- Images varied in size from ~400x400 to ~600x600 pixels
- The number of annotations in a file ranged from one to three objects.
- Inference was performed on a hosted Roboflow endpoint using the “ROBOFLOW 2.0 OBJECT DETECTION (FAST)” model.
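For a multi-image run like this one, per-image latencies are usually reduced to a few summary statistics. The sketch below shows one way to do that, assuming sequential requests; `benchmark` and `summarize` are hypothetical helper names, not the scripts used for this post.

```python
import statistics
import time
from typing import Callable, Dict, Iterable, List


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Reduce a list of per-image latencies (in seconds) to summary statistics."""
    if not latencies:
        raise ValueError("at least one latency measurement is required")
    total = sum(latencies)
    return {
        "count": len(latencies),
        "total_s": total,
        "mean_s": statistics.mean(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
        # Throughput for sequential requests; guard against a zero total.
        "images_per_s": len(latencies) / total if total > 0 else float("inf"),
    }


def benchmark(
    infer: Callable[[bytes], object], images: Iterable[bytes]
) -> Dict[str, float]:
    """Time `infer` on each image in turn and summarize the results."""
    latencies = []
    for image in images:
        start = time.perf_counter()
        infer(image)
        latencies.append(time.perf_counter() - start)
    return summarize(latencies)
```

Reporting the median alongside the mean helps when a few slow requests (for example, a cold start on the first image) would otherwise skew the average.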
We used the “ami-003f25e6e2d2db8f1” AWS GPU image for GPU testing. We used the “ami-0574da719dca65348” Ice Lake image for testing with the Intel Ice Lake CPU.
Findings
After completing our benchmarks using the aforementioned specifications, we arrived at the conclusions documented in the table below.
The numbers show that the c6i.2xlarge does not deliver the fastest inference speeds of the instances tested. However, the c6i.2xlarge instance provides the best cost-to-performance ratio. This instance can be an excellent workhorse for general computer vision inference needs.
When moving up to a more expensive instance, keep in mind that performance does not scale linearly with cost: each additional dollar spent buys a smaller improvement in inference speed.
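One way to make the cost-to-performance comparison concrete is to convert an hourly price and a measured latency into a cost per 1,000 inferences. The sketch below assumes a fully utilized instance handling requests one at a time; `cost_per_thousand` is a hypothetical helper, and the prices and latencies are placeholder values chosen for illustration, not measurements from this post or actual AWS pricing.

```python
def cost_per_thousand(hourly_price_usd: float, latency_s: float) -> float:
    """Cost in USD of 1,000 sequential inferences on a fully utilized instance."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    inferences_per_hour = 3600.0 / latency_s
    return hourly_price_usd / inferences_per_hour * 1000.0


if __name__ == "__main__":
    # Placeholder numbers only; substitute your own prices and latencies.
    instance_a = cost_per_thousand(hourly_price_usd=0.40, latency_s=0.10)
    instance_b = cost_per_thousand(hourly_price_usd=1.20, latency_s=0.05)
    print(f"instance A: ${instance_a:.4f} per 1,000 inferences")
    print(f"instance B: ${instance_b:.4f} per 1,000 inferences")
```

With these placeholder values, instance B halves the latency but triples the hourly price, so each inference costs 1.5 times as much as on instance A.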
Conclusion
The Intel “Ice Lake” CPU, available through the AWS c6i instance type, is a great alternative to NVIDIA GPUs for those looking for good performance at a reasonable price. As an AWS instance type, the c6i offers a great balance of price to performance without the added overhead of renting a GPU instance, and it comes in a standard range of instance sizes to meet your specific usage requirements.
The Intel Sapphire Rapids CPU offers even better inference acceleration than the Ice Lake CPU. Although these instances are in preview within AWS, I suspect they will come in at a similar price point to the Ice Lake instance, providing yet another great, lower-cost alternative to GPUs.
From an AWS management perspective, running CPU instances over GPU instances keeps things simpler and alleviates the common GPU availability issues that the most hotly demanded GPUs run into. But, if you need the highest possible inference speed, a GPU-based instance may be a more appropriate choice.
Cite this Post
Use the following entry to cite this post in your research:
Jay Lowe. (Dec 23, 2022). Intel Ice Lake and Sapphire Rapids on AWS. Roboflow Blog: https://blog.roboflow.com/aws-ice-lake-comparison/