Dedicated processing: the next wave of acceleration


As Big Data grows exponentially, our ability to handle complex workloads diminishes. Jonathan Friedmann, co-founder and CEO of Speedata, shares some of the most common workloads that processors run and the hardware needed to accelerate them.

2.5 quintillion bytes of data are generated daily, and estimates suggest that big data will continue to grow by 23% per year. This trend has permeated nearly every corner of the economy: airlines, banks, insurance companies, government institutions, hospitals, and telecommunications companies have all embraced big data analytics to improve business intelligence, promote growth, and streamline efficiency.

As Big Data keeps growing, the tools used to analyze all of this data need to scale with it. However, the computer chips currently used to handle large or complex workloads are not up to the task: they require so many resources that the costs outweigh the benefits, and computing efficiency suffers.

Therefore, despite all its benefits, the data explosion creates multiple challenges for the high-tech industry. The key to overcoming these challenges is to strengthen processing power from every angle.

To do this, a wave of specialized, domain-specific accelerators has been developed to offload workloads from the CPU, the traditional workhorse of computer chips. These “alternative” accelerators are designed for specific tasks, trading the flexibility and general-purpose capabilities of standard CPU computing for better performance on those designated tasks.

The following is a short guide to some of the major acceleration areas and their corresponding accelerators.

Hardware for AI and ML workloads

Artificial intelligence is changing the way we compute and, therefore, the way we live. But early AI applications were forced to run on CPUs, which were much better suited to single-threaded tasks and certainly not designed for the massively parallel processing demanded by AI.

Enter: Graphics Processing Units (GPUs).

GPUs were born in the gaming industry to accelerate graphics workloads. A single GPU combines many specialized cores that work in tandem, allowing it to support parallel programs with simple control flow. This is perfect for graphics workloads such as computer games, whose images contain millions of pixels that need to be computed in parallel and independently of one another. Processing these pixels also requires floating-point vector multiplications, which the GPU was designed to handle extremely well.

The discovery that GPUs could also be used to process AI workloads opened new horizons for AI data management. Although the application is very different from graphics workloads, AI/Machine Learning (ML) workloads have, in many ways, similar computational requirements, requiring efficient multiplication of floating-point matrices. Over the past decade, as AI and ML workloads have skyrocketed, GPUs have undergone substantial upgrades to better meet this growing demand.
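To make the shared computational pattern concrete, here is a minimal sketch in plain Python (not GPU code). It assumes small matrices stored as lists of rows, and the function names `output_cell` and `matmul` are invented for illustration. The point it shows is that every cell of a matrix product depends only on one row and one column of the inputs, so all cells can be computed independently, much like the pixels of an image.

```python
# Illustrative sketch only: plain Python, not GPU code.
# Each output cell of a matrix product depends only on one row of `a`
# and one column of `b`, so all cells can be computed independently.

def output_cell(a, b, row, col):
    """Dot product of one row of `a` with one column of `b`."""
    return sum(a[row][k] * b[k][col] for k in range(len(b)))

def matmul(a, b):
    """Multiply two matrices given as lists of rows of floats."""
    rows, cols = len(a), len(b[0])
    # On a GPU, each (row, col) pair would typically be handled by its own
    # thread; here the cells are simply evaluated one after another.
    return [[output_cell(a, b, r, c) for c in range(cols)] for r in range(rows)]

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
product = matmul(a, b)
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
```

Because no cell needs the result of any other cell, hardware with many simple cores can work on all of them at once, which is the property both graphics and AI/ML workloads exploit.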

Later, companies developed dedicated application-specific integrated circuits (ASICs) to handle this heavy workload, aiming to usher in a second wave of AI acceleration. ASICs at the forefront of AI acceleration include the TPU, Google’s Tensor Processing Unit, used primarily for inference; the IPU, Graphcore’s Intelligence Processing Unit; and the RDU, SambaNova’s Reconfigurable Dataflow Unit.

Data processing workloads

Data Processing Units (DPUs) are essentially Network Interface Controllers (NICs), the hardware that connects a given device to the digital network. These ASICs are explicitly designed to offload network protocol functions from the CPU, along with higher-layer processing such as encryption or storage-related operations.

Various companies have developed DPUs, including Mellanox, acquired by Nvidia, and Pensando, acquired by AMD. Although their architectures vary and the exact network protocols each one offloads differ, all DPU variants share the same end goal: speeding up data processing and offloading network protocol handling from the CPU.

While Intel’s DPU has been given its own acronym, IPU (Infrastructure Processing Unit), it belongs to the DPU family. The IPU is designed to improve data center efficiency by offloading functions that would traditionally have been performed on a CPU, such as network control, storage management, and security.

Big data analytics

It is in databases and data analytics that big data truly produces actionable insights. As with the workloads above, CPUs have long been the default choice here. But as the scale of data analytics workloads continues to grow, CPUs have become increasingly inefficient at handling them.

Big data analytics workloads have many unique characteristics, including data structure and format, data encoding, and types of processing operators, as well as requirements for intermediate storage, I/O, and memory. This allows a dedicated accelerator ASIC that is optimized for workloads with these specific characteristics to provide significant acceleration at a lower cost than traditional processors. Despite this potential, no chip has emerged over the past decade as the natural successor to the CPU for analytics workloads. Bottom line: until now, big data analytics has been poorly served by dedicated accelerators.

Analytical workloads are typically programmed in Structured Query Language (SQL), but other high-level languages are also very common. The analytical engines that handle such workloads are numerous and include open-source engines like Spark and Presto, as well as managed services like Databricks, Redshift, and BigQuery.
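As a rough illustration of what such a SQL workload looks like, the sketch below runs a small aggregation query using Python's built-in `sqlite3` module. SQLite merely stands in for the engines named above, and the `flights` table, its columns, and its rows are hypothetical, invented for this example. It is not a description of how any particular accelerator processes queries.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (airline TEXT, delay_minutes INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 5), ("B", 0), ("B", 25)],
)

# A typical analytical pattern: scan rows, filter, group, aggregate, sort.
query = """
    SELECT airline, COUNT(*) AS delayed_flights, AVG(delay_minutes) AS avg_delay
    FROM flights
    WHERE delay_minutes > 0
    GROUP BY airline
    ORDER BY airline
"""
results = conn.execute(query).fetchall()
conn.close()
print(results)  # [('A', 2, 20.0), ('B', 2, 15.0)]
```

At production scale the same operators (filter, group, aggregate, sort) run over billions of rows, which is where the data format, encoding, and I/O characteristics described above start to dominate cost.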

Speedata created an Analytical Processing Unit (APU) to accelerate analytical workloads. With the explosion of data, insights from these emerging tools have the potential to unlock incredible value across industries.


Respect the process

There is no single solution for all of today’s computing needs.

Instead, the once-ubiquitous CPU is evolving into a “system controller” that hands off complex workloads – data analytics, AI/ML, graphics, video processing, etc. – to specialized units and accelerators.

Enterprises, in turn, are equipping their data centers with processing units strategically tailored to their workload needs. This increased level of customization will not only improve data center efficiency and effectiveness, but will also minimize costs, cut energy consumption, and reduce real estate requirements.

For analysis, faster processing will also yield more insights from a larger amount of data, opening up new opportunities. With more processing options and new opportunities, the era of Big Data is just beginning.

