QCT Workload Optimization at Intel Innovation 2022


At Intel Innovation 2022, QCT and Intel showcased their workload optimization work. It wasn’t the flashiest demo, but going back through our notes, its importance became clearer. For next-generation platforms, as well as those deployed today, poorly optimized software carries a large cost in both performance and power consumption.

QCT on Application Profiling at Intel Innovation 2022

Here is QCT’s slide on profiling and tuning WRF. WRF (Weather Research and Forecasting), for those who don’t know, is a weather simulation model. It is a well-known HPC workload, and it often has dedicated clusters to run it. The slide shows that the approach is to optimize by running the software and observing how it behaves.

QCT 1 WRF Tuning Optimization Process

This brings us to the question of how QCT does this with its customers. Something we didn’t cover much during Intel Innovation 2022, but which was interesting, was the talk by Brendan Gregg (who previously worked on performance at Netflix). There, he showed CPU flame graphs as a way to see how execution time is spent across different parts of the stack.

Intel CPU Flame Graphs Intel Innovation 2022 Keynote Day 2
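To make the idea concrete, here is a minimal Python sketch of the bookkeeping behind a CPU flame graph: stack samples are folded into semicolon-joined call paths, and the width of each box is the share of samples that pass through that path. The function names (`fold_stacks`, `frame_widths`) and the sample stacks are hypothetical and only for illustration. This is not how Intel’s or QCT’s tooling is implemented.

```python
from collections import Counter


def fold_stacks(samples):
    """Fold stack samples (root first, leaf last) into 'a;b;c' -> sample count."""
    folded = Counter()
    for stack in samples:
        folded[";".join(stack)] += 1
    return folded


def frame_widths(folded):
    """Return the fraction of all samples that pass through each call path.

    In a flame graph, this fraction is the width of the box for that path.
    """
    total = sum(folded.values())
    through = Counter()
    for path, count in folded.items():
        frames = path.split(";")
        # Every prefix of the path is on-stack for this sample.
        for depth in range(1, len(frames) + 1):
            through[";".join(frames[:depth])] += count
    return {path: count / total for path, count in through.items()}


# Hypothetical samples, as a sampling profiler might collect them.
samples = [
    ["main", "solve", "advect"],
    ["main", "solve", "advect"],
    ["main", "solve", "microphysics"],
    ["main", "write_output"],
]

folded = fold_stacks(samples)
widths = frame_widths(folded)
for path in sorted(widths):
    print(f"{widths[path]:6.1%}  {path}")
```

The widest boxes at the top of the graph are the functions where the CPU is actually spending its time, which is where tuning effort is most likely to pay off.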

Intel has extended this beyond time spent on the CPU to off-CPU flame graphs as well. HPC workloads often involve transferring data from one node to another, so a cluster loses performance while processes wait on things like data movement.

Intel Off CPU Flame Graphs Intel Innovation 2022 Keynote Day 2
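The off-CPU view can be sketched the same way. Instead of counting samples, the sketch below adds up how long each thread was blocked, keyed by the stack it blocked in. The event format, the `off_cpu_time` function, and the timestamps are all assumptions made for this example. Real tools typically gather this information by tracing the kernel scheduler.

```python
from collections import defaultdict


def off_cpu_time(events):
    """Sum blocked time per call path.

    `events` is an iterable of (timestamp_us, thread_id, kind, stack) tuples,
    ordered by timestamp, where kind is "off" when the thread leaves the CPU
    and "on" when it is scheduled again. The stack is only used for "off".
    """
    blocked_at = {}            # thread_id -> (start timestamp, blocking stack)
    totals = defaultdict(int)  # 'a;b;c' -> microseconds spent off-CPU
    for ts, tid, kind, stack in events:
        if kind == "off":
            blocked_at[tid] = (ts, stack)
        elif kind == "on" and tid in blocked_at:
            start, blocked_stack = blocked_at.pop(tid)
            totals[";".join(blocked_stack)] += ts - start
    return dict(totals)


# Hypothetical trace: thread 1 waits on a halo exchange, thread 2 on disk.
events = [
    (100, 1, "off", ["main", "halo_exchange", "mpi_wait"]),
    (150, 2, "off", ["main", "write_output", "fsync"]),
    (250, 2, "on", None),
    (400, 1, "on", None),
    (500, 1, "off", ["main", "halo_exchange", "mpi_wait"]),
    (700, 1, "on", None),
]

blocked = off_cpu_time(events)
for path, usec in sorted(blocked.items(), key=lambda kv: -kv[1]):
    print(f"{usec:6d} us  {path}")
```

In this made-up trace, most of the waiting is in the node-to-node exchange rather than in computation, which is exactly the kind of cost an on-CPU profile would not show.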

Intel also has CPI (cycles per instruction) flame graphs. If you look at the dark blue bar below, it’s a process waiting for I/O.

Intel CPI Flame Graphs Intel Innovation 2022 Keynote Day 2
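CPI itself is simple arithmetic: cycles divided by instructions retired over the same interval. The short sketch below computes it per function from counter values. The function names and counter numbers are invented for illustration and are not measurements from WRF or from Intel’s demo.

```python
def cpi(cycles, instructions):
    """Cycles per instruction for one function or one sampling interval."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


# Hypothetical (cycles, instructions) counter pairs per function.
counters = {
    "advect": (8_000_000, 4_000_000),
    "halo_exchange": (9_000_000, 1_500_000),
}

cpi_by_function = {name: cpi(c, i) for name, (c, i) in counters.items()}
for name, value in sorted(cpi_by_function.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} CPI = {value:.1f}")
```

A higher CPI means more cycles are burned per unit of useful work, which usually points to stalls rather than computation. A CPI flame graph uses color to show this on top of the usual width-by-time layout.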

Earlier this year, Intel acquired a company called Granulate for a reported $650 million. We saw that QCT uses Intel Granulate’s gProfiler with its customers to find these performance bottlenecks.

Intel Granulate QCT Tuning Optimization Process

If you use gProfiler, you will see heavy use of flame graphs.

Granulate.io GProfiler Example

These flame graphs then feed two different optimization paths. The first is code optimization, which is what Intel’s keynote at Innovation 2022 focused on. The second is on the QCT side, where it works with customers to use that data to tune systems and clusters.

QCT setting optimization levers

That tuning can cover everything from the libraries used to basic BIOS settings, kernel settings, and the cluster hardware itself.

Final Words

When we are at trade shows, the WRF simulation output is often easy to understand since it usually involves a weather forecast and a map. The hardware is also easy to notice because it is in plain sight.

QCT QuantaGrid D54Q 2U Liquid Cooled Intel Innovation 2022 2

What we don’t focus on as much, and perhaps should, are tools like Intel Granulate’s gProfiler and the efforts of companies like QCT to apply them. Very large-scale customers often have sophisticated tools of their own for this work. Now smaller organizations have access to this kind of software, and it looks like QCT is offering it as a service.
