At Intel Innovation 2022, QCT and Intel showcased their workload optimization work. It wasn’t the flashiest demo, but after going back through our notes, its importance became clearer. For next-generation platforms, as well as those deployed today, poorly optimized software has a huge impact on both performance and power consumption.
QCT on Application Profiling at Intel Innovation 2022
Here is QCT’s slide on profiling and tuning WRF. WRF (the Weather Research and Forecasting model), for those who don’t know, is a weather simulation tool. It is a well-known HPC workload, and some clusters are dedicated to running it. What the slide shows is that the goal is to optimize by running the software and observing how it behaves.
This brings us to the question of how QCT does this with its customers. Something we didn’t cover much during Intel Innovation 2022, but that was interesting, was the talk by Brendan Gregg (formerly of Netflix’s performance engineering team). There, he showed CPU flame graphs as a way to see how much execution time is spent in different parts of the stack.
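To make the idea concrete, here is a minimal sketch of the aggregation step behind a flame graph: sampled call stacks are collapsed into "folded" lines of the form `frame;frame;frame count`, and the width of each box in the graph is proportional to that count. The sample stacks and function names below are hypothetical and are not taken from WRF or from QCT's slides.

```python
from collections import Counter

def fold_stacks(samples):
    """Collapse sampled call stacks into 'frame;frame;frame count' lines."""
    # Identical stacks are counted together; the count becomes the box width.
    counts = Counter(";".join(stack) for stack in samples)
    # Sort so the output is stable from run to run.
    return [f"{stack} {n}" for stack, n in sorted(counts.items())]

# Hypothetical samples: each list is one sampled stack, root first, leaf last.
samples = [
    ["main", "solve", "advect"],
    ["main", "solve", "advect"],
    ["main", "solve", "physics"],
    ["main", "write_output"],
]

folded = fold_stacks(samples)
for line in folded:
    print(line)
```

In this made-up example, half of the samples land in `advect`, so that frame would be the widest box at the top of the graph and the first place to look.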
Intel has extended this beyond on-CPU time to include off-CPU flame graphs. HPC workloads often involve transferring data from one node to another, so a cluster takes a performance hit while it waits on things like data movement.
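As a rough illustration of what "off-CPU" means, the sketch below compares wall-clock time with CPU time for a single call. The difference approximates the time spent blocked rather than computing. This is only a simplified, single-threaded stand-in: a real off-CPU profiler attributes blocked time to individual stacks, and `wait_for_data` is a hypothetical placeholder for a node-to-node transfer.

```python
import time

def measure(fn):
    """Return (wall_seconds, cpu_seconds, off_cpu_seconds) for one call to fn."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    # Time not spent on the CPU is treated as waiting (I/O, network, sleep).
    return wall, cpu, max(wall - cpu, 0.0)

def wait_for_data():
    # Hypothetical stand-in for blocking on a transfer from another node.
    time.sleep(0.05)

wall, cpu, off_cpu = measure(wait_for_data)
print(f"wall={wall:.3f}s cpu={cpu:.3f}s off-CPU={off_cpu:.3f}s")
```

Here almost all of the elapsed time is off-CPU, which is the pattern a CPU-only flame graph would miss entirely.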
Intel also has CPI flame graphs. If you look at the dark blue bar below, it is a process waiting for I/O.
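CPI is cycles per instruction: CPU cycles divided by instructions retired. A high CPI generally suggests that cycles are being spent stalled rather than doing useful work. A minimal sketch of the arithmetic, using made-up counter values and hypothetical function names rather than measured data:

```python
def cpi(cycles, instructions):
    """Cycles per instruction; higher values suggest more stalled cycles."""
    if instructions == 0:
        raise ValueError("instructions must be non-zero")
    return cycles / instructions

# Hypothetical hardware counter readings per function (not measured data).
counters = {
    "advect": {"cycles": 8_000_000, "instructions": 10_000_000},
    "halo_exchange": {"cycles": 9_000_000, "instructions": 2_000_000},
}

cpi_by_func = {
    name: cpi(c["cycles"], c["instructions"]) for name, c in counters.items()
}
worst = max(cpi_by_func, key=cpi_by_func.get)
print(cpi_by_func, "highest CPI:", worst)
```

With these numbers, both functions consume a similar number of cycles, but `halo_exchange` retires far fewer instructions, which is the kind of difference a CPI view is meant to surface.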
Earlier this year, Intel acquired a company called Granulate for a reported $650 million. We noticed that QCT uses Intel Granulate and gProfiler with its customers to find these performance bottlenecks.
If you use gProfiler, you will see heavy use of flame graphs.
These flame graphs then feed two different optimization paths. The first is code optimization, which is what Intel’s keynote at Innovation 2022 focused on. The second is on the QCT side, where it works with customers to use that data to tune systems and clusters.
That tuning can cover everything from the libraries used to basic BIOS settings, kernel settings, and cluster hardware.
At trade shows, WRF simulation output is often easy to understand since it usually involves a weather forecast and a map. The hardware is also easy to observe because it is in plain sight.
What we don’t focus on as much, and perhaps should, are tools like Intel Granulate’s gProfiler and the efforts by companies like QCT to apply them. Very large-scale customers often have sophisticated in-house tools that allow them to do this work. Now smaller organizations have access to this kind of software, and it looks like QCT is offering it as a service.