Intel Optimization is bad@$$

2 min read · Aug 12, 2022

If you’ve never seen the Intel Software Developer Manuals (4800+ pages), it’s about time.

Optimization is not for everybody. When I was building my programming skills in college, I had to really understand memory usage and spent hours on algorithm problems just to make things run. Today, I often see lots of code written in Python, with optimization as an afterthought. In the world of AI/ML, it gets much worse!

The real question now is … why? To get started, let’s see why it’s so hard to optimize anything.

Why is optimization so hard?

It’s hard because there are so many ways (to go wrong)!

1. There are many ways to make things faster, in both software and hardware!

2. It takes lots of trial and error! You profile, you test, and not every solution scales.

3. You don’t get paid. That’s right. Often, you don’t get paid for optimizing code. It’s often very difficult to justify unless you have a working product in the market already.

Optimization is often not the top priority.
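To make the trial-and-error loop from point 2 concrete, here is a minimal sketch using Python's standard `timeit` module. The three `sum_squares_*` functions are hypothetical stand-ins for "the same job done three ways"; they are not from any Intel tool. The idea is simply to check that every candidate gives the same answer before trusting its timing.

```python
import timeit


def sum_squares_loop(n):
    """Baseline: explicit Python loop."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def sum_squares_builtin(n):
    """Same work, pushed into the built-in sum()."""
    return sum(i * i for i in range(n))


def sum_squares_formula(n):
    """Algorithmic change: closed form of 0^2 + 1^2 + ... + (n-1)^2."""
    return (n - 1) * n * (2 * n - 1) // 6


N = 100_000  # illustrative problem size (an assumption)
candidates = [sum_squares_loop, sum_squares_builtin, sum_squares_formula]
expected = sum_squares_loop(N)

timings = {}
for fn in candidates:
    # Test first: a fast wrong answer is not an optimization.
    assert fn(N) == expected
    # Take the best of several repeats to reduce measurement noise.
    timings[fn.__name__] = min(timeit.repeat(lambda: fn(N), number=5, repeat=3))

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {seconds:.6f} s for 5 calls")
```

The numbers will differ from machine to machine, which is exactly why measuring beats guessing.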

So, if optimization is so hard, what’s the best way?

1. Learn the options. Many people do not realize what their hardware can do and assume it is impossible to get faster with what they already have. For example, AVX-512 and other instruction-set extensions can make a world of difference: https://www.intel.com/content/dam/www/public/us/en/documents/product-overviews/dl-boost-product-overview.pdf
2. Use the right tool! How many times do you really need to rewrite your code? The short answer is ‘never’ if you pick the right tool and ecosystem! If you start with OpenVINO, you can switch between different hardware targets without rewriting your code from scratch.
3. Choose the right data format. Yes, it’s sometimes as simple as picking the right precision for the job. Quantization with mixed precision can be a huge performance booster!
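To give a feel for what "picking the right precision" means, here is a minimal pure-Python sketch of symmetric int8 quantization. It is an illustration under stated assumptions, not how OpenVINO or any Intel library implements it: the function names, the single per-list scale, and the sample weights are all made up for this example.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumed scheme: symmetric, one scale for the whole list.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the int8 codes."""
    return [q * scale for q in quantized]


weights = [0.82, -1.27, 0.05, 0.4, -0.63]  # hypothetical float32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error per value is at most half of one quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print("int8 codes:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each value now fits in 8 bits instead of 32, at the cost of a small, bounded rounding error. Whether that trade is acceptable depends on the model, which is why mixed precision keeps the sensitive parts at higher precision.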

Lower Numerical Precision Deep Learning Inference and Training

Most commercial deep learning applications today use 32-bits of floating point precision for training and inference…

www.intel.com

As promised, if you are really into optimizing and tuning things to the limit (like a car tuner), it’s always great to read these 4800 pages. =) Just kidding.

Intel® 64 and IA-32 Architectures Software Developer Manuals

These manuals describe the architecture and programming environment of the Intel® 64 and IA-32 architectures…

Written by Raymond Lo, PhD
