Ivica Bogosavljević, Author at Johnny's Software Lab

Exposing More Parallelism Is the Hidden Reason Why Some Vectorized Loops Are Faster – Not Vectorization per se

February 26, 2026 | Ivica Bogosavljević | Low Level Performance, Performance, Vectorization | 2 Replies

I was preparing an article about Highway, a portable vectorization library by Google, so I ported a few examples from my vectorization workshop from AVX to Highway. One of the examples was vectorized binary search. I assume most readers are familiar with simple binary search. It looks something like this: We take a lookup…

Floating-Point Error Handling in C++: What Actually Works

January 31, 2026 (updated February 18, 2026) | Ivica Bogosavljević | C++ Performance, Help the Compiler, Performance

Floating-point errors are unavoidable, but how you detect and handle them can make the difference between clean, high-performance C++ code and a debugging nightmare. In this article, we explore the practical techniques for handling NaNs, infinities, and other FP errors — from manual checks to sticky bits and hardware traps — and reveal which approaches actually work without sabotaging performance.
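To make the "sticky bits" idea concrete, here is a minimal sketch using the standard `<cfenv>` interface. It is not taken from the article: the function name `sqrt_sum_raised_fp_error` is hypothetical, and the sketch assumes the compiler does not reorder or constant-fold the floating-point work around the flag accesses, which in practice can depend on optimization level and flags.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: sums the square roots of n inputs and reports whether
// any "serious" floating-point exception flag was raised along the way.
// The flags are sticky: once set by an operation, they stay set until
// cleared, so a single check after the loop covers every iteration.
bool sqrt_sum_raised_fp_error(const double* data, int n, double* out) {
    std::feclearexcept(FE_ALL_EXCEPT);      // start from a clean state

    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += std::sqrt(data[i]);          // sqrt of a negative raises FE_INVALID
    }
    *out = sum;

    // One test at the end instead of an isnan() check per element.
    // FE_INEXACT is deliberately ignored: rounding is normal, not an error.
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW) != 0;
}
```

Calling this on an array that contains a negative value should return `true` and leave a NaN in `*out`, while an array of non-negative values should return `false`. The manual alternative would be a `std::isnan` check inside the loop, which adds a branch to every iteration.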

Deep Dive in Java vs C++ Performance

November 30, 2025 (updated December 6, 2025) | Ivica Bogosavljević | Performance

For most of my career I lived in the world of C and C++, and I honestly believed that these languages were the pinnacle of software performance. But two months ago I started working at Azul, the maker of a low-latency Java compiler, and I had an opportunity to dive deep into Java performance. And it…

9 Things Every Fresh Graduate Should Know About Software Performance

September 20, 2025 (updated February 17, 2026) | Ivica Bogosavljević | 2 Minute Reads, Performance

At Johnny’s Software Lab we’ve spent a lot of time deep-diving into advanced performance topics — vectorization, cache hierarchies, memory bandwidth, you name it. But not everyone is ready to jump straight into assembly listings and microarchitectural details. This post is for the beginners. For the fresh graduates and junior developers who are just starting…

The messy reality of SIMD (vector) functions

July 4, 2025 (updated September 7, 2025) | Ivica Bogosavljević | Performance, Toolchain and Performance, Vectorization

We’ve discussed SIMD and vectorization extensively on this blog, and it was only a matter of time before SIMD (or vector) functions came up. In this post, we explore what SIMD functions are, when they are useful, and how to declare and use them effectively. A SIMD function is a function that processes more than…
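As a rough illustration of what declaring and using a SIMD function can look like, here is a minimal sketch based on OpenMP's `declare simd` directive. The function names `scale_and_clamp` and `scale_and_clamp_all` are hypothetical and not from the article. The sketch assumes a compiler with OpenMP SIMD support enabled (for example `-fopenmp-simd` on GCC or Clang); without it the pragmas are ignored and the code simply runs as scalar.

```cpp
#include <cstddef>

// Ask the compiler to also generate vector variants of this function, so
// that a vectorized loop can call it on several elements at once.
// uniform(factor) says that factor has the same value in every SIMD lane.
#pragma omp declare simd uniform(factor)
float scale_and_clamp(float x, float factor) {
    float y = x * factor;
    return y > 1.0f ? 1.0f : y;   // clamp the result to at most 1.0
}

// Hypothetical caller: the simd pragma asks the compiler to vectorize the
// loop, and the vector variant of scale_and_clamp can be used inside it.
void scale_and_clamp_all(float* out, const float* in, std::size_t n, float factor) {
    #pragma omp simd
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = scale_and_clamp(in[i], factor);
    }
}
```

Whether the compiler actually picks the vector variant depends on the target ISA, the compiler version, and whether the declaration is visible at the call site, so inspecting the generated assembly is the only reliable check.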

An optimizing compiler doesn’t help much with long instruction dependencies

May 31, 2025 (updated July 2, 2025) | Ivica Bogosavljević | 2 Minute Reads, Memory Subsystem Performance, Performance, Toolchain and Performance | 1 Reply

Does it matter if we are compiling with optimizations off (O0) or optimizations on (O3) if the problem is memory bound? Let’s find out…

Growing Buffers to Avoid Copying Data

March 31, 2025 | Ivica Bogosavljević | Standard Library and Performance | 1 Reply

Copying data can be expensive in some cases, especially since it doesn’t change the data, it just moves it. Therefore we, engineers interested in performance, want to avoid copying data as much as possible. We already talked about avoiding data copying in C++ earlier. In that post, we talked about what mechanism C++ has…

Performance Debugging with llvm-mca: Simulating the CPU!

January 31, 2025 | Ivica Bogosavljević | Performance, Performance Analysis Tools | 3 Replies

We debug our performance problem by simulating it with llvm-mca!

FIYA – Flamegraphs in Your App

December 31, 2024 (updated February 6, 2026) | Ivica Bogosavljević | Debugging

Flamegraphs are a great way to visualize resource consumption in your program, and I am a big fan of them (I have written about them on two occasions, here and here). My biggest concern with flamegraphs is when the tooling is bad or missing: to create flamegraphs, you need to have a good profiler and a binary…

Memory Subsystem Optimizations – The Remaining Topics

October 31, 2024 (updated November 13, 2024) | Ivica Bogosavljević | Low Level Performance, Memory Subsystem Performance, Performance

This is the last memory optimization that we are covering in this blog. You can see the full list of all memory subsystem optimizations that we covered earlier here. Definitely a read for anyone who is trying to improve the performance of memory-intensive software. In this post, we are covering a few remaining optimization techniques…
