c cpu gcc simd

Implementation of __builtin_clz

What is the implementation of GCC’s (4.6+) __builtin_clz? Does it correspond to some CPU instruction on Intel x86_64 (AVX)?

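It should translate to a Bit Scan Reverse (BSR) instruction followed by a subtraction. BSR gives the bit index of the most significant set bit, and subtracting that index from the word size minus one (31 for a 32-bit int) gives the number of leading zeros. GCC typically emits the subtraction as an XOR with 31, which is equivalent for indices in the range 0 to 31. Note that the result is undefined for an input of 0, both for BSR and for __builtin_clz.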
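To make the arithmetic concrete, here is a minimal sketch in portable C of the relationship between the BSR index and the leading-zero count for a 32-bit value. The helper names bsr32_emulated and clz32_from_bsr are made up for illustration; this is not GCC's actual implementation, only a model of what the BSR-plus-subtract sequence computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the most significant set bit,
   i.e. the value BSR would produce for a non-zero 32-bit input. */
static unsigned bsr32_emulated(uint32_t x)
{
    assert(x != 0);          /* BSR leaves its result undefined for 0 */
    unsigned index = 0;
    while (x >>= 1)          /* shift until only the top set bit is gone */
        ++index;
    return index;
}

/* Hypothetical helper: leading-zero count derived from the BSR index. */
static unsigned clz32_from_bsr(uint32_t x)
{
    /* 31 - index equals 31 ^ index for any index in 0..31 */
    return 31u - bsr32_emulated(x);
}
```

For any non-zero x, clz32_from_bsr(x) should agree with __builtin_clz(x) on a target where unsigned int is 32 bits wide.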

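Edit: if your CPU supports LZCNT (Leading Zero Count), GCC can use that single instruction instead when it is enabled for the target (for example with -mlzcnt in newer GCC versions, or a suitable -march setting), but not all x86-64 chips have that instruction.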
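One practical difference is that LZCNT is defined for a zero input and returns the operand width, whereas __builtin_clz(0) is undefined. If you need that behaviour regardless of which instruction the compiler picks, a small wrapper along these lines may help. The name lzcnt32_portable is hypothetical, and the sketch assumes a 32-bit unsigned int.

```c
#include <stdint.h>

/* Hypothetical wrapper: LZCNT-style count that is also defined for 0. */
static unsigned lzcnt32_portable(uint32_t x)
{
    /* __builtin_clz(0) is undefined, so handle zero explicitly;
       LZCNT itself returns the operand width (32) in that case. */
    return x ? (unsigned)__builtin_clz(x) : 32u;
}
```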