Weird bitset shift offset in zend_alloc

  104592
March 6, 2019 00:27 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

I've been working on running PHP with undefined behavior sanitizer
(http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html) and I've
encountered a weird error while running PHP:

/src/php-src/Zend/zend_alloc.c:585:9: runtime error: shift exponent 138
is too large for 64-bit type 'zend_mm_bitset' (aka 'unsigned long')
    #0 0x86dada in zend_mm_bitset_is_set
/src/php-src/Zend/zend_alloc.c:585:9
    #1 0x86dada in zend_mm_bitset_is_free_range
/src/php-src/Zend/zend_alloc.c:665
    #2 0x86dada in zend_mm_realloc_heap /src/php-src/Zend/zend_alloc.c:1670
    #3 0x86dada in _erealloc2 /src/php-src/Zend/zend_alloc.c:2577

Looks like the code is doing it intentionally:

/* x86 instructions BT, SHL, SHR don't require masking */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) ||
defined(ZEND_WIN32)
# define ZEND_BIT_TEST(bits, bit) (((bits)[(bit) /
(sizeof((bits)[0])*8)] >> (bit)) & 1)
#else
# define ZEND_BIT_TEST(bits, bit) (((bits)[(bit) /
(sizeof((bits)[0])*8)] >> ((bit) & (sizeof((bits)[0])*8-1))) & 1)
#endif

But I'm not sure how it's supposed to work. Is it correct that on GCC
(and clang, presumably, since it defines __GNUC__) accept long bitshifts
and do the right thing with argument like 138? Is it documented
anywhere? Or is there a bug here?

Thanks,
-- 
Stas Malyshev
smalyshev@gmail.com
  104595
March 6, 2019 08:09 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Mar 6, 2019 at 1:28 AM Stanislav Malyshev <smalyshev@gmail.com>
wrote:

> Hi! > > I've been working on running PHP with undefined behavior sanitizer > (http://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html) and I've > encountered a weird error while running PHP: > > /src/php-src/Zend/zend_alloc.c:585:9: runtime error: shift exponent 138 > is too large for 64-bit type 'zend_mm_bitset' (aka 'unsigned long') > #0 0x86dada in zend_mm_bitset_is_set > /src/php-src/Zend/zend_alloc.c:585:9 > #1 0x86dada in zend_mm_bitset_is_free_range > /src/php-src/Zend/zend_alloc.c:665 > #2 0x86dada in zend_mm_realloc_heap /src/php-src/Zend/zend_alloc.c:1670 > #3 0x86dada in _erealloc2 /src/php-src/Zend/zend_alloc.c:2577 > > Looks like the code is doing it intentionally: > > /* x86 instructions BT, SHL, SHR don't require masking */ > #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) || > defined(ZEND_WIN32) > # define ZEND_BIT_TEST(bits, bit) (((bits)[(bit) / > (sizeof((bits)[0])*8)] >> (bit)) & 1) > #else > # define ZEND_BIT_TEST(bits, bit) (((bits)[(bit) / > (sizeof((bits)[0])*8)] >> ((bit) & (sizeof((bits)[0])*8-1))) & 1) > #endif > > But I'm not sure how it's supposed to work. Is it correct that on GCC > (and clang, presumably, since it defines __GNUC__) accept long bitshifts > and do the right thing with argument like 138? Is it documented > anywhere? Or is there a bug here? >
This is a bug, yes. Oversize shifts are UB, and the only thing preventing this from being miscompiled is the fact that the compiler cannot figure out that the shift is oversized. I'm not sure why this code was introduced, as the compiler should generally be able to eliminate this masking if it is unnecessary. See for example these isel patterns in clang: https://github.com/llvm-mirror/llvm/blob/46b09a3368af1be5005d31fd1d70bad08df352f9/lib/Target/X86/X86InstrCompiler.td#L1753 Nikita
  104596
March 6, 2019 08:14 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> But I'm not sure how it's supposed to work. Is it correct that on GCC > (and clang, presumably, since it defines __GNUC__) accept long bitshifts > and do the right thing with argument like 138? Is it documented > anywhere? Or is there a bug here? > > > This is a bug, yes. Oversize shifts are UB, and the only thing > preventing this from being miscompiled is the fact that the compiler > cannot figure out that the shift is oversized. > > I'm not sure why this code was introduced, as the compiler should > generally be able to eliminate this masking if it is unnecessary. See > for example these isel patterns in clang: > https://github.com/llvm-mirror/llvm/blob/46b09a3368af1be5005d31fd1d70bad08df352f9/lib/Target/X86/X86InstrCompiler.td#L1753
This was introduced by Dmitry in https://github.com/php/php-src/commit/4ad9cf460595efd1151faec0780b6ae5a4e0bc57, so I wonder how that code works in allocators... -- Stas Malyshev smalyshev@gmail.com