Re: Weird bitset shift offset in zend_alloc

  104597
March 6, 2019 08:47 smalyshev@gmail.com (Stanislav Malyshev)
Hi!

> This is intended x86 specific optimization. The behavior of SHR is > documented. > > See https://c9x.me/x86/html/file_module_x86_id_285.html > > "The count is masked to 5 bits, which limits the count range to 0 to 31" > on 32-bit systems.
I see. But I wonder whether the compiler will always compile it as SHR, is it guarnateed? Looking at optimization behavior for clang, for me ZEND_BIT_TEST( bits, 65 ) - where bits is an array of 0xFF - produces 1 under -O0 but 0 under -O1 (I guess it figures out it's a constant and optimizes the shift away). So I wonder whether this construct isn't dangerous... -- Stas Malyshev smalyshev@wikimedia.org
  104600
March 6, 2019 09:24 nikita.ppv@gmail.com (Nikita Popov)
On Wed, Mar 6, 2019 at 9:47 AM Stanislav Malyshev <smalyshev@gmail.com>
wrote:

> Hi! > > > This is intended x86 specific optimization. The behavior of SHR is > > documented. > > > > See https://c9x.me/x86/html/file_module_x86_id_285.html > > > > "The count is masked to 5 bits, which limits the count range to 0 to 31" > > on 32-bit systems. > > I see. But I wonder whether the compiler will always compile it as SHR, > is it guarnateed? Looking at optimization behavior for clang, for me > ZEND_BIT_TEST( bits, 65 ) - where bits is an array of 0xFF - produces 1 > under -O0 but 0 under -O1 (I guess it figures out it's a constant and > optimizes the shift away). So I wonder whether this construct isn't > dangerous... >
This is the general problem with undefined behavior: Just because something is UB, doesn't necessarily mean that it will *always* be micompiled. If you write ZEND_BIT_TEST(bits, 65), then the shift amount is known to be too large and the computation is folded to poison. In ZEND_BIT_TEST(bits, bit) where the shift amount if not known, it is quite likely that no optimizations will be able to apply and this will indeed be compiled to SHR with the desired making behavior. But unlikely doesn't mean it can't happen. Even if the input is dynamic, known bits can prove that one of the top bits is set and fold everything to undef. Here is a quick example: https://godbolt.org/z/C3rSSW That's why we really shouldn't be doing such optimizations. Regards, Nikita