Optimize zend_string equality check with hash key

  105090
April 5, 2019 07:26 ben.coutu@zeyos.com (Benjamin Coutu)
Hello,

I was surprised to find out that equality checks on zend_string do not take advantage of the fact that in many cases we have a hash key available that can be utilized in a quick bailout path without having to resort to a costly `memcmp` on the value.

I recommend to modify `zend_string_equal_content` like so:

static zend_always_inline zend_bool zend_string_equal_content(zend_string *s1, zend_string *s2)
{
  if (ZSTR_LEN(s1) != ZSTR_LEN(s2)) {
    return 0;
  }

  if (ZSTR_H(s1) && ZSTR_H(s1) != ZSTR_H(s2)) {
    return 0;
  }

  return zend_string_equal_val(s1, s2);
}

Thoughts?

-- 

Benjamin Coutu
ben.coutu@zeyos.com
  105100
April 5, 2019 09:00 george.banyard@gmail.com ("G. P. B.")
On Fri, 5 Apr 2019, 09:26 Benjamin Coutu, coutu@zeyos.com> wrote:

> Hello, > > I was surprised to find out that equality checks on zend_string do not > take advantage of the fact that in many cases we have a hash key available > that can be utilized in a quick bailout path without having to resort to a > costly `memcmp` on the value. > > I recommend to modify `zend_string_equal_content` like so: > > static zend_always_inline zend_bool zend_string_equal_content(zend_string > *s1, zend_string *s2) > { > if (ZSTR_LEN(s1) != ZSTR_LEN(s2)) { > return 0; > } > > if (ZSTR_H(s1) && ZSTR_H(s1) != ZSTR_H(s2)) { > return 0; > } > > return zend_string_equal_val(s1, s2); > } > > Thoughts? > > -- > > Benjamin Coutu > ben.coutu@zeyos.com >
Best idea would be to do a PR on the GitHub repo. Best regards George Peter Banyard
  105152
April 8, 2019 16:55 ajf@ajf.me (Andrea Faulds)
Hi Benjamin,

Benjamin Coutu wrote:
> if (ZSTR_H(s1) && ZSTR_H(s1) != ZSTR_H(s2)) { > return 0; > }
Should that not be like this?: if (ZSTR_H(s1) && ZSTR_H(s2) && ZSTR_H(s1) != ZSTR_H(s2)) { It would be possible the other string also has no hash value. Otherwise this sounds sensible to me, but it is also so obvious that I was assuming until now that PHP does it, so maybe someone else knows a reason why we don't? (Maybe it's just not worth the extra branches?) Thanks, Andrea