Re: [PHP-DEV] Proposal for a new basic function: str_contains

February 14, 2020 14:14 (Philipp Tanlak)
On Fri, Feb 14, 2020 at 12:54, Nikita Popov <> wrote:

> On Fri, Feb 14, 2020 at 10:18 AM Philipp Tanlak wrote:
>
>> Hello PHP Devs,
>>
>> I would like to propose the new basic function: str_contains.
>>
>> The goal of this proposal is to standardize on a function to check whether
>> or not a string is contained in another string, which has a very common
>> use-case in almost every PHP project.
>> PHP Frameworks like Laravel create helper functions for this behavior
>> because it is so ubiquitous.
>>
>> There are currently a couple of approaches to create such a behavior, most
>> commonly:
>>
>> strpos($haystack, $needle) !== false;
>> strstr($haystack, $needle) !== false;
>> preg_match('/' . $needle . '/', $haystack) != 0;
>>
>> All of these functions serve the same purpose but are either not intuitive,
>> easy to get wrong (especially with the !== comparison) or hard to remember
>> for new PHP developers.
>>
>> The proposed signature for this function follows the conventions of other
>> signatures of string functions and should look like this:
>>
>> str_contains(string $haystack, string $needle): bool
>>
>> This function is very easy to implement, has no side effects or backward
>> compatibility issues.
>> I've implemented this feature and created a pull request on GitHub (Link: ).
>>
>> To get this function into the PHP core, I will open up an RFC for this.
>> But first, I would like to get your opinions and consensus on this
>> proposal.
>>
>> What are your opinions on this proposal?
>
> Sounds good to me. This operation is needed often enough that it deserves
> a dedicated function.
>
> I'd recommend leaving the proposal at only str_contains(), in particular:
>
> * Do not propose a case-insensitive variant. I believe this is really the
>   point on which the last str_starts_with/str_ends_with proposal failed.
>
> * Do not propose mb_str_contains(). Especially as no offsets are
>   involved, there is no reason to have this function. (For UTF-8, the
>   behavior would be exactly equivalent to str_contains.)
>
> Regards,
> Nikita
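
The "easy to get wrong" remark in the quoted proposal can be illustrated with a minimal sketch. The strings below are made-up examples, not taken from the thread. The sketch shows two pitfalls of the existing approaches: a loose comparison on the result of strpos(), and an unescaped needle in preg_match(). The use of preg_quote() is one possible fix, not something the proposal itself suggests.

```php
<?php
// Illustrative values only; they do not come from the thread.
$haystack = 'php-internals';
$needle   = 'php';

// strpos() returns the byte offset of the match, which is 0 here.
$offset = strpos($haystack, $needle);

// Loose comparison treats offset 0 like false, so the match is missed.
$looseSaysFound = ($offset != false);
// Strict comparison distinguishes int 0 from bool false.
$strictSaysFound = ($offset !== false);

var_dump($looseSaysFound);  // bool(false)
var_dump($strictSaysFound); // bool(true)

// With preg_match(), an unescaped needle is interpreted as a pattern:
// the '.' in 'a.c' matches any character, so 'abc' counts as a match.
$unquoted = preg_match('/' . 'a.c' . '/', 'abc');
// preg_quote() escapes the metacharacters so the needle is matched literally.
$quoted = preg_match('/' . preg_quote('a.c', '/') . '/', 'abc');

var_dump($unquoted); // int(1)
var_dump($quoted);   // int(0)
```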
I'd like to elaborate on Nikita's response:

I don't think an mb_str_contains is necessary, because the proposed function does not behave differently if the input strings are multibyte strings. When searching for a multibyte string in another multibyte string, the return value would consistently be true/false. The position/offset at which the multibyte string was found is not relevant. The reason both strpos and mb_strpos exist is that the returned position/offset varies depending on whether or not the string is a multibyte string.

The only possible valid variants concerning multibyte and case insensitivity I see are:

* str_contains: works as expected with multibyte and non-multibyte strings.
* mb_str_icontains: is the only valid option to do a case-insensitive search for multibyte strings.

Unneeded variants I see are:

* mb_str_contains: does not behave differently when compared to str_contains, as mentioned above.
* str_icontains: is a possible option but could be error-prone when used with multibyte strings like UTF-8, which is the de facto standard nowadays.

I'm certain there would be confusion among PHP developers if the newly proposed functions were only str_contains and mb_str_icontains.

Patrick ALLAERT:

> Yes, it does have one: people having already defined a str_contains()
> function in the global scope will have a PHP Fatal error: Cannot redeclare
> str_contains()

You are absolutely correct with this. Although functions added by frameworks to the global scope are usually guarded by:

if (!function_exists('str_contains')) {}
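
A guard like the one above could be filled in roughly as follows. This is a hypothetical userland fallback, not the implementation from the pull request. Treating an empty needle as always contained is an assumption made for this sketch. The example strings are invented and assume the source file is saved as UTF-8.

```php
<?php
// Hypothetical fallback: only defined when no str_contains() exists yet,
// which avoids the "Cannot redeclare" fatal error mentioned above.
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle): bool
    {
        // Byte-wise search. The empty-needle check is an assumption of this
        // sketch; it also avoids calling strpos() with an empty needle.
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

// UTF-8 input: the byte-wise search gives the same answer as a
// character-wise search would.
var_dump(str_contains('Grüße aus Köln', 'Köln')); // bool(true)
var_dump(str_contains('Grüße aus Köln', 'Koln')); // bool(false)
```

Because the guard checks function_exists() first, the fallback is simply skipped on a PHP version that ships its own str_contains().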
March 2, 2020 19:09 (Andrea Faulds)

Philipp Tanlak wrote:
> I'd like to elaborate on Nikita's response: I don't think an
> mb_str_contains is necessary, because the proposed function does not
> behave differently if the input strings are multibyte strings.
This is not true for all character encodings. For UTF-8 it is correct, but consider for example the Japanese encoding Shift_JIS, where the second byte of a multi-byte character can be a valid first byte of a single-byte character. str_contains() would have incorrect behaviour for this case.

Regards,
Andrea Faulds
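
The Shift_JIS case can be sketched with a concrete byte sequence. The example is an illustration chosen here, not one given in the thread: in Shift_JIS the katakana "ソ" is encoded as 0x83 0x5C, and 0x5C on its own is the single-byte backslash. strpos() stands in for the proposed str_contains(), on the assumption that both perform the same byte-wise search. The mb_strpos() call needs the mbstring extension and is therefore guarded.

```php
<?php
// Shift_JIS bytes for the katakana "ソ" (0x83 0x5C), written as escapes so
// the example does not depend on the encoding of this source file.
$haystack = "\x83\x5C";
// A single-byte backslash, which is the byte 0x5C.
$needle = "\\";

// Byte-wise search, standing in for the proposed str_contains().
$byteWise = strpos($haystack, $needle) !== false;
var_dump($byteWise); // bool(true), although the text contains no backslash

// Encoding-aware search; only run when the mbstring extension is loaded.
if (function_exists('mb_strpos')) {
    $charWise = mb_strpos($haystack, $needle, 0, 'SJIS') !== false;
    var_dump($charWise);
}
```

The byte-wise result is a false positive: the needle matches the trailing byte of a two-byte character. An encoding-aware search compares whole characters and should not report a match here.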