Re: [PHP-DEV] Are PECL modules preferable?

March 22, 2020 20:36 mike@newclarity.net (Mike Schinkel)
> On Mar 21, 2020, at 7:15 PM, Ben Ramsey <ben@benramsey.com> wrote:
>
>> On Mar 21, 2020, at 17:52, Mike Schinkel <mike@newclarity.net> wrote:
>>
>>> On Mar 21, 2020, at 5:59 PM, tyson andre <tysonandre775@hotmail.com> wrote:
>>> FROM: Re: [PHP-DEV] [RFC] is_literal()
>>>
>>> And if it can be implemented as a PECL module, that would be more preferable to me than a core module of php.
>>> If it was in core, having to support that feature may limit optimizations or implementation changes that could be done in the future.
>>
>> Just wanted to address this comment which was made on another thread (I did not want to hijack that thread.)
>>
>> A large number of PHP users have no control over the platform they run on, so the option to use PECL modules is a non-starter for them.
>>
>> Here are several of those managed hosting platforms I speak of. Collectively they host a large number of WordPress sites, and Pantheon also host Drupal sites:
>>
>> https://pagely.com/
>> https://wpvip.com/
>> https://wpengine.com/
>> https://kinsta.com/
>> https://pantheon.io/
>>
>> Given that, if there is an option between a useful feature being added to core or left in PECL, I would vote 100% of the time for core, since working with WordPress on a corporate site I can rarely ever use PECL extensions.
>>
>> #fwiw
>>
>> -Mike
>
>
> If at all possible, I advocate for implementing in userland.
I disagree with this in many cases; more on that below.
> Of course, the specific is_literal/taint feature is special in this regard -- it can’t be implemented in userland.
Well, I broke off this thread on its own to decouple it from the is_literal/taint feature, although I agree that feature is not a userland option.
> IMO, PECL is an antiquated system that needs a successor, in much the same way Composer is the successor to PEAR. I think there are folks working on a solution for this, but I’m not sure where they are in their efforts. If we could make extensions as easy to package, distribute, and install (and load without root privileges) as Composer packages are, then I think we could say that PECL extensions are preferable.
Totally agree with all of that.
> Maybe FFI can help in this regard?
I think that is not a viable solution because FFI can run unsafe code and can be disabled in php.ini. Given the former, most (all?) managed hosts will disable FFI because, as PHP.net says on https://www.php.net/manual/en/intro.ffi.php:

"Caution: FFI is dangerous, since it allows to interface with the system on a very low level. The FFI extension should only be used by developers having a working knowledge of C and the used C APIs. To minimize the risk, the FFI API usage may be restricted with the ffi.enable php.ini directive."

I would love to see forward movement on this. I think what is needed for a binary extension that could be loaded in userland is some language or tool that can generate guaranteed-safe extensions, and one that will provide a significant performance gain.

So what are the potential options? These are the ones that come to mind:

1. Create a safe language similar to PHP but that uses LLVM to compile down to a binary form that could be loaded in userland. There are numerous existing languages currently in development that might be roped in to becoming this language. Jai, which Robert Hickman mentioned, comes to mind: https://inductive.no/jai/

2. Create a safe language similar to PHP but that compiles down to C source code that can then be compiled to create a binary form that could be loaded in userland. Zephir comes to mind as a language that might be co-opted for this (zephir-lang.com), and you, Ben, were the one to mention this to me.

3. Explore Rust to see if there are ways to leverage its safety and limit it to only safe features compiled to create a binary form that can be loaded in userland (I have no clue if this is even possible.)

4. Explore Go to see how to leverage it to create a binary form that can be loaded in userland.

5. Look for another language that can be co-opted to use for creating safe binaries. Maybe a less well-known language whose team would be motivated to implement the specific features PHP would need to make this a reality (Julia?)

6. Embrace WebAssembly (WASM) as the extensibility mechanism for PHP. If we do, then people can create WASM binary files from many compiled languages, such as C, C++, D, Rust, Go, Julia, etc. And there is already an extension to handle WASM in PHP: https://github.com/wasmerio/php-ext-wasm

7. And finally, I am sure there are other solutions that did not come to mind but that someone else has considered.

BTW, I don't think we can offer a "safe" extensibility method that has the ability to manipulate PHP on the low level, unless of course PHP offered safe lower-level APIs. So is_literal()/is_taint() might still be off the table here for one of these extension mechanisms. But I could be wrong here and if so I hope someone will explain.

------

That said, of all the options that came to mind, using WASM as a replacement for PECL seems like it would be the best solution, because:

1. WASM is a W3C Recommendation
2. WASM is designed to be safe: https://webassembly.org/docs/security/
3. WASM has a package manager (wapm.io), so there are/will be a lot of existing WASM packages for use in PHP
4. WASM compiles to a binary, so it could be uploaded to a server and loaded as a binary by PHP
5. As noted above, many languages can create WASM: github.com/appcypher/awesome-wasm-langs

As for performance, WASM is probably faster than PHP, but we'll need benchmarks to know how much faster.

So what do others think? Should we consider adding WASM support to PHP? Maybe this would be a good example of implementing a proof-of-concept container to let people try it out and see what they think?
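
On the FFI point, anyone who wants to see how a given host has configured it could run something like the rough sketch below. The function name ffi_availability() and the status labels are made up for illustration; only extension_loaded(), ini_get() and the ffi.enable directive come from PHP itself.

```php
<?php
// Sketch only: reports how FFI is configured in the current PHP runtime.
// ffi_availability() is a hypothetical helper, not part of PHP.
function ffi_availability(): string
{
    // The extension may not be compiled in or loaded at all.
    if (!extension_loaded('ffi')) {
        return 'not loaded';
    }

    // ffi.enable accepts a boolean-like value or the string "preload".
    $mode = strtolower((string) ini_get('ffi.enable'));

    if ($mode === 'preload') {
        // In this mode FFI is limited to CLI and preloaded scripts.
        return 'preload';
    }

    return in_array($mode, ['1', 'true', 'on'], true) ? 'enabled' : 'disabled';
}

echo 'FFI: ', ffi_availability(), PHP_EOL;
```

On a managed host I would expect this to report "not loaded" or "disabled", but that is a guess until someone actually runs it there.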
> In the meantime, I agree with you that general-use language features that cannot be implemented in userland can serve the community best in the core, rather than in PECL, but their general utility will need to be weighed against their impact to the engine (i.e., if a feature slows down the engine, we can’t put it into core).
Agreed. If the feature slows down the engine then yes, it is problematic to put into core. Of course, we should guard against a small subset of people who oppose a useful feature pointing to a naive implementation that slows down core as an argument against the feature, when it is possible that an intelligent implementation can exist w/o affecting performance.

------

And finally, as promised above, I disagree that everything should be pushed to userland *whenever possible*. I believe there is a great benefit to standardization of functionality. The benefits of said standardization include:

1. Avoids reinventing the wheel. While developers generally adopt core language features, they re-invent the wheel in userland, for reasons such as lack of awareness of existing solutions, not-invented-here syndrome, dislike of the API, company dictates only to use approved external code, etc. This reinventing causes fragmentation and can result in a massive duplication of effort.

2. Minimizes training/learning required. A developer learns the core feature and they are done. In userland they will have to be trained on or learn the feature anew every time they use a different codebase or package that implements it differently.

3. Allows commonality in articles, tutorials and documentation. If a standard feature exists then it can be used in any article, tutorial or documentation without necessarily having to show and explain a userland implementation. This helps to level up the skill of the average PHP developer simply because more and better learning material will naturally become available.

4. Minimizes dependency hell and/or duplicated code to maintain. If a feature is in core then any PHP package can use it w/o having to bring in yet another dependency and/or maintain yet another userland implementation.

5. Empowers future developers to solve greater problems. Standardized functionality empowers future developers to solve new problems rather than continue to recreate solutions and/or manage the dependencies where others solved them. Think of the biblical Tower of Babel and how it stalled progress once everyone was speaking a different language.

6. The nature of programming languages. Programming languages are tools that embrace and simplify well-known patterns so we can solve harder and harder problems. If everyone in the past had held the opinion that code should be implemented in userland if at all possible, then we would all still be coding everything in assembly language. So when we identify a common pattern I believe it should be moved out of userland and into the core.

In closing, I agree that the 80+% of functionality that is used < 20% of the time can and should stay in userland. But everything else (the ~20% of functionality that is used 80+% of the time, such as str_contains()) should be moved out of userland and into PHP core. IMO.

#jmtcw #fwiw
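
To make the str_contains() case concrete, here is a minimal sketch of the kind of helper that gets re-written in userland over and over. It is an illustration, not any particular project's code: it assumes the usual polyfill pattern of defining the function only when the runtime does not already provide it (PHP 8.0 and later do).

```php
<?php
// Userland fallback, defined only when the runtime lacks a native
// str_contains(). A sketch for illustration, not a vetted library.
if (!function_exists('str_contains')) {
    function str_contains(string $haystack, string $needle): bool
    {
        // An empty needle counts as contained in every string; checking it
        // first also avoids the warning older strpos() raises for ''.
        return $needle === '' || strpos($haystack, $needle) !== false;
    }
}

var_dump(str_contains('managed hosting', 'host')); // bool(true)
var_dump(str_contains('managed hosting', 'pecl')); // bool(false)
```

Every codebase that carries its own copy has to get the empty-needle case and the strict `!== false` comparison right, which is the duplication I am arguing a core function removes.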
>
> Cheers,
> Ben
>
-Mike