At the time of typing this and with only four days left, the scalar types RFC is barely being rejected by 66:34 (needs two thirds to pass). Some of the votes against are from folks that don’t want scalar type hints at all. I also think some people may be voting in favor on the basis of expecting the type hinting implementation to solve problems it’s not really meant to — type hinting is not validation (the “word” is not the “meaning”), and content-safety is only a side effect of type-safety.
I’ve been following the internals discussion and the threads on Reddit, and asking around.
I think the proposal is clever and having scalar types would be great. You can go to Anthony Ferrara’s blog to read about a lot of good reasons in favor of this RFC.
I still feel that in its current form (as far as I understand it) the proposal can add more effort and introduce more issues than expected, so I decided to write up my concerns.
1. Autocasting on weak mode
The proposal states that when a hinted function is called in weak mode, the arguments will be cast. The intent is that, if you were using functions and having them work without minding types, you’d still be able to do so even after adding hints.
In practice, this still means the behavior of the hinted function changes. Without hints, the function can still inspect the argument as-is.
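To make the difference concrete, here is a minimal sketch. The function names are hypothetical, and the hinted version assumes the weak-mode casting described in the proposal, where a fully numeric string such as "2" is converted to an integer before the function body runs.

```php
<?php
// Hypothetical helpers, named for illustration only.

// No hint: the function receives the argument exactly as the caller passed it.
function describeUnhinted($n)
{
    return gettype($n) . ':' . var_export($n, true);
}

// With a hint and a weak-mode caller: the argument is cast before the body runs.
function describeHinted(int $n)
{
    return gettype($n) . ':' . var_export($n, true);
}

echo describeUnhinted("2"), "\n";     // string:'2'
echo describeUnhinted("2cats"), "\n"; // string:'2cats'
echo describeHinted("2"), "\n";       // integer:2
```

The hinted call deliberately uses only "2", since how an input like "2cats" gets reported depends on the final rules for lossy casts.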
- If you pass "2" to an unhinted function that will use it as an integer, the function still has the chance of calling gettype and seeing that it was given a string.
- If you pass "2cats", the unhinted function can see the "cats".
- If you hint the argument as int and call the function from a weak types context, the function cannot see the "cats", and the caller will see a Notice.
Now, normally you wouldn’t hint an integer if any of the functionality depended on reading string characters from that argument. But I know there’s going to be a temptation to replace validation checks with a hint, which guarantees the function will run, and then someone else using that function and relying on it stopping bad input will find themselves having to change their code.
- Case #1: The function detected the invalid argument and returned null, which was then detected by the caller. After adding a hint, the function always succeeds. It sometimes throws a Notice for lossy casts, but the caller had no code in place to catch it.
- Case #2: The function detected the invalid argument and threw an Exception, which was then caught by the caller in a try... catch block. After adding a hint, the function throws a Notice, which is not caught by try... catch.
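Case #2 can be sketched as follows. Both function names are hypothetical; the point is only to show a caller whose error handling depends on the callee doing its own validation.

```php
<?php
// Hypothetical library function, before any hint is added:
// it validates its own argument and signals bad input with an exception.
function parseQuantity($value)
{
    // Accept integers, or strings made only of digits.
    $isWholeNumber = is_int($value)
        || (is_string($value) && preg_match('/\A\d+\z/', $value) === 1);
    if (!$isWholeNumber) {
        throw new InvalidArgumentException('Quantity must be a whole number');
    }
    return (int) $value;
}

// Hypothetical caller written against that contract.
function quantityOrDefault($value, $default)
{
    try {
        return parseQuantity($value);
    } catch (InvalidArgumentException $e) {
        return $default; // the branch this caller relies on for bad input
    }
}

var_dump(quantityOrDefault("2", 1));     // int(2)
var_dump(quantityOrDefault("2cats", 1)); // int(1)
```

If the check inside parseQuantity were replaced by an int hint, the catch branch would no longer run for "2cats" under the weak-mode casting described above, and the caller would need new code to notice the problem.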
Andrea Faulds has suggested adding a “number” type hint. If it casts only valid strings to numeric types (like is_numeric) and throws an Exception that can be caught by try... catch, that would help. But adding a hint will still count as a breaking change due to potential loss of precision.
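As a rough userland approximation of that idea (not the suggested hint itself, whose exact semantics I am only assuming here), a hypothetical toNumber helper could accept only values that is_numeric approves of and throw otherwise:

```php
<?php
// Hypothetical helper: converts numeric input to int or float,
// and throws a catchable exception for anything else.
function toNumber($value)
{
    if (is_int($value) || is_float($value)) {
        return $value;
    }
    if (is_string($value) && is_numeric($value)) {
        return $value + 0; // int for "2", float for "2.5"
    }
    throw new InvalidArgumentException('Not a numeric value');
}

var_dump(toNumber("2"));   // int(2)
var_dump(toNumber("2.5")); // float(2.5)

try {
    toNumber("2cats");
} catch (InvalidArgumentException $e) {
    echo "rejected: ", $e->getMessage(), "\n"; // rejected: Not a numeric value
}
```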
Personally, I think I would prefer no autocasting to happen. Calling a hinted function in a weak context would completely ignore the hints. This means the callee’s author still has to write code to validate the arguments as if the hint wasn’t there, but this is OK since content-safety is not type-safety.
This doesn’t decrease the essential value of scalar type hints, which is to allow static analysis and optimizations; you can’t quite ask these tools to understand hand-coded type checks, so the hints are there for them. In a strict context, on the other hand, the hint would have the side effect of validation and render the check on the callee’s side redundant, and this is OK too.
The reason you may have redundant checks is that we are allowing hints to be bypassed. In the current RFC, the weak caller will have to add a check before passing an argument to a hinted function, and that check is redundant if the mode is changed to strict. But I feel that check needs to be on the callee’s side and ship with it, because all weak callers are going to need it. Yeah, strict callers will still find themselves handling their arguments before passing, and then the callee still doing an unnecessary check, but unless we can guarantee both caller and callee will always be the same library, that’s not so strange or undesirable.
And there’s also the consideration that library coders are going to be typically more skilled than library users, so passing all the responsibility to the caller is not ideal.
2. Lossy casts throwing notices
This is a simple concern. If autocasting were to remain in, I think lossy casts should throw fatal errors catchable with try... catch, to prevent safety issues remaining unhandled during the transition to PHP 7.
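Until something like that exists in the language, a caller can approximate it by promoting notices to exceptions with set_error_handler and ErrorException. This is a sketch only: withErrorsAsExceptions is a hypothetical helper, and the notice is simulated with trigger_error as a stand-in for a lossy-cast notice.

```php
<?php
// Hypothetical helper: runs $fn with every PHP error (notices included)
// converted to an ErrorException, then restores the previous handler.
function withErrorsAsExceptions(callable $fn)
{
    set_error_handler(function ($severity, $message, $file, $line) {
        throw new ErrorException($message, 0, $severity, $file, $line);
    });
    try {
        return $fn();
    } finally {
        restore_error_handler();
    }
}

try {
    withErrorsAsExceptions(function () {
        // Stand-in for the notice a lossy cast would raise.
        trigger_error('lossy cast', E_USER_NOTICE);
        return 'not reached';
    });
} catch (ErrorException $e) {
    echo "caught: ", $e->getMessage(), "\n"; // caught: lossy cast
}
```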
3. Strict mode declaration
Currently strict mode would be set on a per-file/per-block basis with declare statements. This is very likely to change later.
This would be less of an issue without autocasting, because the functions would need to be resilient to context switching anyway. With autocasting in, adding a declare is a potentially breaking change. In that case, I would very much prefer that weak calls were only enabled by something akin to a try_with_casts... catch block, which would both allow catching lossy casts and make the caller aware that they are now responsible for casting the argument types.
I understand the position that passing this RFC could be the chance to get strict types in before PHP 7, and that it could just be refined later. But I fear parts of it won’t be seen as changeable once code has been written to fit it; people will look at the hinted functions and say “we can’t switch content-safety back to the caller-side, think of all the libraries that replaced validation with hints”. So even though the RFC is very well thought out (otherwise it wouldn’t have the support of folks like Anthony Ferrara or Phil Sturgeon), maybe it would be better to risk PHP 7.0 not having scalar hints.
If it passes, component/library writers could still adopt, as a good practice, always exposing only APIs with unhinted scalar arguments and handling casting themselves. If the scalar types behavior were to change later, it would be easier to fix.
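A minimal sketch of that practice, using a hypothetical Cart class: the public method takes unhinted scalars and validates them itself, while hints are kept on a private method that outside callers never reach.

```php
<?php
// Hypothetical library class, for illustration only.
final class Cart
{
    private $items = [];

    // Public API: no scalar hints, so the library decides what counts as valid.
    public function add($sku, $quantity)
    {
        if (!is_string($sku) || $sku === '') {
            throw new InvalidArgumentException('SKU must be a non-empty string');
        }
        $validated = filter_var($quantity, FILTER_VALIDATE_INT);
        if ($validated === false) {
            throw new InvalidArgumentException('Quantity must be an integer');
        }
        $this->store($sku, $validated);
    }

    public function quantityOf($sku)
    {
        return $this->items[$sku] ?? 0;
    }

    // Internal method: hints are safe here because only validated values arrive.
    private function store(string $sku, int $quantity)
    {
        $this->items[$sku] = ($this->items[$sku] ?? 0) + $quantity;
    }
}

$cart = new Cart();
$cart->add('apple', '2');
$cart->add('apple', 3);
var_dump($cart->quantityOf('apple')); // int(5)
```

With this layout, the public behavior for input such as "2cats" stays under the library's control whichever way the hint semantics end up.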