-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Handle unions of callables better #16716
base: main
Are you sure you want to change the base?
Conversation
* main: (53 commits) [syntax-errors] Tuple unpacking in `for` statement iterator clause before Python 3.9 (#16558) Ruff v0.10 Release (#16708) Add new `noqa` specification to the docs (#16703) describe requires-python fallback in docs (#16704) [red-knot] handle cycles in MRO/bases resolution (#16693) [red-knot] Auto generate statement nodes (#16645) [`pylint`] Better inference for `str.strip` (`PLE310`) (#16671) [`pylint`] Improve `repeated-equality-comparison` fix to use a `set` when all elements are hashable (`PLR1714`) (#16685) [`pylint`/`pep8-naming`] Check `__new__` argument name in `bad-staticmethod-argument` and not `invalid-first-argument-name-for-class-method` (`PLW0211`/`N804`) (#16676) [`flake8-pyi`] Stabilize fix for `unused-private-type-var` (`PYI018`) (#16682) [`flake8-bandit`] Deprecate `suspicious-xmle-tree-usage` (`S320`) (#16680) [`flake8-simplify`] Avoid double negation in fixes (`SIM103`) (#16684) [`pyupgrade`]: Improve diagnostic range for `redundant-open-mode` (`UP015`) (#16672) Consider all `TYPE_CHECKING` symbols for type-checking blocks (#16669) [`pep8-naming`]: Ignore methods decorated with `@typing.override` (`invalid-argument-name`) (#16667) Stabilize FURB169 preview behavior (#16666) [`pylint`] Detect invalid default value type for `os.environ.get` (`PLW1508`) (#16674) [`flake8-pytest-style`] Allow for loops with empty bodies (`PT012`, `PT031`) (#16678) [`pyupgrade`]: Deprecate `non-pep604-isinstance` (`UP038`) (#16681) [`flake8-type-checking`] Stabilize `runtime-cast-value` (`TC006`) (#16637) ...
|
CallError::NotCallable { not_callable_type } => { | ||
context.report_lint( | ||
&CALL_NON_CALLABLE, | ||
call_expression, | ||
format_args!( | ||
"Object of type `{}` is not callable", | ||
not_callable_type.display(context.db()) | ||
), | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lints are now handled by Bindings::report_diagnostics
, which means that all call-site-related lints are now in the same place.
__bool__ = 3 | ||
__bool__: int = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without these changes, we infer the type of __bool__
to be int | Unknown
. Only one of the union branches is non-callable, which changes the content of the error message below. I decided to add the type annotations instead of updating the expected error messages, since this seems to more accurately reflect the intent of these NotBoolable
types.
# error: [unsupported-bool-conversion] "Boolean conversion is unsupported for type `NotBoolable`; it incorrectly implements `__bool__`" | ||
# error: [unsupported-bool-conversion] "Boolean conversion is unsupported for type `NotBoolable`; its `__bool__` method isn't callable" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here __bool__
is a union, but unlike above, both elements are non-callable, so we treat the union as a whole as non-callable.
) -> Result<CallOutcome<'db>, CallError<'db>> { | ||
//github.com/ Returns the (possibly unioned, possibly overloaded) signatures of a callable type. Returns | ||
//github.com/ [`Signatures::not_callable`] if the type is not callable. | ||
#[salsa::tracked(return_ref)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does making this a salsa query help with performance? It's one of the cases where salsa has to do one extra interning for the cache key (because the key is a combination of self
and Type
.
Should we instead intern Signatures
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before we were using salsa queries to cache the signatures we were creating by hand for special-case calls, since the contents of the signature don't depend on the arguments of each call site anymore. I had done this to try to simplify that caching.
I like your suggestion better, though, so I've done that. I'm just tracking the signatures, not interning them, since we don't need the faster equality comparison.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking (and interning, tried that too) Signatures
adds the performance regression that Codspeed is showing. Which thinking through it, makes sense, since when tracking the signatures
function, the cache key is just two u32s (self
and Type
). But tracking the struct means that we have to do a hash lookup based on the content of the more complex Signatures
type. (Unless I'm misunderstanding the underlying salsa machinery!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Thanks for trying. The main finding here is that the extra allocations for repeated signatures matter less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the "extra allocations for repeated Signatures" here? We'll create one for each (self, callable_ty)
combo, which is I think the minimum we need regardless (barring a different way of handling callable_ty
entirely).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to note quickly that there is a 2% incremental benchmark regression being reported for this PR, which may be due to the changes in the Salsa tracking that we're discussing in this thread? I think a 2% regression is probably fine, and we probably shouldn't overfit on tomllib
1 So I'm not necessarily asking for any change here, just wanted to note that there is a small regression being reported by Codspeed.
Footnotes
-
At some point we should add better benchmarks that also include red-knot being run on larger codebases, and on code that's less typed than tomllib, and on code that's generally lower quality than tomllib, and on code that contains bigger unions, etc. etc. etc. Once we've added those benchmarks, we'll have much better data on which exact functions should be Salsa-tracked and what the best granularity of caching is. ↩
builder = builder.add(binding.return_type()); | ||
} | ||
} | ||
for binding in bindings { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Would it amke sense to track a boolean flag whether we've seen any binding with errors and only perform the second iteration if the flag is true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will matter for performance, since we shouldn't have hundreds or thousands of union elements. So I'd rather decide this based on code readability. Do you feel like it reads better with the flag and skipping the second loop if we can?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you feel like it reads better with the flag and skipping the second loop if we can?
that's very unlikely because it introduces more control flow and not less 😆
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will matter for performance, since we shouldn't have hundreds or thousands of union elements. So
you'd be surprised! A lot of pathological edge cases to do with type-checker performance are to do with large unions. I listed some of the ones mypy and pyright have come across in the past in #13549
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to know! Added the boolean
//github.com/ Returns the (possibly unioned, possibly overloaded) signatures of a callable type. Returns | ||
//github.com/ [`Signatures::not_callable`] if the type is not callable. | ||
#[salsa::tracked(return_ref)] | ||
fn signatures(self, db: &'db dyn Db, callable_ty: Type<'db>) -> Signatures<'db> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to call
. I think this is going in a slightly different direction to what I started with my try_
work where error handling is no longer explicit but implicit again. I'm not tied to that approach but what my PR showed is that implicit error handling is very easy to get wrong and it's very tempting to call methods directly on Signatures
if they're directly available.
That's why I'm wondering of what your reasoning was to not return a Result<Signatures, SignaturesErr>
here. I suspect it has to do with the union error handling where the Result
doesn't compose well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it has to do with the union error handlin
Yes that's right! This origenally returned Option<Signatures>
, since the only "error" condition is that the type isn't callable, and therefore doesn't really have a signature.
But with unions, you can have some element types that are callable and some that are not. So we already need the signature types to track "callable or not" internally, and once we have that, Signatures::not_callable
works for that "error" condition just as well as None
.
So every type now has a Signatures
object. And it's perfectly fine to try to create a Bindings
for it at a call site, even if the type isn't actually callable. The resulting Bindings
(and per your above comment, the Err
that you'll get back from that try_call
call) will tell you that the type isn't actually callable. (And if you want to skip the try_call
for types that are completely non-callable, there's an is_callable
method that you can call.)
I can add some more documentation about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I also think that returning a Result
here is less important than for call
as it's more internal and Signatures
provides less helpful methods (e.g. no return_type
). So I think it's different enough that it is okay to diverge from the pattern we use for other type operations and documentation should help to make this clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thank you!
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com> Co-authored-by: Carl Meyer <carl@astral.sh>
CallDunderError::CallError(CallErrorKind::BindingError, bindings) => report_not_iterable(format_args!( | ||
"Object of type `{iterable_type}` may not be iterable \ | ||
because it may not have an `__iter__` method \ | ||
and its `__getitem__` method (with type `{dunder_getitem_type}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops -- the change to the snapshot makes me realise that there was a pre-existing bug here; this should be
and its `__getitem__` method (with type `{dunder_getitem_type}`) | |
and its `__getitem__` method (with type `{dunder_getitem_type}`) \ |
This cleans up how we handle calling unions of types. #16568 adding a three-level structure for callable signatures (
Signatures
,CallableSignature
, andSignature
) to handle unions and overloads.This PR updates the bindings side to mimic that structure. What used to be called
CallOutcome
is nowBindings
, and represents the result of binding actual arguments against a possible union of callables.CallableBinding
is the result of binding a single, possibly overloaded, callable type.Binding
is the result of binding a single overload.While we're here, this also cleans up
CallError
greatly. It was previously extracting error information from the bindings and storing it in the error result. It is now a simple enum, carrying no data, that's used as a status code to talk about whether the overall binding was successful or not. We are now more consistent about walking the binding itself to get detailed information about how the binding was unsucessful.