Validators *only* for assignment? - mimicking properties with setters #11787
This seems pretty niche to me, so I'm afraid this will introduce attractive nuisances if implemented as is. Could you expand on why you don't need validation to be performed on instantiation? The goal of Pydantic is to make sure data is validated so that you can safely make use of it. You could also go a different way with a solution that might suit your use case better:
I don't have any estimates on how niche it is, but my initial guess would be somewhere around how niche properties with setters are? I'm not sure why this would be an attractive nuisance, since it would directly address these issues:
Those are possible options if I were not trying to model an existing format and the format were arbitrary. Part of the benefit of having the pydantic model be a direct reflection of the format is that serialization/deserialization is direct: I just dump the model and bencode it. This is directly in line with the spirit of making sure the data is validated so you can use it; it's just that validity is nonlocal and depends on the alignment of multiple fields. Hashing can take quite a long time and the hashes are stored in the torrent file when read, so cached property is not really a great fit. Plus it would still need a trigger to invalidate it when another field changes, which circles back to this issue.
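To make the desired behavior concrete, here is a minimal sketch of the shape being described. It is not from the actual project; the field names and the `__setattr__` override are illustrative stand-ins for "invalidate the hashes when the file list changes, but not at instantiation":

```python
from pathlib import Path
from typing import Optional

from pydantic import BaseModel

class Torrent(BaseModel):
    files: list[Path]
    piece_hashes: Optional[list[bytes]] = None  # trusted as-is when read from the .torrent file

    def __setattr__(self, name, value):
        super().__setattr__(name, value)
        if name == "files":
            # assignment side effect: the stored hashes no longer match the new file list
            super().__setattr__("piece_hashes", None)
        # note: pydantic v2 does not route __init__ through __setattr__,
        # so instantiation leaves the provided hashes untouched
```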
I'm not sure how? Since the hash only depends on the files, other fields won't affect this feature or the hash calculation.
I'm not sure the issues you linked are showing exactly the same problem; for instance in #6597, I assume the author also wants validation to run on instantiation.
But cached property is precisely meant for this, right? It's only computed once, and a proper invalidation mechanism makes it so that it is only recomputed when needed. More broadly, I think properties are still a better fit for your use case. For instance, what if someone does …
What I meant is that currently a file is represented as a

```python
class TorrentFile(BaseModel):
    path: Path
    some_useful_attribute: <...>

    @property
    def hash(self):
        ...

class Torrent(BaseModel):
    files: list[TorrentFile]
```
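As an aside on the mechanism being discussed: `functools.cached_property` does support manual invalidation by deleting the attribute, which is presumably what "a proper invalidation mechanism" refers to above. A minimal sketch outside of pydantic, where `compute_hash` is a made-up stand-in for the expensive hashing step:

```python
from functools import cached_property
from pathlib import Path

class HashedFile:
    def __init__(self, path: Path):
        self.path = path

    @cached_property
    def hash(self) -> bytes:
        # expensive; computed on first access, then cached in the instance __dict__
        return compute_hash(self.path)  # compute_hash: hypothetical helper

file = HashedFile(Path("movie.mkv"))
_ = file.hash   # computed and cached
del file.hash   # invalidate; the next access recomputes
```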
Related Issues

Fair - it's not obvious how some of those issues are related, so here's a little more detailed breakdown of some prior issues that I think having this would help resolve.

Per-field assignment validation, post-init modification of validation behavior

E.g. #10563

Problem: Want to be able to validate some fields on assignment but not others. Or, want to conditionally change assignment validation behavior.

Relation: Straightforwardly, this would allow one to specify assignment validation behavior for specific fields, and would allow differentiating assignment validation from init validation.

Recursion errors with `validate_assignment` and `mode='after'`

E.g. #9576

Problem: Having `validate_assignment` on together with an `after` model validator that mutates the model leads to recursion errors.

Relation: To mutate a value on assignment, a model validator is currently the go-to tool, and that is exactly what triggers the recursion. If instead of this:

```python
class User(BaseModel):
    model_config = ConfigDict(validate_assignment=True)

    name: str
    other: Optional[str] = None

    @model_validator(mode='after')
    def validate_me(self):
        self.other = 'abc'
        return self
```

we were able to do this

```python
def _set_other(self, value: Any, info: ValidationInfo) -> Self:
    self.other = "abc"
    return self

SetOther = Annotated[T, AfterAssignmentValidator(_set_other)]

class User(BaseModel):
    name: SetOther[str]
    other: str | None = None
```

or, in functional form, this

```python
class User(BaseModel):
    name: str
    other: Optional[str] = None

    # this, or assignment_only=True, or whatever the syntax may be
    @field_validator("name", mode='after', when_used="assignment")
    def set_other(self, value: Any):
        self.other = "abc"
        return self
```

then there is no need for enabling `validate_assignment` on the whole model at all. The other direction is suggested there as well. The suggestion of disabling assignment validation during execution of the after validator doesn't exactly meet the need either, and seems like it would deepen the potential for footguns - one can only imagine the "why don't my validators run on assignment when it seems like they should!" issues. Even for intrinsically recursive validators that mutate the same value that's being assigned to: if the assignment-only behavior was separated from the general model validators, then it would be possible to not trigger recursive assignment validators when they are called within assignment validators, without needing to disable them in a way that will be more difficult to communicate to users than an explicit opt-in.

So e.g. say I have a model like this that has one self-recursing validator to count the number of assignments and another assignment side-effect validator that emits an event based on another field's value (e.g. say we are implementing a rate limiter or a session limiter or something):

```python
class MyModel(BaseModel):
    model_config = ConfigDict(validate_assignment=True)

    n_assignments: int = -1  # ideally this should be 0, but can't do assignment-only validators
    event_threshold: int = 10
    event: Callable
    value: str = "something we assign to a lot"

    @model_validator(mode="after")
    def increment(self) -> Self:
        self.n_assignments += 1
        return self

    @field_validator("n_assignments", mode="after")
    def emit_event(self, value: int, info: ValidationInfo) -> int:
        if value > info.data['event_threshold']:
            info.data['event']()
            return 0
        return value
```
return value If we were to simply disable assignment validation during execution of the validator, then our Initialize `@computed_field`/`@cached_field`I.e. #9131 Problem: This issue is almost identical to what i'm asking for here - want to be able to both initialize a model with a precomputed value, but then also recompute it if it is not present. The problem is "what do we do when we dump the model, do we recompute it or keep the assigned value?" Relation: If it was possible to have assignment side effects on the other fields of the model that could invalidate the cached property, then the problems with that issue are resolved - One can explicitly annotate when the cached property should be invalidated. So from pydantic import BaseModel, computed_field
class Circle(BaseModel):
    radius: int
    unrelated_prop: str

    @cached_property
    def diameter(self) -> int:
        return self.radius * 2

    @field_validator("radius", mode="after", assignment_only=True)
    def invalidate_diameter(self):
        del self.diameter
```

That handles the need to instantiate values (the cached property can be assigned to), invalidate values (via assignment-only validators), and lazily compute values (if unset, compute using the method when dumped/accessed). That would be very awkward to do with the current tools.

Internals problems with assignment validation and state

There are a handful of these issues, but e.g. #7105, also #8474

Problem: Validating with assignment causes values to be updated even if they fail validation. According to the issues this is because the object has already been mutated, and there isn't a clear way to roll back the object state.

Relation: Taking a brief read of the implementation, it seems like the difficulty here comes from assignment validation sharing its behavior with init validation rather than being a distinct step with its own state handling.
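A rough sketch of the kind of rollback being described - snapshot the old value around assignment validation and restore it if validation fails. This is a workaround-style illustration of the concept, not how pydantic implements or plans to implement it:

```python
from pydantic import BaseModel, ConfigDict, ValidationError

_MISSING = object()

class RollbackModel(BaseModel):
    """Sketch: snapshot-and-restore around assignment validation."""
    model_config = ConfigDict(validate_assignment=True)
    x: int = 0

    def __setattr__(self, name: str, value: object) -> None:
        old = self.__dict__.get(name, _MISSING)
        try:
            super().__setattr__(name, value)
        except ValidationError:
            # undo any partial mutation before re-raising
            if old is _MISSING:
                self.__dict__.pop(name, None)
            else:
                self.__dict__[name] = old
            raise
```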
What I am trying to express here is that Pydantic can't reasonably introduce solutions that will perfectly fit every user's use case. I understand you need to have a 1:1 representation of some spec that might not be ideal, but Pydantic can't be held responsible for this. Instead of providing validators only for assignment, which have a really limited scope, we need to find more generic ways of achieving such behavior. For instance, you could have your model defined as:

```python
from pathlib import Path

from pydantic import BaseModel, Field, SkipValidation
def update_hashes(instance, attribute, value):
    instance.files = value
    # Recompute hashes
    instance.hashes = ...

class Torrent(BaseModel):
    files: list[Path] = Field(on_setattr=update_hashes)
    hashes: SkipValidation[list[bytes]]
```

Inspired from … Your alternative in …
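For comparison: the `on_setattr` hook in the sketch above is not current pydantic `Field` API; it closely mirrors attrs' per-field `on_setattr` hook (my assumption, since the reference above is cut off). Roughly, the attrs version would look like this, with `compute_hash` again a hypothetical helper:

```python
from pathlib import Path

import attrs

def update_hashes(instance, attribute, value):
    # attrs calls this with (instance, attribute, new_value) when `files` is assigned
    instance.hashes = [compute_hash(p) for p in value]  # compute_hash: hypothetical
    return value  # the returned value is what actually gets set

@attrs.define
class Torrent:
    files: list[Path] = attrs.field(on_setattr=update_hashes)
    hashes: list[bytes] = attrs.field(factory=list)
```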
Absolutely understood! I'm not trying to be ornery or saying "pydantic, fix my special problem" - the only reason I raised the issue is that I think it is a more general need and would benefit other use cases, which is why I gathered the related issues and tried to describe how this would benefit them.
This is great! And would be exactly what I need. If I'm reading the docs right, what I pitched above is basically a pydantic flavor of …
Also great! If this is preferable I can make a fuller sketch. I'll go through and find all the other kinds of contexts where a validator can be called so it's a more general solution than just detecting assignment - e.g. being able to differentiate between direct instantiation vs. instantiation nested within a model would also probably be helpful. I'll need a little bit of scoping input on that - I don't want to make it just an edge-case feature that is incomplete, but I also don't want to blow it up into a huge change.
Seems like it, and I think we'll explore more.
Description
Currently:

- It is possible to validate fields on assignment with `ConfigDict(validate_assignment = True)`. This is very useful!
- It is not possible to use a `@computed_field` value with a setter to mutate fields other than the computed field (since it does not, by definition, have its "own" value) on assignment.

It is not possible, as far as I know, to have a validator that runs only on assignment. This seems like a missing feature that makes it tricky to make models that have assignment side effects that should not apply during model instantiation - or, more generally, to have models with fields whose behavior mimics `@property.setter`s.
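For context on the first point, a minimal example of the assignment validation that does exist today (standard pydantic v2 behavior):

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class Point(BaseModel):
    model_config = ConfigDict(validate_assignment=True)
    x: int

p = Point(x=1)
p.x = "2"                # validated and coerced on assignment -> p.x == 2
try:
    p.x = "not an int"   # invalid assignments are rejected
except ValidationError:
    pass
```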
Motivation
As a motivating example, say, hypothetically, we are trying to model a .torrent file. A torrent contains some list of files as well as a set of hashes that correspond to those files. When the file list is changed, we should invalidate any hashes that are set so that we know they need to be recomputed (or recompute on assignment, either/or, equivalent for this example). We should not do that when the model is instantiated, because those hashes are correct at that time!
Assignment-only validators might want to make use of the other fields on the model, so ideally they would not need to be defined as `@classmethod`s as validators need to be.

Assignment validation using model validators leads to recursion for obvious reasons (e.g. #6597), and a specific validator that was aware of the context of assignment would be able to avoid that by only triggering once per actual assignment.
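To make the recursion concrete, here is a small self-contained example of the failure mode referenced above, using only current pydantic features (model and field names are made up):

```python
from pydantic import BaseModel, ConfigDict, model_validator

class Profile(BaseModel):
    model_config = ConfigDict(validate_assignment=True)

    name: str
    slug: str = ""

    @model_validator(mode="after")
    def derive_slug(self):
        # this assignment re-enters assignment validation, which runs this
        # validator again, and so on - the recursion the linked issues report
        self.slug = self.name.lower().replace(" ", "-")
        return self

Profile(name="Some User")  # recurses before the instance is even returned
```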
Syntax Example
I figure these would just be a wrapper around the other validator methods that mark them as not being called on model creation, and that always generate the appropriate `__setattr__` code even if `validate_assignment = False` (to allow using them without running validation on every field).

The annotated validators take `ValidationInfo` to keep with the optional dependency-injection style, but it could just as easily be mandatory or use a regular python type annotation; `self` could be passed inside of `ValidationInfo` or similar. The functional form would look similar; both forms are sketched below.
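The snippets that originally followed are missing here; judging from the examples given earlier in the thread, the annotated and functional forms were along these lines. Everything marked "proposed" is the suggested syntax, not existing pydantic API:

```python
from typing import Annotated, Any, Optional, TypeVar

from pydantic import BaseModel, ValidationInfo, field_validator

T = TypeVar("T")

# Annotated form: run the wrapped function only when the field is assigned to
def _set_other(self, value: Any, info: ValidationInfo) -> Any:
    self.other = "abc"
    return value

SetOther = Annotated[T, AfterAssignmentValidator(_set_other)]  # AfterAssignmentValidator: proposed

class User(BaseModel):
    name: SetOther[str]
    other: Optional[str] = None

# Functional form: a field validator flagged to run only on assignment
class User2(BaseModel):
    name: str
    other: Optional[str] = None

    @field_validator("name", mode="after", assignment_only=True)  # assignment_only: proposed
    def set_other(self, value: Any) -> Any:
        self.other = "abc"
        return value
```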
I'm not sure if this is valid, but another obvious option would be doing this: … but that seems worse: we could rig it so multiple setters were allowed, unlike normal property setters, but them looking so similar would probably be confusing.

Anyway, let me know what ya think - as always, I would be more than happy to draft this.
Affected Components
- `.model_dump()` and `.model_dump_json()`
- `model_construct()`, pickling, private attributes, ORM mode