Make @computed_field (optionally) initializable #9131

Open
3 of 13 tasks
JGobeil opened this issue Mar 28, 2024 · 10 comments
@JGobeil
JGobeil commented Mar 28, 2024

Initial Checks

  • I have searched Google & GitHub for similar requests and couldn't find anything
  • I have read and followed the docs and still think this feature is missing

Description

Dumping an object containing a @computed_field will compute this property and include it in the dict/json. Great!
However, using this dict/json to initialize an object will not initialize the field, and the (potentially long) computation must be done again.

Would it be possible to add this feature?

Adding an option like @computed_field(..., init_var: bool = False,...) would allow this behaviour without breaking any code.

Thanks
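
For illustration, here is a minimal sketch of the round trip described above. The Report model and the calls counter are hypothetical names, not from the issue, and the model uses the default extra="ignore" behaviour. The dumped "total" key is accepted on input but does not prime the cache, so the computation runs again:

```python
from functools import cached_property

from pydantic import BaseModel, computed_field

# Hypothetical counter tracking how often the "expensive" computation runs
calls = {"n": 0}


class Report(BaseModel):
    values: list[int]

    @computed_field
    @cached_property
    def total(self) -> int:
        calls["n"] += 1
        return sum(self.values)


original = Report(values=[1, 2, 3])
dumped = original.model_dump()            # computes `total` once and includes it
restored = Report.model_validate(dumped)  # the "total" key is ignored on input
restored_total = restored.total           # so the computation runs a second time
```

After this runs, calls["n"] is 2 rather than 1, which is the repeated work the request is about.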

Affected Components

@JGobeil JGobeil changed the title Make (optional) computed_field initialisable Make @computed_field (optionally) initializable Mar 28, 2024
@sydney-runkle
Contributor

Seems like a reasonable feature request! PRs welcome!

@andresliszt
Contributor

hello! is anyone working on this? I'd like to help here, it's a nice feature

@sydney-runkle
Contributor

@andresliszt,

Go for it!

@andresliszt
Contributor

andresliszt commented Apr 10, 2024

@sydney-runkle In this new feature, what's the expected behavior when initializing a computed property?

from pydantic import BaseModel, computed_field

class Model(BaseModel):
    # `init_var` is the option proposed in this issue; it does not exist yet
    @computed_field(init_var=True)
    @property
    def foo(self) -> str:
        return "bar"

  1. Include the property in the validator, e.g. Model(foo=1) should raise an error
  2. Warn about it
  3. Ignore it

The existing code issues a warning on serialization

class Model(BaseModel):
    @computed_field
    @property
    def foo(self) -> str:
        return 1  # deliberately the wrong type

>>> Model().foo
# returns 1 without any warning

>>> Model().model_dump()
# returns the dictionary and issues a warning: "Expected `str` but got `int` - serialized value may not be as expected"
        

@Viicos
Member

Viicos commented Apr 15, 2024

I'd like to raise some concerns about this feature request. If I understand correctly, OP wants something like this:

from pydantic import BaseModel, computed_field

class Circle(BaseModel):
    radius: int

    @computed_field
    @property
    def diameter(self) -> int:
        return self.radius * 2

c = Circle(radius=2, diameter=4)

This would allow invalid state to be loaded: c = Circle(radius=2, diameter=5), and it is unclear what should happen when dumping an instance:

c = Circle(radius=2, diameter=4)
c.model_dump()  # According to the feature request, `diameter` should use the initialized value

c.radius = 4

c.model_dump()  # What should happen? Should `diameter` be computed again?

Imo, it doesn't really make sense to "initialize" a computed_field (which is just a property), as the value can change based on the other model's values.

However, it would make sense to have something similar for cached_property. Pydantic supports the combination of cached_property and computed_field, so you could write a helper constructor to set initial values by hooking into the instance's __dict__ (as cached_property is doing internally).
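
As a rough sketch of that idea outside Pydantic: functools.cached_property stores its result in the instance's __dict__ under the property's name, so a helper constructor can pre-fill that entry and skip the computation. The Expensive class and its with_cached constructor below are hypothetical names used only for illustration.

```python
from functools import cached_property


class Expensive:
    def __init__(self, n: int) -> None:
        self.n = n
        self.calls = 0  # counts how often `square` is actually computed

    @cached_property
    def square(self) -> int:
        self.calls += 1
        return self.n * self.n

    @classmethod
    def with_cached(cls, n: int, **cached: object) -> "Expensive":
        """Build an instance and prime cached properties with known values."""
        obj = cls(n)
        # cached_property is a non-data descriptor, so an entry in the
        # instance __dict__ takes precedence over running the function.
        obj.__dict__.update(cached)
        return obj


primed = Expensive.with_cached(3, square=9)
fresh = Expensive(3)
primed_value = primed.square  # read from __dict__, no computation
fresh_value = fresh.square    # computed on first access
```

Note that nothing checks the primed value against n, which is the "invalid state" concern raised above.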

@JGobeil
Author

JGobeil commented Apr 18, 2024

@Viicos, I agree. My use case is for computed_field + cached_property.

The goal is to skip the (potentially long) recomputation of the computed_field when deserializing. In that sense, the feature is only really useful when computed_field is combined with cached_property, as a plain property shouldn't/can't be set anyway. Maybe a dedicated decorator would be better (lazy_field?).

I agree that it could lead to "invalid" objects, since the loaded value of the computed_field would have to be trusted as correct without redoing the computation. Still, as long as the behaviour is documented correctly, this should not be an issue.

As far as I understand, the only difference between a computed_field + cached_property vs a simple cached_property is that the field will be computed and saved in the model dump. However, this dump can not recreate the object in its previous state without redoing the computation, mostly defeating the purpose of including it in the dump. Allowing it to be a two-way process may open many possibilities.

@sydney-runkle
Contributor

@Viicos,

Thanks for your concerns - great points.

Given that, what do you think makes sense re #9131 (comment)?

@Viicos
Member

Viicos commented Apr 29, 2024

I'll look into it to see if a simple helper/dedicated decorator snippet can be used for this specific use case, so that it can be used downstream. As this would not be specific to Pydantic (but mainly dataclasses and dataclasses-like implementations), I'm wondering if it should live in Pydantic itself. I'll get back to this issue once I figure out a nice implementation :)

@Viicos
Member

Viicos commented Apr 29, 2024

Taking a deeper look at it: having a separate helper isn't really feasible, as the models can be initialized in many ways: Model(...), Model.model_validate(...), etc. And it always goes to the pydantic-core side to actually set the attributes on the instance. One workaround, assuming extra is set to "allow":

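from functools import cached_property
from typing import Any

from pydantic import BaseModel, computed_field


class A(BaseModel):

    def model_post_init(self, context: Any) -> None:
        assert self.model_extra is not None  # Guaranteed thanks to extra="allow"
        # Move the extra value into __dict__, where cached_property looks it up.
        # Raises KeyError if "test" was not supplied.
        self.__dict__["test"] = self.model_extra.pop("test")

    @computed_field
    @cached_property
    def test(self) -> int:
        return 1

    model_config = {
        "extra": "allow"
    }

a = A.model_validate({"test": 2})

print(a.test)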

Which isn't ideal, as it won't work with other extra values. Moreover, trying to make this generic is tricky and kind of touches internals of Pydantic:

from functools import cached_property
from typing import Any, TypeVar

from pydantic import BaseModel, computed_field
from pydantic_core import PydanticUndefined

BaseModelT = TypeVar("BaseModelT", bound=BaseModel)


def init_cached(cls: type[BaseModelT], /) -> type[BaseModelT]:

    if cls.model_config.get("extra") != "allow":
        raise ValueError("'init_cached' only works with models with `extra` set to `\"allow\"`.")

    # Names of computed fields that are backed by a cached_property
    cached_computed_fields = [
        name
        for name in cls.model_computed_fields.keys()
        if isinstance(getattr(cls, name), cached_property)
    ]

    original_model_post_init = cls.model_post_init

    def model_post_init(self: BaseModelT, context: Any) -> None:
        assert self.model_extra is not None  # Guaranteed thanks to extra="allow"

        # Move matching extra values into __dict__ to prime the cache
        for prop_name in cached_computed_fields:
            init_val = self.model_extra.pop(prop_name, PydanticUndefined)
            if init_val is not PydanticUndefined:
                self.__dict__[prop_name] = init_val

        original_model_post_init(self, context)

    cls.model_post_init = model_post_init
    cls.__pydantic_post_init__ = "model_post_init"
    cls.model_rebuild(force=True)

    # Return the class so that `init_cached` can be used as a decorator
    return cls

@init_cached
class A(BaseModel):

    @computed_field
    @cached_property
    def test(self) -> int:
        return 1

    model_config = {
        "extra": "allow"
    }

a = A.model_validate({"test": 2})

print(a.test)
#> 2

So I think the feature request is valid, as we just saw the workaround is a bit fragile and does not cover all uses of extra.

However, supporting this in Pydantic (and inevitably pydantic-core) might require some work, and it raises a couple questions (should this only apply to cached_property -- what about lru_cache + property? Should we validate the type as mentioned by @andresliszt?)
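
To illustrate why the lru_cache + property question matters, here is a small stdlib-only sketch (the class names are hypothetical). The __dict__ trick used in the workaround above only works for cached_property: a property is a data descriptor and always wins over the instance __dict__, and lru_cache keeps its values in the function's own cache rather than on the instance.

```python
from functools import cached_property, lru_cache


class WithCachedProperty:
    @cached_property
    def value(self) -> int:
        return 1


class WithLruCache:
    @property
    @lru_cache(maxsize=None)
    def value(self) -> int:
        return 1


a = WithCachedProperty()
a.__dict__["value"] = 2   # primes the cache: cached_property checks __dict__ first
a_value = a.value

b = WithLruCache()
b.__dict__["value"] = 2   # ignored: the property descriptor takes precedence
b_value = b.value
```

Supporting the lru_cache variant would therefore need a different mechanism than writing into __dict__.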

@ernieIzde8ski

I'd like to raise some concerns about this feature request. If I understand correctly, OP wants something like this:

from pydantic import BaseModel, computed_field

class Circle(BaseModel):
    radius: int

    @computed_field
    @property
    def diameter(self) -> int:
        return self.radius * 2

c = Circle(radius=2, diameter=4)

This would allow invalid state to be loaded: c = Circle(radius=2, diameter=5), and it is unclear what should happen when dumping an instance:

That does look like invalid state. But I have a use-case that resembles this form, and that use-case is fully intended:

Python Snippet

# Converting a legacy Python 3.7 & 3.8 package into
# a more correctly typed & sanitized class using Pydantic.
# (Class simplified for demonstration)
class Note:
    def __init__(self, fields: list[str] | None = None, guid: str | None = None) -> None:
        self.fields = fields or []
        self.guid = guid

    @property
    def guid(self) -> str:
        # Falls back to a hash of the first field when no guid was set
        return self._guid or guid_for(self.fields[0])

    @guid.setter
    def guid(self, val: str | None) -> None:
        # this usage of property.setter is a strange abuse of dynamic typing
        # and yet, for the end user, it's great
        self._guid = val

def guid_for(*args: object) -> str:
    """Rudimentary "hash" function."""
    joined = "".join(str(obj) for obj in args)
    return hex(hash(joined))

# An explicit guid overrides the computed one
assert Note(["abc"], guid="foobar").guid == "foobar"
# Without an explicit guid, it is derived from the first field
assert Note(["abc"]).guid not in (None, "", "foobar")

To implement this as a Pydantic model, I'm currently doing the following:

Python Snippet

from pydantic import BaseModel, ConfigDict, Field


class NoteModel(BaseModel):
  model_config = ConfigDict(extra="forbid")

  fields: list[str]
  guid_override: str | None = Field(default=None)

  @property
  def guid(self) -> str:
    if isinstance(self.guid_override, str):
      return self.guid_override
    return guid_for(self.fields[0])


assert NoteModel(fields=["abc"], guid_override="foobar").guid == "foobar"
assert NoteModel(fields=["abc"]).guid not in [None, "", "foobar"]

This works, but it's not quite what I hope to achieve:

  1. The NoteModel class should remain editable after initialization
  2. It should be possible to set all fields within the constructor.
  3. The property which determines guid should:
    1. Return an override, if explicitly set. The override can & will be completely different from what the class would have computed.
    2. Compute guid as a hash based off fields contained by Self.
  4. It should be possible to mask the extra variables from the end user, hiding them as a private attribute or similar.
  5. It goes without saying, but hopefully I'd be able to make the type checker happy.

This makes available choices slightly more interesting in several ways:

  • [1, 3-2] NoteModel cannot be frozen. This means using @cached_property on guid is unstable.
  • [2] Since a @computed_field cannot be initialized, an extra field (guid_override) is necessary.
  • [3-1] @cached_property is not the answer, because the goal is to have an explicitly set default, rather than a computed default.
  • [4] Having only one guid value in the constructor is difficult:
    • with only one guid constructor argument:
      • guid cannot have unannotated type str | None if the goal is to get str (excluding None) out of the assignment
      • guid cannot have type Annotated[str, BeforeValidator], because context for the rest of the NoteModel is needed to call the property.
      • guid cannot have type Annotated[str | None, AfterValidator], because the return type is already configured as str | None rather than as str.
      • guid cannot be constructed in any validator, because NoteModel is not frozen, and therefore the result of property(guid) is subject to change.
    • as a workaround, guid_override: str | None and guid: property[str] are defined separately
  • [5] Actually, this one is pretty easy. Pydantic is awesome.

If init_var=True were implemented and, assuming it respected property.setter (though I understand that wasn't part of the previous discussion), I imagine the initial Note class could be implemented nearly as-is, and without type violations.

Python Snippet

class NoteModel(BaseModel):
  model_config = ConfigDict(extra="forbid")

  fields: list[str]
  _guid: str | None = PrivateAttr(default=None)

  # @computed_field(init_var=True)
  @property
  def guid(self) -> str:
    return self._guid or guid_for(self.fields[0])

  @guid.setter
  def guid(self, val: str) -> None:
    self._guid = val


model = NoteModel(fields=["abc"])

first_guid = model.guid
assert first_guid not in [None, "", "foobar"]

# Fields shouldn't have mutated yet, this should be safe
second_guid = model.guid
assert first_guid == second_guid

# Fields have mutated, this should have a different value
model.fields[0] = "def"
third_guid = model.guid
assert first_guid != third_guid

# the guid itself has been set to a LiteralString
model.guid = "ghi"
fourth_guid = model.guid
assert fourth_guid == "ghi"

Compared to the first iteration of NoteModel, this class achieves all of my goals:

  • It is simpler and easier to mentally comprehend.
  • It is, without complication, extremely mutable.
  • The same name ("guid") is used to identify the property, both as an argument to the constructor and as an attribute on the instance.

It'll only save me a couple lines, but all the same, init_var would be great.
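
Until something like init_var exists, one possible approximation is a custom __init__ that accepts guid and stores it in the private attribute. This is only a sketch: it covers NoteModel(...) but not NoteModel.model_validate(...), which bypasses __init__ and would reject a "guid" key under extra="forbid". That matches the earlier point that models can be initialized in several ways.

```python
from typing import Any, Optional

from pydantic import BaseModel, ConfigDict, PrivateAttr


def guid_for(*args: object) -> str:
    """Rudimentary "hash" function, as in the snippets above."""
    joined = "".join(str(obj) for obj in args)
    return hex(hash(joined))


class NoteModel(BaseModel):
    model_config = ConfigDict(extra="forbid")

    fields: list[str]
    _guid: Optional[str] = PrivateAttr(default=None)

    def __init__(self, *, guid: Optional[str] = None, **data: Any) -> None:
        # `guid` is intercepted here, so it never reaches validation
        super().__init__(**data)
        self._guid = guid

    @property
    def guid(self) -> str:
        return self._guid or guid_for(self.fields[0])

    @guid.setter
    def guid(self, val: Optional[str]) -> None:
        self._guid = val


explicit = NoteModel(fields=["abc"], guid="foobar")
derived = NoteModel(fields=["abc"])
```

Here explicit.guid is the override and derived.guid falls back to the hash of the first field. Because guid is a plain property rather than a computed_field, it does not appear in model_dump().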
