feat: add BaseDocWoId #1803

Merged: 5 commits, Sep 26, 2023
Changes from 3 commits
69 changes: 38 additions & 31 deletions docarray/base_doc/doc.py
@@ -61,39 +61,12 @@
 ExcludeType = Optional[Union['AbstractSetIntStr', 'MappingIntStrAny']]


-class BaseDoc(BaseModel, IOMixin, UpdateMixin, BaseNode):
+class BaseDocWithoutId(BaseModel, IOMixin, UpdateMixin, BaseNode):
     """
-    BaseDoc is the base class for all Documents. This class should be subclassed
-    to create new Document types with a specific schema.
-
-    The schema of a Document is defined by the fields of the class.
-
-    Example:
-    ```python
-    from docarray import BaseDoc
-    from docarray.typing import NdArray, ImageUrl
-    import numpy as np
-
-
-    class MyDoc(BaseDoc):
-        embedding: NdArray[512]
-        image: ImageUrl
-
-
-    doc = MyDoc(embedding=np.zeros(512), image='https://example.com/image.jpg')
-    ```
-
-
-    BaseDoc is a subclass of [pydantic.BaseModel](
-    https://docs.pydantic.dev/usage/models/) and can be used in a similar way.
+    BaseDocWoId is the class behind BaseDoc, it should not be used directly unless you know what you are doing.
+    It is basically a BaseDoc without the ID field
     """

-    id: Optional[ID] = Field(
-        description='The ID of the BaseDoc. This is useful for indexing in vector stores. If not set by user, it will automatically be assigned a random value',
-        default_factory=lambda: ID(os.urandom(16).hex()),
-        example=os.urandom(16).hex(),
-    )
-
     if is_pydantic_v2:

         class Config:
@@ -545,7 +518,7 @@ def parse_raw(
         :param allow_pickle: allow pickle protocol
         :return: a document
         """
-        return super(BaseDoc, cls).parse_raw(
+        return super(BaseDocWithoutId, cls).parse_raw(
             b,
             content_type=content_type,
             encoding=encoding,
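
A note on why the name inside `super(...)` has to follow the class rename: two-argument `super` requires `cls` to be a subclass of the class it names, so a stale name fails for any class outside that hierarchy. The sketch below illustrates this with plain Python classes. `Root`, `WithoutId`, `WithId`, and `Standalone` are hypothetical stand-ins, not docarray or pydantic classes.

```python
class Root:
    """Hypothetical stand-in for the base class that owns parse_raw."""

    @classmethod
    def parse_raw(cls, data: str) -> str:
        return f"{cls.__name__} parsed {data}"


class WithoutId(Root):
    """Hypothetical stand-in for BaseDocWithoutId."""

    @classmethod
    def parse_raw(cls, data: str) -> str:
        # Name the class that defines this method, so the lookup starts
        # just after WithoutId in the MRO of whichever subclass calls it.
        return super(WithoutId, cls).parse_raw(data.strip())


class WithId(WithoutId):
    """Hypothetical stand-in for BaseDoc: inherits parse_raw unchanged."""


class Standalone(Root):
    """Names WithId in super() without deriving from it (the stale-name case)."""

    @classmethod
    def parse_raw(cls, data: str) -> str:
        return super(WithId, cls).parse_raw(data)


ok_parent = WithoutId.parse_raw(" a ")
ok_child = WithId.parse_raw(" a ")

try:
    Standalone.parse_raw("a")
    stale_name_error = None
except TypeError as exc:
    # super(type, obj) rejects a cls that is not a subtype of the named class.
    stale_name_error = exc
```

Both `WithoutId` and `WithId` resolve through the same method, while `Standalone` raises a `TypeError` because it is not a subclass of the class named in `super`.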
@@ -582,3 +555,37 @@ def _exclude_docarray(
         )

     to_json = BaseModel.model_dump_json if is_pydantic_v2 else json
+
+
+class BaseDoc(BaseDocWithoutId):
+    """
+    BaseDoc is the base class for all Documents. This class should be subclassed
+    to create new Document types with a specific schema.
+
+    The schema of a Document is defined by the fields of the class.
+
+    Example:
+    ```python
+    from docarray import BaseDoc
+    from docarray.typing import NdArray, ImageUrl
+    import numpy as np
+
+
+    class MyDoc(BaseDoc):
+        embedding: NdArray[512]
+        image: ImageUrl
+
+
+    doc = MyDoc(embedding=np.zeros(512), image='https://example.com/image.jpg')
+    ```
+
+
+    BaseDoc is a subclass of [pydantic.BaseModel](
+    https://docs.pydantic.dev/usage/models/) and can be used in a similar way.
+    """
+
+    id: Optional[ID] = Field(
+        description='The ID of the BaseDoc. This is useful for indexing in vector stores. If not set by user, it will automatically be assigned a random value',
+        default_factory=lambda: ID(os.urandom(16).hex()),
+        example=os.urandom(16).hex(),
+    )
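
The split in this diff (shared behaviour in a base class, the `id` field added only by the subclass) can be sketched without docarray installed. The example below uses standard-library dataclasses rather than pydantic, so it shows the pattern only, not the library's implementation. `DocWithoutId`, `Doc`, `text`, and `summary` are hypothetical names; the `os.urandom(16).hex()` default is taken from the diff.

```python
import os
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DocWithoutId:
    """Hypothetical stand-in for BaseDocWithoutId: shared behaviour, no id field."""

    text: str = ""

    def summary(self) -> str:
        return f"{type(self).__name__}(text={self.text!r})"


@dataclass
class Doc(DocWithoutId):
    """Hypothetical stand-in for BaseDoc: adds only the id field."""

    # Same default as in the diff: 16 random bytes, hex-encoded (32 characters).
    id: Optional[str] = field(default_factory=lambda: os.urandom(16).hex())


plain = DocWithoutId(text="hello")
with_id = Doc(text="hello")
explicit = Doc(text="hello", id="abc")
```

Here `plain` has no `id` attribute at all, `with_id.id` is a fresh 32-character hex string, and `explicit.id` keeps the caller-supplied value. Subclasses of `Doc` inherit the generated ID, while subclasses of `DocWithoutId` opt out of it.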