Skip to content

feat: index or collection name will default to doc-type name #1486

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
May 6, 2023

Conversation

JoanFM
Copy link
Member

@JoanFM JoanFM commented May 4, 2023

No description provided.

Signed-off-by: Joan Fontanals Martinez <joan.martinez@jina.ai>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 96469c2 to 20d648a Compare May 4, 2023 14:04
Copy link
Member

@JohannesMessner JohannesMessner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a breaking change I think we need to be careful to guide people to un-break their code.

Concretely what will happen is that collections/tables created with a docarray version before this will be called Document, and when a user tries to access their database with a new version of docarray it will assume a table called MyDoc or something like that, and it will fail.

I think for this case we should catch the error and add a message like:

You are trying to connect to table {whatever}, but no such table exists. If you are trying to access a table that was created with DocArray version {one smaller than this one} this could be because the default table name has changed between versions. In that case, try adding the following QdrantDocIndex(..., table_name='Document'), or equivalently, adding table_name='Document' to your DBConfig

The details of this messag will depend on the particular backend i suppose

Signed-off-by: Joan Fontanals Martinez <joan.martinez@jina.ai>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 7ef4fa5 to 18ea080 Compare May 4, 2023 14:28
Signed-off-by: Johannes Messner <messnerjo@gmail.com>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 32ef595 to 8d3990c Compare May 4, 2023 15:05
@JoanFM
Copy link
Member Author

JoanFM commented May 4, 2023

Since this is a breaking change I think we need to be careful to guide people to un-break their code.

Concretely what will happen is that collections/tables created with a docarray version before this will be called Document, and when a user tries to access their database with a new version of docarray it will assume a table called MyDoc or something like that, and it will fail.

I think for this case we should catch the error and add a message like:

You are trying to connect to table {whatever}, but no such table exists. If you are trying to access a table that was created with DocArray version {one smaller than this one} this could be because the default table name has changed between versions. In that case, try adding the following QdrantDocIndex(..., table_name='Document'), or equivalently, adding table_name='Document' to your DBConfig

The details of this messag will depend on the particular backend i suppose

This would be really nice, but it does not seem so feasible, nowhere in the code I see this check being done, plus I am not sure what each of the Backends throw when this happens, or they may simply create the table/etc ...

I would simply add a breaking change note in the release notes once this is released

@JoanFM JoanFM force-pushed the feat-default-name-index branch 4 times, most recently from 421df67 to 590b23f Compare May 4, 2023 15:49
Signed-off-by: Joan Fontanals Martinez <joan.martinez@jina.ai>
@JoanFM JoanFM force-pushed the feat-default-name-index branch from 590b23f to 44fcd80 Compare May 5, 2023 07:03
Signed-off-by: AnneY <evangeline-lun@foxmail.com>
Signed-off-by: AnneY <evangeline-lun@foxmail.com>
@github-actions github-actions bot added size/l and removed size/m labels May 5, 2023
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 5f10ffd to 7c1acf9 Compare May 5, 2023 13:39
Signed-off-by: AnneY <evangeline-lun@foxmail.com>
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 7c1acf9 to 7358f5d Compare May 5, 2023 13:57
@JoanFM
Copy link
Member Author

JoanFM commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

@AnneYang720
Copy link
Contributor

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

@JoanFM
Copy link
Member Author

JoanFM commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

Can we have some tests were the default name is used for indexing and retrieving in one shot?

@AnneYang720
Copy link
Contributor

AnneYang720 commented May 5, 2023

@AnneYang720 why is it needed to change the names? why the tests do not pass with default naming?

The schema name is used as default index_name, and sometimes this index_name is used in some tests before. For example:

index = ElasticDocIndex[SimpleDoc]()

The index_name is simpledoc and is used in previous tests(SimpleDoc is used as schema), the index.num_docs() is not zero at the moment.

Can we have some tests were the default name is used for indexing and retrieving in one shot?

Some tests with schema that is only used once(like FlatSchema) use default index name. And I just changed test_get_single, test_get_multiple in elastic and test_persist_and_restore in qdrant so that they also use default index name.

@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from e201503 to 4f4c31a Compare May 5, 2023 15:07
Signed-off-by: AnneY <evangeline-lun@foxmail.com>
@AnneYang720 AnneYang720 force-pushed the feat-default-name-index branch from 4f4c31a to ff172e9 Compare May 5, 2023 15:19
@github-actions
Copy link

github-actions bot commented May 6, 2023

📝 Docs are deployed on https://ft-feat-default-name-index--jina-docs.netlify.app 🎉

@JoanFM JoanFM merged commit eb69318 into main May 6, 2023
@JoanFM JoanFM deleted the feat-default-name-index branch May 6, 2023 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Storing different documents in weaviate does not work because the index_name is not inferred from the BaseDoc subclass' name
3 participants
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy