
fix: use docker compose #1905


Merged 1 commit on Aug 16, 2024
fix: use docker compose
Signed-off-by: YuXuan Tay <wyextay@gmail.com>
yxtay committed Aug 16, 2024
commit 2987e8945fa38b05bad6d6a1194f870cc12e42dd
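The commit is a mechanical rename across the docs: the standalone Compose V1 binary (`docker-compose`) is replaced by the Compose V2 plugin invocation (`docker compose`). Every hunk below is a variant of:

```diff
-docker-compose up -d
+docker compose up -d
```

Compose V1 is no longer maintained, so the space-separated plugin form is the invocation Docker currently documents.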
4 changes: 2 additions & 2 deletions docs/user_guide/storing/doc_store/store_s3.md
@@ -12,7 +12,7 @@ When you want to use your [`DocList`][docarray.DocList] in another place, you ca
## Push & pull
To use the store [`DocList`][docarray.DocList] on S3, you need to pass an S3 path to the function starting with `'s3://'`.

-In the following demo, we use `MinIO` as a local S3 service. You could use the following docker-compose file to start the service in a Docker container.
+In the following demo, we use `MinIO` as a local S3 service. You could use the following docker compose file to start the service in a Docker container.

```yaml
version: "3"
@@ -26,7 +26,7 @@ services:
```
Save the above file as `docker-compose.yml` and run the following line in the same folder as the file.
```cmd
-docker-compose up
+docker compose up
```
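The `docker-compose.yml` referenced in the hunk above is collapsed by the diff view. For readers who want to reproduce the demo, a minimal MinIO service along these lines should work; the image, credentials, and ports here are illustrative assumptions, not the file from the PR:

```yaml
version: "3"
services:
  minio:
    image: quay.io/minio/minio        # assumed image; pin a release tag in practice
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin     # default credentials; change for anything non-local
      MINIO_ROOT_PASSWORD: minioadmin
    ports:
      - "9000:9000"                   # S3 API endpoint used for push & pull
      - "9001:9001"                   # web console
```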

```python
16 changes: 9 additions & 7 deletions docs/user_guide/storing/index_elastic.md
@@ -45,13 +45,17 @@ from docarray.index import ElasticDocIndex # or ElasticV7DocIndex
from docarray.typing import NdArray
import numpy as np


# Define the document schema.
class MyDoc(BaseDoc):
title: str
title: str
embedding: NdArray[128]


# Create dummy documents.
docs = DocList[MyDoc](MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10))
docs = DocList[MyDoc](
MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10)
)

# Initialize a new ElasticDocIndex instance and add the documents to the index.
doc_index = ElasticDocIndex[MyDoc](index_name='my_index')
@@ -67,7 +71,7 @@ retrieved_docs = doc_index.find(query, search_field='embedding', limit=10)
## Initialize


-You can use docker-compose to create a local Elasticsearch service with the following `docker-compose.yml`.
+You can use docker compose to create a local Elasticsearch service with the following `docker-compose.yml`.

```yaml
version: "3.3"
@@ -91,7 +95,7 @@ networks:
Run the following command in the folder of the above `docker-compose.yml` to start the service:

```bash
-docker-compose up
+docker compose up
```

### Schema definition
@@ -225,9 +229,7 @@ You can also search for multiple documents at once, in a batch, using the [`find

```python
# create some query Documents
queries = DocList[SimpleDoc](
SimpleDoc(tensor=np.random.rand(128)) for i in range(3)
)
queries = DocList[SimpleDoc](SimpleDoc(tensor=np.random.rand(128)) for i in range(3))

# find similar documents
matches, scores = doc_index.find_batched(queries, search_field='tensor', limit=5)
27 changes: 20 additions & 7 deletions docs/user_guide/storing/index_milvus.md
@@ -27,13 +27,17 @@ from docarray.typing import NdArray
from pydantic import Field
import numpy as np


# Define the document schema.
class MyDoc(BaseDoc):
title: str
title: str
embedding: NdArray[128] = Field(is_embedding=True)


# Create dummy documents.
docs = DocList[MyDoc](MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10))
docs = DocList[MyDoc](
MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10)
)

# Initialize a new MilvusDocumentIndex instance and add the documents to the index.
doc_index = MilvusDocumentIndex[MyDoc](index_name='tmp_index_1')
@@ -55,7 +59,7 @@ wget https://github.com/milvus-io/milvus/releases/download/v2.2.11/milvus-standa

And start Milvus by running:
```shell
-sudo docker-compose up -d
+sudo docker compose up -d
```

Learn more on [Milvus documentation](https://milvus.io/docs/install_standalone-docker.md).
@@ -142,10 +146,12 @@ Now that you have a Document Index, you can add data to it, using the [`index()`
import numpy as np
from docarray import DocList


class MyDoc(BaseDoc):
title: str
title: str
embedding: NdArray[128] = Field(is_embedding=True)


doc_index = MilvusDocumentIndex[MyDoc](index_name='tmp_index_5')

# create some random data
@@ -273,7 +279,9 @@ class Book(BaseDoc):
embedding: NdArray[10] = Field(is_embedding=True)


books = DocList[Book]([Book(price=i * 10, embedding=np.random.rand(10)) for i in range(10)])
books = DocList[Book](
[Book(price=i * 10, embedding=np.random.rand(10)) for i in range(10)]
)
book_index = MilvusDocumentIndex[Book](index_name='tmp_index_6')
book_index.index(books)

@@ -312,8 +320,11 @@ class SimpleSchema(BaseDoc):
price: int
embedding: NdArray[128] = Field(is_embedding=True)


# Create dummy documents.
docs = DocList[SimpleSchema](SimpleSchema(price=i, embedding=np.random.rand(128)) for i in range(10))
docs = DocList[SimpleSchema](
SimpleSchema(price=i, embedding=np.random.rand(128)) for i in range(10)
)

doc_index = MilvusDocumentIndex[SimpleSchema](index_name='tmp_index_7')
doc_index.index(docs)
@@ -407,7 +418,9 @@ You can pass any of the above as keyword arguments to the `__init__()` method or

```python
class SimpleDoc(BaseDoc):
tensor: NdArray[128] = Field(is_embedding=True, index_type='IVF_FLAT', metric_type='L2')
tensor: NdArray[128] = Field(
is_embedding=True, index_type='IVF_FLAT', metric_type='L2'
)


doc_index = MilvusDocumentIndex[SimpleDoc](index_name='tmp_index_10')
51 changes: 31 additions & 20 deletions docs/user_guide/storing/index_qdrant.md
@@ -22,13 +22,17 @@ from docarray.index import QdrantDocumentIndex
from docarray.typing import NdArray
import numpy as np


# Define the document schema.
class MyDoc(BaseDoc):
title: str
title: str
embedding: NdArray[128]


# Create dummy documents.
docs = DocList[MyDoc](MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10))
docs = DocList[MyDoc](
MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10)
)

# Initialize a new QdrantDocumentIndex instance and add the documents to the index.
doc_index = QdrantDocumentIndex[MyDoc](host='localhost')
@@ -46,7 +50,7 @@ You can initialize [QdrantDocumentIndex][docarray.index.backends.qdrant.QdrantDo

**Connecting to a local Qdrant instance running as a Docker container**

-You can use docker-compose to create a local Qdrant service with the following `docker-compose.yml`.
+You can use docker compose to create a local Qdrant service with the following `docker-compose.yml`.

```yaml
version: '3.8'
@@ -66,7 +70,7 @@ services:
Run the following command in the folder of the above `docker-compose.yml` to start the service:

```bash
-docker-compose up
+docker compose up
```

Next, you can create a [QdrantDocumentIndex][docarray.index.backends.qdrant.QdrantDocumentIndex] instance using:
@@ -89,7 +93,7 @@ doc_index = QdrantDocumentIndex[MyDoc](qdrant_config)
**Connecting to Qdrant Cloud service**
```python
qdrant_config = QdrantDocumentIndex.DBConfig(
"https://YOUR-CLUSTER-URL.aws.cloud.qdrant.io",
"https://YOUR-CLUSTER-URL.aws.cloud.qdrant.io",
api_key="<your-api-key>",
)
doc_index = QdrantDocumentIndex[MyDoc](qdrant_config)
@@ -317,9 +321,7 @@ book_index = QdrantDocumentIndex[Book]()
book_index.index(books)

# filter for books that are cheaper than 29 dollars
query = rest.Filter(
must=[rest.FieldCondition(key='price', range=rest.Range(lt=29))]
)
query = rest.Filter(must=[rest.FieldCondition(key='price', range=rest.Range(lt=29))])
cheap_books = book_index.filter(filter_query=query)

assert len(cheap_books) == 3
@@ -372,24 +374,26 @@ class SimpleDoc(BaseDoc):

doc_index = QdrantDocumentIndex[SimpleDoc](host='localhost')
index_docs = [
SimpleDoc(id=f'{i}', tens=np.ones(10) * i, num=int(i / 2), text=f'Lorem ipsum {int(i/2)}')
SimpleDoc(
id=f'{i}', tens=np.ones(10) * i, num=int(i / 2), text=f'Lorem ipsum {int(i/2)}'
)
for i in range(10)
]
doc_index.index(index_docs)

find_query = np.ones(10)
text_search_query = 'ipsum 1'
filter_query = rest.Filter(
must=[
rest.FieldCondition(
key='num',
range=rest.Range(
gte=1,
lt=5,
),
)
]
)
must=[
rest.FieldCondition(
key='num',
range=rest.Range(
gte=1,
lt=5,
),
)
]
)

query = (
doc_index.build_query()
@@ -437,6 +441,8 @@ import numpy as np
from docarray import BaseDoc, DocList
from docarray.typing import NdArray
from docarray.index import QdrantDocumentIndex


class MyDoc(BaseDoc):
text: str
embedding: NdArray[128]
@@ -445,7 +451,12 @@ class MyDoc(BaseDoc):
Now, we can instantiate our Index and add some data:
```python
docs = DocList[MyDoc](
[MyDoc(embedding=np.random.rand(10), text=f'I am the first version of Document {i}') for i in range(100)]
[
MyDoc(
embedding=np.random.rand(10), text=f'I am the first version of Document {i}'
)
for i in range(100)
]
)
index = QdrantDocumentIndex[MyDoc]()
index.index(docs)
38 changes: 21 additions & 17 deletions docs/user_guide/storing/index_weaviate.md
@@ -27,13 +27,17 @@ from docarray.typing import NdArray
from pydantic import Field
import numpy as np


# Define the document schema.
class MyDoc(BaseDoc):
title: str
title: str
embedding: NdArray[128] = Field(is_embedding=True)


# Create dummy documents.
docs = DocList[MyDoc](MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10))
docs = DocList[MyDoc](
MyDoc(title=f'title #{i}', embedding=np.random.rand(128)) for i in range(10)
)

# Initialize a new WeaviateDocumentIndex instance and add the documents to the index.
doc_index = WeaviateDocumentIndex[MyDoc]()
@@ -59,7 +63,7 @@ There are multiple ways to start a Weaviate instance, depending on your use case
| ----- | ----- | ----- | ----- |
| **Weaviate Cloud Services (WCS)** | Development and production | Limited | **Recommended for most users** |
| **Embedded Weaviate** | Experimentation | Limited | Experimental (as of Apr 2023) |
-| **Docker-Compose** | Development | Yes | **Recommended for development + customizability** |
+| **Docker Compose** | Development | Yes | **Recommended for development + customizability** |
| **Kubernetes** | Production | Yes | |

### Instantiation instructions
@@ -70,7 +74,7 @@ Go to the [WCS console](https://console.weaviate.cloud) and create an instance u

Weaviate instances on WCS come pre-configured, so no further configuration is required.

-**Docker-Compose (self-managed)**
+**Docker Compose (self-managed)**

Get a configuration file (`docker-compose.yaml`). You can build it using [this interface](https://weaviate.io/developers/weaviate/installation/docker-compose), or download it directly with:

@@ -84,20 +88,20 @@ Where `v<WEAVIATE_VERSION>` is the actual version, such as `v1.18.3`.
curl -o docker-compose.yml "https://configuration.weaviate.io/v2/docker-compose/docker-compose.yml?modules=standalone&runtime=docker-compose&weaviate_version=v1.18.3"
```

-**Start up Weaviate with Docker-Compose**
+**Start up Weaviate with Docker Compose**

Then you can start up Weaviate by running from a shell:

```shell
-docker-compose up -d
+docker compose up -d
```

**Shut down Weaviate**

Then you can shut down Weaviate by running from a shell:

```shell
-docker-compose down
+docker compose down
```

**Notes**
@@ -107,7 +111,7 @@ Unless data persistence or backups are set up, shutting down the Docker instance
See documentation on [Persistent volume](https://weaviate.io/developers/weaviate/installation/docker-compose#persistent-volume) and [Backups](https://weaviate.io/developers/weaviate/configuration/backups) to prevent this if persistence is desired.

```bash
-docker-compose up -d
+docker compose up -d
```

**Embedded Weaviate (from the application)**
@@ -192,9 +196,7 @@ dbconfig = WeaviateDocumentIndex.DBConfig(
### Create an instance
Let's connect to a local Weaviate service and instantiate a `WeaviateDocumentIndex` instance:
```python
dbconfig = WeaviateDocumentIndex.DBConfig(
host="http://localhost:8080"
)
dbconfig = WeaviateDocumentIndex.DBConfig(host="http://localhost:8080")
doc_index = WeaviateDocumentIndex[MyDoc](db_config=dbconfig)
```

@@ -378,10 +380,10 @@ the [`find()`][docarray.index.abstract.BaseDocIndex.find] method:
embedding=np.array([1, 2]),
file=np.random.rand(100),
)

# find similar documents
matches, scores = doc_index.find(query, limit=5)

print(f"{matches=}")
print(f"{matches.text=}")
print(f"{scores=}")
@@ -428,10 +430,10 @@ You can also search for multiple documents at once, in a batch, using the [`find
)
for i in range(3)
)

# find similar documents
matches, scores = doc_index.find_batched(queries, limit=5)

print(f"{matches=}")
print(f"{matches[0].text=}")
print(f"{scores=}")
@@ -481,7 +483,9 @@ class Book(BaseDoc):
embedding: NdArray[10] = Field(is_embedding=True)


books = DocList[Book]([Book(price=i * 10, embedding=np.random.rand(10)) for i in range(10)])
books = DocList[Book](
[Book(price=i * 10, embedding=np.random.rand(10)) for i in range(10)]
)
book_index = WeaviateDocumentIndex[Book](index_name='tmp_index')
book_index.index(books)

@@ -602,7 +606,7 @@ del doc_index[ids[1:]] # del by list of ids

**WCS instances come pre-configured**, and as such additional settings are not configurable outside of those chosen at creation, such as whether to enable authentication.

-For other cases, such as **Docker-Compose deployment**, its settings can be modified through the configuration file, such as the `docker-compose.yaml` file.
+For other cases, such as **Docker Compose deployment**, its settings can be modified through the configuration file, such as the `docker-compose.yaml` file.

Some of the more commonly used settings include:
