
[DRAFT] ESQL: Add TEXT_EMBEDDING function for dense vector embeddings #131131


Draft: wants to merge 30 commits into main

afoucret (Contributor) commented Jul 11, 2025

Summary

Implements the TEXT_EMBEDDING function for ES|QL, which generates dense vector embeddings from text using an inference model.

Function Signature: TEXT_EMBEDDING(text: string, inference_id: string) -> dense_vector

Example Usage:

FROM documents 
| WHERE KNN(embedding_field, TEXT_EMBEDDING(content, "my-embedding-deployment"), 10)

Implementation Status

Completed in this PR:

  • TEXT_EMBEDDING_FUNCTION capability with snapshot build gating

  • Core Function Infrastructure

    • TextEmbedding function class with proper type validation and serialization
    • InferenceFunction interface for inference-based functions
    • Function registration in EsqlFunctionRegistry
  • Analysis of the inference function (validate existence and type of the inference endpoint)

    • Refactored pre-analysis so that it collects inference IDs from both inference plans and inference functions
    • Added validation of inference functions during analysis
  • Add a pre-optimizer async phase to the ES|QL query execution

  • Documentation generated from the annotations

  • Execute the inference in the pre-optimizer

  • Integration tests and end-to-end validation
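
The pre-optimizer step listed above can be pictured with a small sketch. This is not the PR's code (which lives in the Elasticsearch Java codebase); it is a language-neutral illustration in Python, and every name in it (`Literal`, `FieldRef`, `Call`, `pre_optimize`, `embed`) is hypothetical. It assumes that only TEXT_EMBEDDING calls whose arguments are all constants can be resolved before optimization, and that the inference call is asynchronous.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Union


# Hypothetical expression nodes; they do not mirror the PR's classes.
@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class FieldRef:
    name: str


@dataclass(frozen=True)
class Call:
    name: str
    args: tuple


Expr = Union[Literal, FieldRef, Call]
# Assumed shape of the inference call: (text, inference_id) -> embedding.
EmbedFn = Callable[[str, str], Awaitable[list]]


async def pre_optimize(expr: Expr, embed: EmbedFn) -> Expr:
    """Replace constant TEXT_EMBEDDING calls with a literal vector."""
    if not isinstance(expr, Call):
        return expr
    # Resolve children first so nested calls are rewritten bottom-up;
    # sibling sub-expressions are awaited concurrently.
    args = tuple(await asyncio.gather(*(pre_optimize(a, embed) for a in expr.args)))
    if expr.name == "TEXT_EMBEDDING" and all(isinstance(a, Literal) for a in args):
        text, inference_id = (a.value for a in args)
        return Literal(await embed(text, inference_id))
    # Anything else (including calls with non-constant arguments) is kept.
    return Call(expr.name, args)
```

In this sketch a call such as `KNN(embedding_field, TEXT_EMBEDDING("hello", "my-id"), 10)` would reach the optimizer with the inner call already replaced by a literal vector, while a call whose text argument is a field reference is left untouched.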

🚧 TODO (Before Merge):

  • Better CSV tests
  • Integration tests

Notes

The function is enabled only in snapshot builds.
The TEXT_EMBEDDING function is tracked in #131022

elasticsearchmachine added the needs:triage (Requires assignment of a team area label) and v9.2.0 labels Jul 11, 2025
afoucret marked this pull request as draft July 11, 2025 21:27
afoucret changed the title from "[DRAFT] ESQL: Add EMBED_TEXT function for dense vector embeddings" to "[DRAFT] ESQL: Add TEXT_EMBEDDING function for dense vector embeddings" Jul 11, 2025
afoucret force-pushed the esql-text-embedding-function branch from 0b63243 to 0f9279b July 16, 2025 16:48
afoucret force-pushed the esql-text-embedding-function branch from 8ef234f to f58d40a July 17, 2025 08:56
afoucret force-pushed the esql-text-embedding-function branch from 03cc2d0 to c644d6d July 17, 2025 09:29
afoucret force-pushed the esql-text-embedding-function branch from c644d6d to aee9c19 July 17, 2025 13:17
afoucret force-pushed the esql-text-embedding-function branch from 8588320 to c30c0ec July 17, 2025 13:50
afoucret force-pushed the esql-text-embedding-function branch from c30c0ec to 63c5539 July 17, 2025 15:09
afoucret added 28 commits July 21, 2025 15:45
Implements the core evaluation logic for the TEXT_EMBEDDING function in ES|QL:
- Add InferenceFunctionEvaluator interface for all inference functions
- Implement TextEmbeddingFunctionEvaluator with support for float/byte/bit vectors
- Integration with InferenceRunner for async model execution
- Proper conversion of embedding results to DENSE_VECTOR data type
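
As a rough illustration of the conversion step in the commit message above, the following Python sketch normalises an embedding to a list of floats. It is not the PR's evaluator: the function name `to_dense_vector` and the `element_type` labels are hypothetical. It also assumes that byte and bit embeddings both arrive as signed 8-bit integers, with bit vectors left packed at eight dimensions per byte; the PR may handle bit vectors differently.

```python
from typing import Sequence


def to_dense_vector(values: Sequence, element_type: str) -> list:
    """Normalise an embedding to a list of floats.

    element_type is one of "float", "byte" or "bit" (hypothetical labels).
    """
    if element_type == "float":
        return [float(v) for v in values]
    if element_type in ("byte", "bit"):
        # Assumption: both kinds are delivered as signed 8-bit integers;
        # bit vectors are kept packed rather than expanded to 0/1 values.
        for v in values:
            if not -128 <= v <= 127:
                raise ValueError(f"value out of signed byte range: {v}")
        return [float(v) for v in values]
    raise ValueError(f"unsupported element type: {element_type}")
```

Widening every element to a float gives downstream operators a single representation to work with, at the cost of losing the compact storage of byte and bit embeddings.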
Integrates the TEXT_EMBEDDING function with the ESQL execution pipeline:
- Update PreOptimizer to handle TEXT_EMBEDDING function evaluation
- Add TextEmbedding function definition and type validation
- Integrate with InferenceServices for model execution
- Add comprehensive tests in PreOptimizerTests
- Update session and execution components for async function support