server : fix pooled embedding output #14645

Merged
ggerganov merged 1 commit into ggml-org:master on Jul 12, 2025

Conversation

iamlemec
Collaborator

Fix pooled embedding server output issue reported in #14543. When using pooled embeddings, the response is now one vector per prompt entry. The response for unpooled embeddings is unchanged.

I'm not very familiar with reranking, so I didn't touch that path. But it seems like a similar fix could be in order there.
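As a rough illustration of the behaviour described above (one vector per prompt entry when pooling is enabled), here is a minimal Python sketch that counts the vectors in each result. The payload shape (a list of objects with `index` and `embedding` fields) and the helper names `vectors_per_prompt` and `check_pooled` are assumptions made for this example, not taken from the server code.

```python
from typing import Any


def vectors_per_prompt(embedding: Any) -> int:
    """Count the vectors held in one prompt's `embedding` field.

    A flat list of numbers counts as one vector; a list of lists
    counts as one vector per inner list.
    """
    if not isinstance(embedding, list) or not embedding:
        raise ValueError("embedding must be a non-empty list")
    if all(isinstance(x, (int, float)) for x in embedding):
        return 1
    if all(isinstance(x, list) for x in embedding):
        return len(embedding)
    raise ValueError("embedding mixes numbers and lists")


def check_pooled(results: list) -> bool:
    """Return True if every prompt entry carries exactly one vector."""
    return all(vectors_per_prompt(r["embedding"]) == 1 for r in results)


# Hypothetical payloads; the real response shape may differ.
pooled = [
    {"index": 0, "embedding": [[0.1, 0.2, 0.3]]},
    {"index": 1, "embedding": [[0.4, 0.5, 0.6]]},
]
unpooled = [
    {"index": 0, "embedding": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]},
]

pooled_ok = check_pooled(pooled)
unpooled_ok = check_pooled(unpooled)
```

With these sample payloads, `pooled_ok` is true and `unpooled_ok` is false, since the unpooled entry carries one vector per token rather than one per prompt.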

@ggerganov ggerganov merged commit 0c1df14 into ggml-org:master Jul 12, 2025
48 checks passed
@ggerganov
Member

I'm also not very familiar with rerank. This change looks OK, so merging.

@CISC CISC linked an issue Jul 12, 2025 that may be closed by this pull request
@brunette69-ruby

Thanks. For the record, while investigating this I found a difference between two endpoints: 1) /embedding and 2) the OpenAI-style URL v1/embeddings. v1/embeddings seemed to work, producing one vector. Reranking via v1 also seemed to work.
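To make the "one vector" observation for v1/embeddings concrete, here is a small Python sketch that pulls the vectors out of an OpenAI-style embeddings response body. It assumes the server follows the OpenAI response layout (a `data` list whose items each carry an `index` and a flat `embedding`); the function name `extract_vectors` and the sample body are hypothetical.

```python
def extract_vectors(body: dict) -> list:
    """Return one embedding per input, ordered by the item's `index`."""
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hypothetical response body in the OpenAI embeddings layout.
sample = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.4, 0.5]},
        {"object": "embedding", "index": 0, "embedding": [0.1, 0.2]},
    ],
}

vectors = extract_vectors(sample)
```

Here `vectors` holds two flat vectors, one per input, in input order.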

Development

Successfully merging this pull request may close these issues.

Misc. bug: Embedding/pooling: I receive 10xvector not 1xvector
3 participants