Content-Length: 488053 | pFad | https://github.com/stacklok/codegate/pull/1192

1F Added a class which performs semantic routing by aponcedeleonch · Pull Request #1192 · stacklok/codegate · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added a class which performs semantic routing #1192

Merged
merged 2 commits into from
Mar 4, 2025
Merged

Conversation

aponcedeleonch
Copy link
Contributor

Related to: #1055

For the current implementation of muxing we only need to match a single Persona at a time. For example:

  1. mux1 -> persona Architect -> openai o1
  2. mux2 -> catch all -> openai gpt4o

In the above case we would only need to know if the request matches the persona Architect. It's not needed to match any extra personas even if they exist in DB.

This PR introduces what's necessary to do the above without actually wiring in muxing rules. The PR:

  • Creates the persona table in DB
  • Adds methods to write and read to the new persona table
  • Implements a function to check if a query matches to the specified persona

To check more about the personas and the queries please check the unit tests

@jhrozek
Copy link
Contributor

jhrozek commented Mar 4, 2025

Would it be possible to also add a special system prompt for the persona?

@aponcedeleonch
Copy link
Contributor Author

@jhrozek that's a nice suggestion. Yes, we can do that. Actually it would be nice to add a special system prompt for each of our muxing rules. Since personas is a muxing rule they would have a system prompt. The idea is tracked here: #873

Copy link
Contributor

@JAORMX JAORMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

requesting changes mainly on the cursor closure.

description TEXT NOT NULL,
description_embedding BLOB NOT NULL
);
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a good idea to have personas not be namespaced within a workspace since this allows us to share personas between workspaces. Do you think we should also add a namespaced persona concept? This is not a blocker and if we decide a namespaced persona makes sense, this can be left as a TODO for another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uuhm . Probably the concept of persona namespaces makes sense. Although right now I can't think on what would be the difference wrt. our workspaces. In other words, I like the idea but lack the use cases atm. Lets introduce when we need them

@@ -57,6 +57,7 @@ class Config:
force_certs: bool = False

max_fim_hash_lifetime: int = 60 * 5 # Time in seconds. Default is 5 minutes.
persona_threshold = 0.75 # Min value is 0 (max similarity), max value is 2 (orthogonal)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add also a comment in regards to why the 0.75 value was chosen?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, clarifiied :)

)

try:
# For Pydantic we conver the numpy array to a string when serializing.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/conver/convert/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, fixed

# Pydantic doesn't support numpy arrays out of the box. Defining a custom type
# Reference: https://github.com/pydantic/pydantic/issues/7017
def nd_array_custom_before_validator(x):
# custome before validation logic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you might want to reclarify this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done! Let me know if more clarification is needed

text = re.sub(r"[^\w\s\']", " ", text)

# Normalize whitespace (replace multiple spaces with a single space)
text = re.sub(r"\s+", " ", text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optimization: pre-declare and pre-compile each regular expression in the constructor or even globally so this function would only need to evaluate the regex as opposed to compile+evaluate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Changed them :)

self._embeddings_model, [cleaned_text], n_gpu_layers=self._n_gpu
)
# Use only the first entry in the list and make sure we have the appropriate type
logger.debug("Text embedded in semantic routing", text=cleaned_text[:100])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is 100 characters overkill for the debug log? would 20 be enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went with 50, 20 seems like too little. It's raw text so we should be able to know what chunk of text we embed

CREATE TABLE IF NOT EXISTS personas (
id TEXT PRIMARY KEY, -- UUID stored as TEXT
name TEXT NOT NULL UNIQUE,
description TEXT NOT NULL,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to make descriptions unique as well? If someone adds two similar descriptions, it would be very hard for the matcher to work properly. Perhaps enforcing uniqueness is the way to go as a first step, and in a further iteration we could check for description similarity. wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really nice suggestion! But actually making the descriptions unique won't cut it. If the difference is a single letter then we will accept the new description. What I will do is to check the cosine distance to the existing descriptions and only accept a new persona if it's sufficiently different. Will upload a commit soon

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool beans

Related to: #1055

For the current implementation of muxing we only need
to match a single Persona at a time. For example:
1. mux1 -> persona Architect -> openai o1
2. mux2 -> catch all -> openai gpt4o

In the above case we would only need to know if the request
matches the persona `Architect`. It's not needed to match
any extra personas even if they exist in DB.

This PR introduces what's necessary to do the above without
actually wiring in muxing rules. The PR:
- Creates the persona table in DB
- Adds methods to write and read to the new persona table
- Implements a function to check if a query matches to the specified persona

To check more about the personas and the queries please check the unit tests
@JAORMX
Copy link
Contributor

JAORMX commented Mar 4, 2025

@jhrozek that's a nice suggestion. Yes, we can do that. Actually it would be nice to add a special system prompt for each of our muxing rules. Since personas is a muxing rule they would have a system prompt. The idea is tracked here: #873

I agree with @aponcedeleonch , it would be ideal to have a custom prompt per rule; not per persona.

Copy link
Contributor

@JAORMX JAORMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shipit 🚢

@aponcedeleonch aponcedeleonch merged commit f61f357 into main Mar 4, 2025
11 checks passed
@aponcedeleonch aponcedeleonch deleted the semantic-routing branch March 4, 2025 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/stacklok/codegate/pull/1192

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy