-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added a class which performs semantic routing #1192
Conversation
Would it be possible to also add a special system prompt for the persona? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requesting changes mainly on the cursor closure.
description TEXT NOT NULL, | ||
description_embedding BLOB NOT NULL | ||
); | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good idea to have personas not be namespaced within a workspace since this allows us to share personas between workspaces. Do you think we should also add a namespaced persona concept? This is not a blocker and if we decide a namespaced persona makes sense, this can be left as a TODO for another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uuhm . Probably the concept of persona namespaces makes sense. Although right now I can't think on what would be the difference wrt. our workspaces. In other words, I like the idea but lack the use cases atm. Lets introduce when we need them
src/codegate/config.py
Outdated
@@ -57,6 +57,7 @@ class Config: | |||
force_certs: bool = False | |||
|
|||
max_fim_hash_lifetime: int = 60 * 5 # Time in seconds. Default is 5 minutes. | |||
persona_threshold = 0.75 # Min value is 0 (max similarity), max value is 2 (orthogonal) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add also a comment in regards to why the 0.75
value was chosen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, clarifiied :)
src/codegate/db/connection.py
Outdated
) | ||
|
||
try: | ||
# For Pydantic we conver the numpy array to a string when serializing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/conver/convert/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, fixed
# Pydantic doesn't support numpy arrays out of the box. Defining a custom type | ||
# Reference: https://github.com/pydantic/pydantic/issues/7017 | ||
def nd_array_custom_before_validator(x): | ||
# custome before validation logic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you might want to reclarify this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Let me know if more clarification is needed
text = re.sub(r"[^\w\s\']", " ", text) | ||
|
||
# Normalize whitespace (replace multiple spaces with a single space) | ||
text = re.sub(r"\s+", " ", text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optimization: pre-declare and pre-compile each regular expression in the constructor or even globally so this function would only need to evaluate the regex as opposed to compile+evaluate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. Changed them :)
self._embeddings_model, [cleaned_text], n_gpu_layers=self._n_gpu | ||
) | ||
# Use only the first entry in the list and make sure we have the appropriate type | ||
logger.debug("Text embedded in semantic routing", text=cleaned_text[:100]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: is 100
characters overkill for the debug log? would 20 be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with 50, 20 seems like too little. It's raw text so we should be able to know what chunk of text we embed
CREATE TABLE IF NOT EXISTS personas ( | ||
id TEXT PRIMARY KEY, -- UUID stored as TEXT | ||
name TEXT NOT NULL UNIQUE, | ||
description TEXT NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to make descriptions unique as well? If someone adds two similar descriptions, it would be very hard for the matcher to work properly. Perhaps enforcing uniqueness is the way to go as a first step, and in a further iteration we could check for description similarity. wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a really nice suggestion! But actually making the descriptions unique won't cut it. If the difference is a single letter then we will accept the new description. What I will do is to check the cosine distance to the existing descriptions and only accept a new persona if it's sufficiently different. Will upload a commit soon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool beans
Related to: #1055 For the current implementation of muxing we only need to match a single Persona at a time. For example: 1. mux1 -> persona Architect -> openai o1 2. mux2 -> catch all -> openai gpt4o In the above case we would only need to know if the request matches the persona `Architect`. It's not needed to match any extra personas even if they exist in DB. This PR introduces what's necessary to do the above without actually wiring in muxing rules. The PR: - Creates the persona table in DB - Adds methods to write and read to the new persona table - Implements a function to check if a query matches to the specified persona To check more about the personas and the queries please check the unit tests
I agree with @aponcedeleonch , it would be ideal to have a custom prompt per rule; not per persona. |
9dac0af
to
0e37312
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shipit 🚢
Related to: #1055
For the current implementation of muxing we only need to match a single Persona at a time. For example:
In the above case we would only need to know if the request matches the persona
Architect
. It's not needed to match any extra personas even if they exist in DB.This PR introduces what's necessary to do the above without actually wiring in muxing rules. The PR:
To check more about the personas and the queries please check the unit tests