Added a class which performs semantic routing #1192

aponcedeleonch · 2025-03-03T16:39:24Z

Related to: #1055

For the current implementation of muxing we only need to match a single Persona at a time. For example:

1. mux1 -> persona Architect -> openai o1
2. mux2 -> catch all -> openai gpt4o

In the above case we only need to know whether the request matches the persona `Architect`. There is no need to match any other personas, even if they exist in the DB.

This PR introduces what's necessary to do the above without actually wiring in muxing rules. The PR:

- Creates the persona table in the DB
- Adds methods to write to and read from the new persona table
- Implements a function to check whether a query matches the specified persona

For more detail on the personas and the queries, please see the unit tests.
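As a rough illustration of the single-persona matching described above, here is a minimal sketch of how ordered muxing rules could consult a persona check. The names `MuxRule`, `pick_model`, and `keyword_matcher` are hypothetical and not part of this PR, which deliberately does not wire in muxing rules; the keyword matcher is only a stand-in for the embedding-based check.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MuxRule:
    """Hypothetical rule: route to `model` when `persona` matches (None = catch-all)."""

    persona: Optional[str]
    model: str


def pick_model(
    query: str,
    rules: list[MuxRule],
    matches_persona: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the model of the first rule that applies to the query."""
    for rule in rules:
        # A catch-all rule applies to every request.
        if rule.persona is None:
            return rule.model
        # Only the persona named by this rule is checked, never the whole table.
        if matches_persona(query, rule.persona):
            return rule.model
    return None


def keyword_matcher(query: str, persona: str) -> bool:
    """Toy stand-in for the semantic check (assumption, for illustration only)."""
    return persona == "Architect" and "architecture" in query.lower()


rules = [
    MuxRule(persona="Architect", model="openai o1"),
    MuxRule(persona=None, model="openai gpt4o"),
]
```

With these rules, a query is tested against `Architect` only; anything that does not match falls through to the catch-all.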

jhrozek · 2025-03-04T09:07:03Z

Would it be possible to also add a special system prompt for the persona?

aponcedeleonch · 2025-03-04T09:22:18Z

@jhrozek that's a nice suggestion. Yes, we can do that. Actually, it would be nice to add a special system prompt for each of our muxing rules. Since a persona is a muxing rule, it would get a system prompt too. The idea is tracked here: #873

JAORMX

Requesting changes, mainly on the cursor closure.

JAORMX · 2025-03-04T09:04:53Z

migrations/versions/2025_03_03_1008-02b710eda156_add_persona_table.py

+                description TEXT NOT NULL,
+                description_embedding BLOB NOT NULL
+            );
+            """


I think it's a good idea to have personas not be namespaced within a workspace since this allows us to share personas between workspaces. Do you think we should also add a namespaced persona concept? This is not a blocker and if we decide a namespaced persona makes sense, this can be left as a TODO for another PR.

Hmm. The concept of persona namespaces probably makes sense, although right now I can't think of how they would differ from our workspaces. In other words, I like the idea but lack the use cases at the moment. Let's introduce them when we need them.
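For readers following the schema discussion, here is a small sketch of writing and reading an embedding through the `description_embedding BLOB` column. The table definition is taken from the migration excerpts in this review; the helpers `add_persona` and `get_embedding`, the in-memory database, and the float32 packing via `array` are assumptions for illustration and are not the PR's actual code.

```python
import sqlite3
import uuid
from array import array

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE IF NOT EXISTS personas (
        id TEXT PRIMARY KEY,  -- UUID stored as TEXT
        name TEXT NOT NULL UNIQUE,
        description TEXT NOT NULL,
        description_embedding BLOB NOT NULL
    );
    """
)


def add_persona(name: str, description: str, embedding: list[float]) -> None:
    """Insert a persona, packing the embedding as C floats into a BLOB."""
    blob = array("f", embedding).tobytes()
    conn.execute(
        "INSERT INTO personas (id, name, description, description_embedding) "
        "VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), name, description, blob),
    )


def get_embedding(name: str) -> list[float]:
    """Read a persona's embedding back from the BLOB column."""
    row = conn.execute(
        "SELECT description_embedding FROM personas WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    vec = array("f")
    vec.frombytes(row[0])
    return vec.tolist()
```

Because `name` is `UNIQUE` and the table is not tied to a workspace, inserting a second persona with the same name fails with `sqlite3.IntegrityError` regardless of which workspace would use it.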

JAORMX · 2025-03-04T09:05:18Z

src/codegate/config.py

@@ -57,6 +57,7 @@ class Config:
    force_certs: bool = False

    max_fim_hash_lifetime: int = 60 * 5  # Time in seconds. Default is 5 minutes.
+    persona_threshold = 0.75  # Min value is 0 (max similarity), max value is 2 (orthogonal)


Can you also add a comment explaining why the 0.75 value was chosen?

Yes, clarified :)
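To make the threshold's scale concrete, here is a minimal sketch assuming the score is a cosine distance, as the range 0 to 2 suggests. Cosine distance is 0 for vectors pointing the same way, 1 for orthogonal vectors, and 2 for vectors pointing in opposite directions. The helper names and the strict `<` comparison are assumptions; only the 0.75 value comes from the diff.

```python
import math

PERSONA_THRESHOLD = 0.75  # value from the config diff


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; the result lies in [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def matches(
    query_vec: list[float],
    persona_vec: list[float],
    threshold: float = PERSONA_THRESHOLD,
) -> bool:
    """Assumed rule: a query matches when its distance is below the threshold."""
    return cosine_distance(query_vec, persona_vec) < threshold
```

Under this reading, a threshold of 0.75 accepts queries that are clearly closer to the persona description than an orthogonal (unrelated) vector would be.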

JAORMX · 2025-03-04T09:07:03Z

src/codegate/db/connection.py

+        )
+
+        try:
+            # For Pydantic we conver the numpy array to a string when serializing.


s/conver/convert/

thanks, fixed

JAORMX · 2025-03-04T09:14:04Z

src/codegate/db/models.py

+# Pydantic doesn't support numpy arrays out of the box. Defining a custom type
+# Reference: https://github.com/pydantic/pydantic/issues/7017
+def nd_array_custom_before_validator(x):
+    # custome before validation logic


You might want to clarify this comment.

Done! Let me know if more clarification is needed

JAORMX · 2025-03-04T09:19:19Z

src/codegate/muxing/semantic_router.py

+        text = re.sub(r"[^\w\s\']", " ", text)
+
+        # Normalize whitespace (replace multiple spaces with a single space)
+        text = re.sub(r"\s+", " ", text)


optimization: pre-declare and pre-compile each regular expression in the constructor or even globally so this function would only need to evaluate the regex as opposed to compile+evaluate.

You're right. Changed them :)
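A minimal sketch of the suggested change, with the two patterns from the diff compiled once at module level. The names `_NON_WORD`, `_WHITESPACE`, and `clean_text`, and the final `strip()`, are assumptions; the actual function in the PR may do more.

```python
import re

# Compiled once at import time; both patterns are copied from the diff.
_NON_WORD = re.compile(r"[^\w\s\']")
_WHITESPACE = re.compile(r"\s+")


def clean_text(text: str) -> str:
    """Replace punctuation with spaces and collapse runs of whitespace."""
    # Keep word characters, whitespace and apostrophes; blank out the rest.
    text = _NON_WORD.sub(" ", text)
    # Normalize whitespace (replace multiple spaces with a single space).
    text = _WHITESPACE.sub(" ", text)
    # Trimming the ends is an addition for this sketch.
    return text.strip()
```

Python's `re` module already caches recently compiled patterns, so the gain is mostly avoiding the cache lookup on every call, but the module-level constants also make the patterns easier to find and name.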

JAORMX · 2025-03-04T09:20:27Z

src/codegate/muxing/semantic_router.py

+            self._embeddings_model, [cleaned_text], n_gpu_layers=self._n_gpu
+        )
+        # Use only the first entry in the list and make sure we have the appropriate type
+        logger.debug("Text embedded in semantic routing", text=cleaned_text[:100])


nit: is 100 characters overkill for the debug log? would 20 be enough?

I went with 50; 20 seems like too little. It's raw text, so we should be able to tell which chunk of text we embedded.

JAORMX · 2025-03-04T09:24:27Z

migrations/versions/2025_03_03_1008-02b710eda156_add_persona_table.py

+            CREATE TABLE IF NOT EXISTS personas (
+                id TEXT PRIMARY KEY,  -- UUID stored as TEXT
+                name TEXT NOT NULL UNIQUE,
+                description TEXT NOT NULL,


Do we want to make descriptions unique as well? If someone adds two similar descriptions, it would be very hard for the matcher to work properly. Perhaps enforcing uniqueness is the way to go as a first step, and in a further iteration we could check for description similarity. wdyt?

That's a really nice suggestion! But actually, making the descriptions unique won't cut it: if the difference is a single letter, the new description would still be accepted. What I will do is check the cosine distance to the existing descriptions and only accept a new persona if it's sufficiently different. Will upload a commit soon.
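The check described in this reply could look roughly like the sketch below. It is an illustration only: `PersonaTooSimilarError`, `check_new_persona`, and the `min_distance` parameter are hypothetical names, and the vectors stand in for description embeddings that the real code would compute with its embedding model.

```python
import math


class PersonaTooSimilarError(ValueError):
    """Raised when a new description is too close to an existing one."""


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Return 1 - cosine similarity; the result lies in [0, 2]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


def check_new_persona(
    new_vec: list[float],
    existing: dict[str, list[float]],
    min_distance: float,
) -> None:
    """Reject the new persona if any existing description is too close."""
    for name, vec in existing.items():
        distance = cosine_distance(new_vec, vec)
        if distance < min_distance:
            raise PersonaTooSimilarError(
                f"too similar to persona {name!r} (distance {distance:.3f})"
            )
```

Unlike a `UNIQUE` constraint, this rejects near-duplicates such as descriptions that differ by a single letter, at the cost of comparing against every stored embedding on insert.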

JAORMX · 2025-03-04T11:05:26Z

> @jhrozek that's a nice suggestion. Yes, we can do that. Actually, it would be nice to add a special system prompt for each of our muxing rules. Since a persona is a muxing rule, it would get a system prompt too. The idea is tracked here: #873

I agree with @aponcedeleonch: it would be ideal to have a custom prompt per rule, not per persona.

JAORMX

shipit 🚢

aponcedeleonch requested review from JAORMX, lukehinds and ptelang March 3, 2025 16:39

JAORMX requested changes Mar 4, 2025

Attended PR comments

0e37312

aponcedeleonch force-pushed the semantic-routing branch from 9dac0af to 0e37312 Compare March 4, 2025 11:18

JAORMX approved these changes Mar 4, 2025

aponcedeleonch merged commit f61f357 into main Mar 4, 2025
11 checks passed

aponcedeleonch deleted the semantic-routing branch March 4, 2025 11:42

aponcedeleonch mentioned this pull request Mar 5, 2025

[Task]: Create Personas with descriptions #1218

Closed
