Load RAG database in memory on use. by blkt · Pull Request #1344 · stacklok/codegate · GitHub
Open: blkt wants to merge 1 commit into main from perf/in-memory-rag

blkt (Contributor) commented Apr 8, 2025

This change forces the RAG database into memory by copying the on-disk SQLite database into a temporary in-memory database using SQLite's `backup` function.
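The copy step can be sketched with the standard library's `sqlite3.Connection.backup`. This is a minimal illustration, not the code from this PR: the helper name `load_into_memory` is hypothetical, and any extension loading or connection settings the real storage engine needs are left out.

```python
import sqlite3


def load_into_memory(db_path: str) -> sqlite3.Connection:
    """Copy an on-disk SQLite database into a new in-memory connection.

    Hypothetical helper for illustration; not taken from the codegate source.
    """
    source = sqlite3.connect(db_path)
    try:
        memory = sqlite3.connect(":memory:")
        # Connection.backup copies the whole source database into `memory`.
        source.backup(memory)
    finally:
        # The on-disk connection is no longer needed after the copy.
        source.close()
    return memory
```

Queries issued on the returned connection are served from RAM. The copy is a snapshot, so later changes to the file are not visible until the database is loaded again.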

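Based on the `percall` statistic, this almost halves execution time: cumulative time per `search` call drops from 0.053 s to 0.032 s, and time per `execute` call drops from 0.045 s to 0.025 s.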

With the on-disk database:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      911    0.017    0.000   47.983    0.053 .../src/codegate/storage/storage_engine.py:149(search)
      911   41.336    0.045   41.336    0.045 {method 'execute' of 'sqlite3.Cursor' objects}
      911    0.007    0.000    6.489    0.007 .../src/codegate/inference/inference_engine.py:90(embed)

With the in-memory database:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      658    0.009    0.000   21.198    0.032 .../src/codegate/storage/storage_engine.py:147(search)
      658   16.458    0.025   16.458    0.025 {method 'execute' of 'sqlite3.Cursor' objects}
      658    0.004    0.000    4.638    0.007 .../src/codegate/inference/inference_engine.py:90(embed)
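The tables above use the layout printed by Python's `pstats` module, where the first `percall` column is `tottime / ncalls` and the second is `cumtime` divided by the number of primitive calls. As a rough sketch of how such a report can be produced (the helper `profile_call` is a hypothetical name, not part of codegate):

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs) -> str:
    """Run `func` under cProfile and return the pstats report as text."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and print at most the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return buffer.getvalue()
```

Comparing the `percall` columns rather than the totals keeps the two runs comparable even though they made different numbers of calls (911 versus 658).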

This change also adds profiling to providers and search. Profiling is activated by exporting `CODEGATE_PROFILE_<key>`, where `<key>` is the string passed to the `@profiled` decorator. Profiling points are statically defined throughout the codebase.
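One way an environment-gated `@profiled` decorator could work is sketched below. This is an assumption-laden illustration and not the implementation in this PR: it treats any non-empty value of `CODEGATE_PROFILE_<key>` as "enabled", handles only synchronous functions, and uses a hypothetical key `search`.

```python
import cProfile
import functools
import os
import pstats


def profiled(key: str):
    """Profile the wrapped function only when CODEGATE_PROFILE_<key> is set."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumption: any non-empty value enables profiling for this key.
            if not os.environ.get(f"CODEGATE_PROFILE_{key}"):
                return func(*args, **kwargs)
            profiler = cProfile.Profile()
            try:
                return profiler.runcall(func, *args, **kwargs)
            finally:
                # Print the ten most expensive entries by cumulative time.
                pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)

        return wrapper

    return decorator


@profiled("search")  # hypothetical key, chosen for illustration
def search(query: str) -> list[str]:
    return [query.upper()]
```

With this sketch, running the program with `CODEGATE_PROFILE_search=1` in the environment prints a report for each `search` call, while leaving the variable unset adds only an environment lookup per call.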

blkt self-assigned this on Apr 8, 2025
blkt force-pushed the perf/in-memory-rag branch from 28d9695 to 2ed45bd on April 8, 2025 13:41
blkt force-pushed the perf/in-memory-rag branch from 2ed45bd to 3b3b945 on April 10, 2025 07:45

lukehinds (Contributor) commented

Nice, taking a look at this. What is the memory footprint like?

blkt (Contributor, Author) commented Apr 11, 2025

Sorry @lukehinds, I missed your comment.
Memory footprint is around 450 MB, and the database itself is about 200 MB, so it's still manageable on any developer's machine.
