**pgml-cms/docs/README.md**

---
description: The key concepts that make up PostgresML.
---

# Overview

PostgresML is a complete MLOps platform built on PostgreSQL.

> _Move the models to the database_, _rather than continuously moving the data to the models._

The data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move the models to the database, rather than continuously moving the data to the models. PostgresML allows you to take advantage of the fundamental relationship between data and models, by extending the database with the following capabilities and goals:


* **Model Serving** - _**GPU accelerated**_ inference engine for interactive applications, with no additional networking latency or reliability costs.
* **Model Store** - Download _**open-source**_ models including state of the art LLMs from HuggingFace, and track changes in performance between versions.
* **Model Training** - Train models with _**your application data**_ using more than 50 algorithms for regression, classification or clustering tasks. Fine tune pre-trained models like LLaMA and BERT to improve performance.
* **Feature Store** - _**Scalable**_ access to model inputs, including vector, text, categorical, and numeric data. Vector database, text search, knowledge graph and application data all in one _**low-latency**_ system.

<figure><img src=".gitbook/assets/ml_system.svg" alt="Machine Learning Infrastructure (2.0) by a16z"><figcaption><p>PostgresML handles all of the functions typically performed by a cacophony of services, <a href="https://a16z.com/emerging-architectures-for-modern-data-infrastructure/">described by a16z</a></p></figcaption></figure>

These capabilities are primarily provided by two open-source software projects that may be used independently, but are designed to be used with the rest of the Postgres ecosystem, including trusted extensions like pgvector and pg\_partman.

* **pgml** is an open source extension for PostgreSQL. It adds support for GPUs and the latest ML & AI algorithms _**inside**_ the database with a SQL API and no additional infrastructure, networking latency, or reliability costs.
* **PgCat** is an open source proxy pooler for PostgreSQL. It abstracts the scalability and reliability concerns of managing a distributed cluster of Postgres databases. Client applications connect only to the proxy, which handles load balancing and failover, _**outside**_ of any single database.

<figure><img src=".gitbook/assets/architecture.png" alt="PostgresML architectural diagram" width="275"><figcaption><p>A PostgresML deployment at scale</p></figcaption></figure>

In addition, PostgresML provides [native language SDKs](https://github.com/postgresml/postgresml/tree/master/pgml-sdks/pgml) to implement best practices for common ML & AI applications. The JavaScript and Python SDKs are generated from the core Rust SDK, to provide the same API, correctness and efficiency across all application runtimes.

SDK clients can perform advanced machine learning tasks in a single SQL request, without having to transfer additional data, models, hardware or dependencies to the client application. For example:
* Forecasting timeseries data for key metrics with complex metadata
* Fraud and anomaly detection with application data

Our goal is to provide access to Open Source AI for everyone. PostgresML is under continuous development to keep up with the rapidly evolving use cases for ML & AI, and we release non-breaking changes with minor version updates in accordance with SemVer. We welcome contributions to our [open source code and documentation](https://github.com/postgresml).

We can host your AI database in our cloud, or you can run our Docker image locally with PostgreSQL, pgml, pgvector and NVIDIA drivers included.

| Strategy | Description |
|----------|-------------|
| `most_recent` | The most recently trained model for this project is immediately deployed, regardless of metrics. |
| `best_score` | The model that achieved the best key metric score is immediately deployed. |
| `rollback` | The model that was deployed prior to the current one is deployed. |

(1 row)
```

### Rolling Back

In case the new model isn't performing well in production, it's easy to roll back to the previous version. A rollback creates a new deployment for the old model. Multiple rollbacks in a row will oscillate between the two most recently deployed models, making rollbacks a safe and reversible operation.

### Specific Model IDs

In case you need to deploy an exact model that is not the `most_recent` or `best_score`, you may deploy a model by id. Model ids can be found in the `pgml.models` table.
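
As a short sketch, you can browse `pgml.models` for the id you want and pass it to `pgml.deploy`. The column list and the deploy-by-id call below are assumptions; check the `pgml.deploy` reference for the exact overload:

```sql
-- Browse trained models to find the id you want to deploy
SELECT id, project_id, algorithm, metrics
FROM pgml.models
ORDER BY id DESC;

-- Hypothetical: deploy a specific model by its id
SELECT * FROM pgml.deploy(12);
```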

**pgml-cms/docs/introduction/apis/sql-extensions/pgml.embed.md**

---
description: >-
  Generate high quality embeddings with faster end-to-end vector operations without an additional vector database.
---

# pgml.embed()
Embeddings are a numeric representation of text. They are used to represent words and sentences as vectors, an array of numbers. Embeddings can be used to find similar pieces of text, by comparing the similarity of the numeric vectors using a distance measure, or they can be used as input features for other machine learning models, since most algorithms can't use text directly.
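
For instance, a minimal call might look like the following. The model name `intfloat/e5-small` and the pgvector `<=>` cosine-distance operator are assumptions for the sketch; substitute whatever embedding model your deployment has available:

```sql
-- Generate an embedding (an array of floats) for one piece of text
SELECT pgml.embed('intfloat/e5-small', 'The quick brown fox') AS embedding;

-- Compare two texts by cosine distance (requires the pgvector extension)
SELECT pgml.embed('intfloat/e5-small', 'The quick brown fox')::vector
   <=> pgml.embed('intfloat/e5-small', 'A fast auburn fox')::vector
    AS cosine_distance;
```

A smaller distance means the two texts are more similar in meaning.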

| Parameter | Example | Description |
|-----------|---------|-------------|
| `project_name` | `'Search Results Ranker'` | An easily recognizable identifier to organize your work. |
| `task` | `'regression'` | The objective of the experiment: `regression`, `classification` or `cluster`. |
| `relation_name` | `'public.search_logs'` | The Postgres table or view where the training data is stored or defined. |
| `y_column_name` | `'clicked'` | The name of the label (aka "target" or "unknown") column in the training table. |
| `algorithm` | `'xgboost'` | <p>The algorithm to train on the dataset, see the task specific pages for available algorithms:<br><a data-mention href="regression.md">regression.md</a></p><p><a data-mention href="classification.md">classification.md</a><br><a data-mention href="clustering.md">clustering.md</a></p> |
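
Put together, the example values above translate into a training call along these lines (a sketch using Postgres named-argument syntax; see the `pgml.train` reference for the full parameter list):

```sql
SELECT * FROM pgml.train(
    project_name  => 'Search Results Ranker',
    task          => 'regression',
    relation_name => 'public.search_logs',
    y_column_name => 'clicked',
    algorithm     => 'xgboost'
);
```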

**pgml-cms/docs/introduction/apis/sql-extensions/pgml.tune.md**

---
description: >-
  Fine tune open-source models on your own data.
---

# pgml.tune()
## Fine Tuning

Pre-trained models allow you to get up and running quickly, but you can likely improve performance on your dataset by fine tuning them. Normally, you'll bring your own data to the party, but for these examples we'll use datasets published on Hugging Face.
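
As a rough sketch of what a fine-tuning call can look like (the project, relation, and column names here are illustrative, and the exact `pgml.tune` parameters should be verified against its reference):

```sql
SELECT pgml.tune(
    'Sentiment Analysis',                        -- project name (illustrative)
    task          => 'text-classification',
    relation_name => 'public.reviews',           -- hypothetical training table
    y_column_name => 'sentiment',                -- hypothetical label column
    model_name    => 'distilbert-base-uncased',  -- base model to fine tune
    hyperparams   => '{"learning_rate": 2e-5, "num_train_epochs": 1}'
);
```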

**pgml-cms/docs/resources/benchmarks/ggml-quantized-llm-support-for-huggingface-transformers.md**

---
description: >-
  Quantization allows PostgresML to fit larger models in less RAM.
---

# GGML Quantized LLM support for Huggingface Transformers
Quantization allows PostgresML to fit larger models in less RAM. These algorithms perform inference significantly faster on NVIDIA, Apple and Intel hardware. Half-precision floating point and quantized optimizations are now available for your favorite LLMs downloaded from Huggingface.
## Introduction
## Quantization

_Discrete quantization is not a new idea. It's been used by both algorithms and artists for more than a hundred years._

Going beyond 16-bit down to 8 or 4 bits is possible, but not with hardware accelerated floating point operations. If we want hardware acceleration for smaller types, we'll need to use small integers with vectorized instruction sets. This is the process of _quantization_. Quantization can be applied to existing models trained with 32-bit floats, by converting the weights to smaller integer primitives that will still benefit from hardware accelerated instruction sets like Intel's [AVX](https://en.wikipedia.org/wiki/Advanced\_Vector\_Extensions). A simple way to quantize a model is to first find the maximum and minimum values of the weights, then divide the range of values into the number of buckets available in your integer type: 256 for 8-bit, 16 for 4-bit. This is called _post-training quantization_, and it's the simplest way to quantize a model.
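
To make that bucketing concrete, here is the min-max scheme for a single weight in plain SQL (the weight value and range are invented for the illustration; production quantizers also handle zero-points, outliers and per-channel ranges):

```sql
-- Map a float weight x in [min_w, max_w] onto the 256 buckets of an 8-bit integer,
-- then reconstruct the approximate float that the quantized value represents
SELECT
    round((x - min_w) * 255.0 / (max_w - min_w)) AS q,
    min_w + round((x - min_w) * 255.0 / (max_w - min_w)) * (max_w - min_w) / 255.0 AS dequantized
FROM (VALUES (0.37::float8, -1.0::float8, 1.0::float8)) AS t(x, min_w, max_w);
```

The gap between `x` and `dequantized` is the quantization error; with 256 buckets over a range of 2.0, it is at most half a bucket, about 0.004.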