Ryan chatbot doc image updates #1573

Merged

merged 2 commits on Jul 17, 2024
281 changes: 281 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
78 changes: 78 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
275 changes: 275 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
238 changes: 238 additions & 0 deletions pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
Binary file removed pgml-cms/docs/.gitbook/assets/chatbot_flow.png
Binary file not shown.
Binary file removed pgml-cms/docs/.gitbook/assets/embedding_king.png
Binary file not shown.
Binary file removed pgml-cms/docs/.gitbook/assets/embeddings_tokens.png
Binary file not shown.
8 changes: 4 additions & 4 deletions pgml-cms/docs/guides/chatbots/README.md
@@ -30,7 +30,7 @@ Here is an example flowing from:

text -> tokens -> LLM -> probability distribution -> predicted token -> text

<figure><img src="https://files.gitbook.com/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FrvfCoPdoQeoovZiqNG90%2Fuploads%2FPzJzmVS3uNhbvseiJbgi%2FScreenshot%20from%202023-12-13%2013-19-33.png?alt=media&#x26;token=11d57b2a-6aa3-4374-b26c-afc6f531d2f3" alt=""><figcaption><p>The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"</p></figcaption></figure>
<figure><img src="../../.gitbook/assets/Chatbots_Limitations-Diagram.svg" alt=""><figcaption><p>The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"</p></figcaption></figure>

{% hint style="info" %}
We have simplified the tokenization process. Words do not always map directly to tokens. For instance, the word "Baldur's" may actually map to multiple tokens. For more information on tokenization, check out [HuggingFace's summary](https://huggingface.co/docs/transformers/tokenizer\_summary).
@@ -108,11 +108,11 @@ What does an `embedding` look like? `Embeddings` are just vectors (for our use c
embedding_1 = embed("King") # embed returns something like [0.11, -0.32, 0.46, ...]
```

<figure><img src="../../.gitbook/assets/embedding_king.png" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>
<figure><img src="../../.gitbook/assets/Chatbots_King-Diagram.svg" alt=""><figcaption><p>The flow of word -> token -> embedding</p></figcaption></figure>

`Embeddings` aren't limited to words; we have models that can embed entire sentences.

<figure><img src="../../.gitbook/assets/embeddings_tokens.png" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>
<figure><img src="../../.gitbook/assets/Chatbots_Tokens-Diagram.svg" alt=""><figcaption><p>The flow of sentence -> tokens -> embedding</p></figcaption></figure>

Why do we care about `embeddings`? `Embeddings` have a very interesting property. Words and sentences that have close [semantic similarity](https://en.wikipedia.org/wiki/Semantic\_similarity) sit closer to one another in vector space than words and sentences that do not have close semantic similarity.
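For a quick demonstration of this property, here is a sketch using the `sentence-transformers` package and the `all-MiniLM-L6-v2` model (both are our assumptions for illustration; any embedding model will do):

```python
from sentence_transformers import SentenceTransformer, util

# Embed two related words and one unrelated word with the same model.
model = SentenceTransformer("all-MiniLM-L6-v2")
king, queen, potato = model.encode(["King", "Queen", "potato"])

print(util.cos_sim(king, queen))   # higher score: semantically related
print(util.cos_sim(king, potato))  # lower score: semantically unrelated
```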

@@ -157,7 +157,7 @@ print(context)

There is a lot going on here, so let's check out this diagram and step through it.

<figure><img src="../../.gitbook/assets/chatbot_flow.png" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query</p></figcaption></figure>
<figure><img src="../../.gitbook/assets/Chatbots_Flow-Diagram.svg" alt=""><figcaption><p>The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query</p></figcaption></figure>

Step 1: We take the document and split it into chunks. Chunks are typically a paragraph or two in size. There are many ways to split documents into chunks; for more information, check out [this guide](https://www.pinecone.io/learn/chunking-strategies/). A toy example of such splitting is sketched below.
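This naive paragraph-based splitter is our own sketch, not the strategy the guide endorses; real chunkers also handle overlap, sentence boundaries, and token budgets:

```python
def chunk_document(document: str, max_chars: int = 1000) -> list[str]:
    """Naively group paragraphs into chunks of roughly max_chars characters."""
    chunks: list[str] = []
    current = ""
    for paragraph in document.split("\n\n"):
        # Start a new chunk once adding this paragraph would overflow the budget.
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += paragraph + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```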

@@ -12,11 +12,11 @@ Just like any PostgreSQL database, PostgresML can be configured as the primary a

If your intention is to use PostgresML as your primary database, your job here is done. You can use the connection credentials provided and start building your application on top of in-database AI right away.

-## [Logical replica](logical-replication/)
+## [Logical replication](logical-replication/)

If your primary database is hosted elsewhere, for example on AWS RDS or Azure Postgres, you can get your data replicated to PostgresML in real time using logical replication.

<figure class="my-3 py-3"><img src="../../../.gitbook/assets/logical_replication_1.png" alt="Logical replication" width="80%"><figcaption></figcaption></figure>
<figure class="my-3 py-3"><img src="../../../.gitbook/assets/Getting-Started_Logical-Replication-Diagram.svg" alt="Logical replication" width="80%"><figcaption></figcaption></figure>

Having immediate access to your data accelerates your machine learning use cases and removes the need to move data multiple times between microservices. Latency-sensitive applications should consider using this approach.
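Under the hood this relies on standard Postgres publications and subscriptions. A rough sketch of the setup (hostnames and credentials are placeholders, and the exact steps vary by provider):

```sql
-- On the primary (requires wal_level = logical):
CREATE PUBLICATION postgresml_pub FOR ALL TABLES;

-- On the PostgresML side:
CREATE SUBSCRIPTION postgresml_sub
    CONNECTION 'host=primary.example.com dbname=app user=replicator password=...'
    PUBLICATION postgresml_pub;
```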
@@ -25,7 +25,7 @@

Foreign data wrappers are a set of PostgreSQL extensions that allow making connections from inside the database directly to other databases, even if they aren't running on Postgres. For example, Postgres has foreign data wrappers for MySQL, S3, Snowflake, and many others.

<figure class="my-3 py-3"><img src="../../../.gitbook/assets/fdw_1.png" alt="Foreign data wrappers" width="80%"><figcaption></figcaption></figure>
<figure class="my-3 py-3"><img src="../../../.gitbook/assets/Getting-Started_FDW-Diagram.svg" alt="Foreign data wrappers" width="80%"><figcaption></figcaption></figure>

FDWs are useful when data access is infrequent and not latency-sensitive. For many use cases, like offline batch workloads and not very busy websites, this approach is suitable and easy to get started with.
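As an example of the mechanics, here is a sketch using `postgres_fdw` to expose tables from a remote Postgres database (server names and credentials are placeholders):

```sql
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Register the remote server and how to authenticate against it.
CREATE SERVER remote_db
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'primary.example.com', dbname 'app');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER remote_db
    OPTIONS (user 'app_user', password 'secret');

-- Make the remote tables queryable as if they were local:
IMPORT FOREIGN SCHEMA public FROM SERVER remote_db INTO public;
```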
