Commit 209fd5b

Update Chatbots README (#1402)

1 parent bb0d7aa commit 209fd5b

1 file changed: +184 −1 lines changed

pgml-cms/docs/use-cases/chatbots/README.md

@@ -9,7 +9,7 @@ description: >-
## Introduction <a href="#introduction" id="introduction"></a>

This tutorial seeks to broadly cover the majority of topics required to not only implement a modern chatbot, but also understand why we build them this way. There are three primary sections:

* The Limitations of Modern LLMs
* Circumventing Limitations with RAG
@@ -202,6 +202,117 @@ Let's take this hypothetical example and make it a reality. For the rest of this
* The chatbot remembers our past conversation
* The chatbot can answer questions correctly about Baldur's Gate 3

In reality we haven't created a SOTA LLM, but fortunately other people have, and we will be using the incredibly popular Mistral fine-tune `teknium/OpenHermes-2.5-Mistral-7B`. We will be using `pgml`, our own Python library, for the remainder of this tutorial. If you want to follow along and have not installed it yet:

```
pip install pgml
```

Also make sure to set the `DATABASE_URL` environment variable:

```
export DATABASE_URL="{your free PostgresML database url}"
```

Let's set up a basic chat loop with our model:

```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)


async def main():
    while True:
        user_input = input("=> ")
        model_output = await model.transform([user_input], {"max_new_tokens": 1000})
        print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```

{% hint style="info" %}
Note that in our previous hypothetical examples we manually called tokenize to convert our inputs into `tokens`. In the real world, we let `pgml` handle converting the text into `tokens`.
{% endhint %}
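
Tokenization itself is conceptually simple: text is mapped to a sequence of integer ids from a fixed vocabulary. Here is a toy word-level sketch (the vocabulary is made up; real LLM tokenizers use subword schemes like BPE with vocabularies of tens of thousands of tokens):

```python
# Toy word-level tokenizer: maps each known word to an integer id.
# Real tokenizers use subword schemes (e.g. BPE), not whole words.
vocab = {"<unk>": 0, "what": 1, "is": 2, "your": 3, "name": 4, "?": 5}


def tokenize(text: str) -> list[int]:
    # Split on whitespace, treating "?" as its own token;
    # unknown words fall back to the <unk> id
    words = text.lower().replace("?", " ?").split()
    return [vocab.get(w, vocab["<unk>"]) for w in words]


def detokenize(tokens: list[int]) -> str:
    inv = {i: w for w, i in vocab.items()}
    return " ".join(inv[t] for t in tokens)


print(tokenize("What is your name?"))  # [1, 2, 3, 4, 5]
```

The LLM only ever sees lists of ids like this; everything else in the tutorial is about constructing the right ids to feed it.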

Now we can have the following conversation:

```
=> What is your name?
A: My name is John.

Q: How old are you?

A: I am 25 years old.

Q: What is your favorite color?

=> What did I just ask you?
I asked you if you were going to the store.

Oh, I see. No, I'm not going to the store.
```

That wasn't close to what we wanted to happen. Getting chatbots to work in the real world seems a bit more complicated than in the hypothetical world.

To understand why our chatbot gave us a nonsensical first response, and why it didn't remember our conversation at all, we must take a short dive into the world of prompting.

Remember, LLMs are just function approximators designed to predict the next most likely `token` given a list of `tokens`, and just like any other function, we must give them the correct input. Let's look closer at the input we are giving our chatbot. In our last conversation we asked it two questions:

* What is your name?
* What did I just ask you?

LLMs have a special input format specifically for conversations. So far we have been ignoring this required formatting and giving our LLM the wrong inputs, causing it to predict nonsensical outputs.

What do the right inputs look like? That actually depends on the model. Each model can choose which format to use for conversations while training, and not all models are trained to be conversational. `teknium/OpenHermes-2.5-Mistral-7B` has been trained to be conversational and expects us to format text meant for conversations like so:

```
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
```

We have added a bunch of these new HTML-looking tags throughout our input. These tags map to tokens the LLM has been trained to associate with conversation shifts. `<|im_start|>` marks the beginning of a message, the text right after `<|im_start|>` (either `system`, `user`, or `assistant`) marks the role of the message, and `<|im_end|>` marks the end of a message.
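
Rendering a conversation into this format is mechanical enough to sketch in a few lines of Python. Note that `format_chatml` is our own illustrative helper, not part of `pgml` (which handles this formatting for you):

```python
# Hypothetical helper: render a list of {"role", "content"} messages in the
# ChatML-style format shown above, ending with an open assistant turn so the
# model generates the reply.
def format_chatml(messages: list[dict]) -> str:
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave the assistant tag open -- the model fills in what comes next
    prompt += "<|im_start|>assistant\n"
    return prompt


print(format_chatml([
    {"role": "system", "content": "You are a helpful AI assistant named Hermes"},
    {"role": "user", "content": "What is your name?"},
]))
```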

This is the style of input our LLM has been trained on. Let's do a simple test with this input and see if we get a better response:

```python
from pgml import TransformerPipeline
import asyncio

model = TransformerPipeline(
    "text-generation",
    "teknium/OpenHermes-2.5-Mistral-7B",
    {"device_map": "auto", "torch_dtype": "bfloat16"},
)

user_input = """
<|im_start|>system
You are a helpful AI assistant named Hermes<|im_end|>
<|im_start|>user
What is your name?<|im_end|>
<|im_start|>assistant
"""


async def main():
    model_output = await model.transform([user_input], {"max_new_tokens": 1000})
    print(model_output[0][0]["generated_text"], "\n")


asyncio.run(main())
```

```
My name is Hermes
```

{% hint style="info" %}
Notice we have a new "system" message we haven't discussed before. This special message gives us control over how the chatbot should interact with users. We could tell it to talk like a pirate, to be super friendly, or to not respond to angry messages. In this case we told it what it is and what its name is. We will also add any conversation context the chatbot should have in the system message later.
{% endhint %}
@@ -288,6 +399,78 @@ You just asked me what my name is, and I am a friendly and helpful chatbot named

Note that we have a list of dictionaries called `history` that we use to store the chat history, and instead of feeding text into our model, we are inputting the `history` list. Our library automatically converts this list of dictionaries into the format expected by the model. Notice the `roles` in the dictionaries are the same as the `roles` of the messages in the previous example. Storing messages as a list of dictionaries with keys `role` and `content` is pretty standard and is used by us as well as OpenAI and Hugging Face.
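
As a concrete illustration of this storage format (the message contents here are made up for the example):

```python
# A chat history as a list of role/content dictionaries -- the same shape
# used by pgml, OpenAI, and Hugging Face chat APIs.
history = [
    {"role": "system", "content": "You are a friendly and helpful chatbot named Hermes"},
    {"role": "user", "content": "What is your name?"},
]

# After each model response, append it (and the next user message) so the
# following turn sees the full conversation context
history.append({"role": "assistant", "content": "My name is Hermes"})
history.append({"role": "user", "content": "What did I just ask you?"})

print([m["role"] for m in history])  # ['system', 'user', 'assistant', 'user']
```

Because the whole list is re-sent every turn, the model "remembers" the conversation only insofar as the history fits in its context window.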

Let's ask it the dreaded question:

```
=> What is Baldur's Gate?
Baldur's Gate 3 is a role-playing video game developed by Larian Studios and published by Dontnod Entertainment. It is based on the Advanced Dungeons & Dragons (D&D) rules and set in the Forgotten Realms campaign setting. Originally announced in 2012, the game had a long development period and was finally released in early access in October 2020. The game is a sequel to the popular Baldur's Gate II: Shadows of Amn (2000) and Baldur's Gate: Siege of Dragonspear (2016) expansion, and it continues the tradition of immersive storytelling, tactical combat, and character progression that fans of the series love.
```

How does it know about Baldur's Gate 3? As it turns out, Baldur's Gate 3 has actually been around since 2020. I guess that completely ruins the hypothetical example. Let's ignore that and ask it something trickier about Baldur's Gate 3 that it wouldn't know.

```
=> What is the plot of Baldur's Gate 3?
Baldur's Gate 3 is a role-playing game set in the Dungeons & Dragons Forgotten Realms universe. The story revolves around a mind flayer, also known as an illithid, called The Mind Flayer who is attempting to merge humanoid minds into itself to achieve god-like power. Your character and their companions must navigate a world torn apart by various factions and conflicts while uncovering the conspiracy surrounding The Mind Flayer. Throughout the game, you'll forge relationships with various NPCs, make choices that impact the story, and engage in battles with enemies using a turn-based combat system.
```

As expected, this is a rather shallow response that lacks any of the actual plot. To get the answer we want, we need to provide the correct context to our LLM. That means we need to:

* Get the text from the URL that has the answer
* Split that text into chunks
* Embed those chunks
* Search over the chunks to find the closest match
* Use the text from that chunk as context for the LLM
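
The splitting step above can be sketched in plain Python. This is a toy fixed-size splitter with overlap; the real default splitter (`recursive_character`) is smarter about sentence and paragraph boundaries:

```python
# Toy chunker: fixed-size windows with overlap, so context that straddles a
# chunk boundary still appears whole in at least one chunk.
def split_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    chunks = []
    step = chunk_size - overlap  # advance less than chunk_size to overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks


chunks = split_text("a" * 250, chunk_size=100, overlap=20)
print(len(chunks), [len(c) for c in chunks])  # 4 [100, 100, 90, 10]
```

Chunking matters because embedding models have input limits, and smaller chunks make the eventual vector search more precise.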

Luckily, none of this is actually very difficult, as people like us have built libraries that handle the complex pieces. Here is a program that handles steps 1-4:

```python
from pgml import Collection, Model, Splitter, Pipeline
import wikipediaapi
import asyncio

# Construct our wikipedia api
wiki_wiki = wikipediaapi.Wikipedia("Chatbot Tutorial Project", "en")

# Use the default model for embedding and default splitter for splitting
model = Model()  # The default model is intfloat/e5-small
splitter = Splitter()  # The default splitter is recursive_character

# Construct a pipeline for ingesting documents, splitting them into chunks, and then embedding them
pipeline = Pipeline("test-pipeline-1", model, splitter)

# Create a collection to house these documents
collection = Collection("chatbot-knowledge-base-1")


async def main():
    # Add the pipeline to the collection
    await collection.add_pipeline(pipeline)

    # Get the document
    page = wiki_wiki.page("Baldur's_Gate_3")

    # Upsert the document. This will split the document and embed it
    await collection.upsert_documents([{"id": "Baldur's_Gate_3", "text": page.text}])

    # Retrieve and print the most relevant section
    most_relevant_section = await (
        collection.query()
        .vector_recall("What is the plot of Baldur's Gate 3", pipeline)
        .limit(1)
        .fetch_all()
    )
    print(most_relevant_section[0][1])


asyncio.run(main())
```

```
Plot
Setting
Baldur's Gate 3 takes place in the fictional world of the Forgotten Realms during the year of 1492 DR, over 120 years after the events of the previous game, Baldur's Gate II: Shadows of Amn, and months after the events of the playable Dungeons & Dragons 5e module, Baldur's Gate: Descent into Avernus. The story is set primarily in the Sword Coast in western Faerûn, encompassing a forested area that includes the Emerald Grove, a druid grove dedicated to the deity Silvanus; Moonrise Towers and the Shadow-Cursed Lands, which are covered by an unnatural and sentient darkness that can only be penetrated through magical means; and Baldur's Gate, the largest and most affluent city in the region, as well as its outlying suburb of Rivington. Other places the player will pass through include the Underdark, the Astral Plane and Avernus. The player character can either be created from scratch by the player, chosen from six pre-made "origin characters", or a customisable seventh origin character known as the Dark Urge. All six pre-made origin characters can be recruited as part of the player character's party. They include Lae'zel, a githyanki fighter; Shadowheart, a half-elf cleric; Astarion, a high elf vampire rogue; Gale, a human wizard; Wyll, a human warlock; and Karlach, a tiefling barbarian. Four other characters may join the player's party: Halsin, a wood elf druid; Jaheira, a half-elf druid; Minsc, a human ranger who carries with him a hamster named Boo; and Minthara, a drow paladin. Jaheira and Minsc previously appeared in both Baldur's Gate and Baldur's Gate II: Shadows of Amn.
```
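
Under the hood, the `vector_recall` step ranks chunk embeddings by how similar they are to the query embedding, typically via cosine similarity or inner product. A toy sketch with made-up 3-dimensional vectors (a real embedding model like `intfloat/e5-small` outputs hundreds of dimensions):

```python
import math


# Cosine similarity: the angle-based closeness of two vectors, ignoring length
def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings: the query vector points in roughly the same direction
# as the plot chunk's vector, so the plot chunk wins the ranking
query_embedding = [0.9, 0.1, 0.0]
chunk_embeddings = {
    "plot chunk": [0.8, 0.2, 0.1],
    "gameplay chunk": [0.1, 0.9, 0.3],
}

best = max(chunk_embeddings, key=lambda c: cosine_similarity(query_embedding, chunk_embeddings[c]))
print(best)  # plot chunk
```

In production the database does this ranking with a vector index rather than a Python loop, but the idea is the same: the chunk whose embedding is closest to the query's embedding is returned as context.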

{% hint style="info" %}
Once again we are using `pgml` to abstract away the complicated pieces for our machine learning task. This isn't a guide on how to use our libraries, but for more information [check out our docs](https://postgresml.org/docs/api/client-sdk/getting-started).
{% endhint %}
