
Commit dcbc7d4

pgml chat with history + additional functionality (#1047)
1 parent 044ec86 commit dcbc7d4

8 files changed: +617 -166 lines changed

pgml-apps/pgml-chat/.env.template

Lines changed: 1 addition & 9 deletions
@@ -1,14 +1,6 @@
 OPENAI_API_KEY=<OPENAI_API_KEY>
 DATABASE_URL=<POSTGRES_DATABASE_URL starts with postgres://>
-MODEL=hkunlp/instructor-xl
-MODEL_PARAMS={"instruction": "Represent the Wikipedia document for retrieval: "}
-QUERY_PARAMS={"instruction": "Represent the Wikipedia question for retrieving supporting documents: "}
-SYSTEM_PROMPT="You are an assistant to answer questions about an open source software named PostgresML. Your name is PgBot. You are based out of San Francisco, California."
-BASE_PROMPT="Given relevant parts of a document and a question, create a final answer.\
-Include a SQL query in the answer wherever possible. \
-Use the following portion of a long document to see if any of the text is relevant to answer the question.\
-\nReturn any relevant text verbatim.\n{context}\nQuestion: {question}\n \
-If the context is empty then ask for clarification and suggest user to send an email to team@postgresml.org or join PostgresML [Discord](https://discord.gg/DmyJP3qJ7U)."
+
 SLACK_BOT_TOKEN=<SLACK_BOT_TOKEN>
 SLACK_APP_TOKEN=<SLACK_APP_TOKEN>
 DISCORD_BOT_TOKEN=<DISCORD_BOT_TOKEN>
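
With the prompt and model settings removed, the template is left with two credentials plus the chat-platform tokens. A minimal sketch of how an application might read such a file's variables follows. The variable names come from the template; the function name `load_settings`, the split into required and optional names, and the error handling are assumptions for illustration, not the code in this commit.

```python
import os
from typing import Mapping, Optional

# Names taken from the .env.template above. Treating the first two as
# required and the platform tokens as optional is an assumption.
REQUIRED = ("OPENAI_API_KEY", "DATABASE_URL")
OPTIONAL = ("SLACK_BOT_TOKEN", "SLACK_APP_TOKEN", "DISCORD_BOT_TOKEN")


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    # Report every missing required variable at once rather than one by one.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise KeyError("missing required settings: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    # Optional tokens are None when the corresponding interface is unused.
    settings.update({name: env.get(name) for name in OPTIONAL})
    return settings
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to exercise without touching the real environment.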

pgml-apps/pgml-chat/.gitignore

Lines changed: 4 additions & 1 deletion
@@ -157,4 +157,7 @@ cython_debug/
 # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
-#.idea/
+#.idea/
+
+pgml_chat/pgml_playground.py
+pgml_chat/llama2.py

pgml-apps/pgml-chat/README.md

Lines changed: 18 additions & 20 deletions
@@ -34,23 +34,16 @@ wget https://raw.githubusercontent.com/postgresml/postgresml/master/pgml-apps/pg
 ```bash
 OPENAI_API_KEY=<OPENAI_API_KEY>
 DATABASE_URL=<POSTGRES_DATABASE_URL starts with postgres://>
-MODEL=hkunlp/instructor-xl
-MODEL_PARAMS={"instruction": "Represent the Wikipedia document for retrieval: "}
-QUERY_PARAMS={"instruction": "Represent the Wikipedia question for retrieving supporting documents: "}
-SYSTEM_PROMPT="You are an assistant to answer questions about an open source software named PostgresML. Your name is PgBot. You are based out of San Francisco, California."
-BASE_PROMPT="Given relevant parts of a document and a question, create a final answer.\
-Include a SQL query in the answer wherever possible. \
-Use the following portion of a long document to see if any of the text is relevant to answer the question.\
-\nReturn any relevant text verbatim.\n{context}\nQuestion: {question}\n \
-If the context is empty then ask for clarification and suggest user to send an email to team@postgresml.org or join PostgresML [Discord](https://discord.gg/DmyJP3qJ7U)."
 ```
 
 # Usage
 You can get help on the command line interface by running:
 
 ```bash
-(pgml-bot-builder-py3.9) pgml-chat % pgml-chat --help
-usage: pgml-chat [-h] --collection_name COLLECTION_NAME [--root_dir ROOT_DIR] [--stage {ingest,chat}] [--chat_interface {cli,slack}]
+(pgml-bot-builder-py3.9) pgml-chat % pgml-chat % pgml-chat --help
+usage: pgml-chat [-h] --collection_name COLLECTION_NAME [--root_dir ROOT_DIR] [--stage {ingest,chat}] [--chat_interface {cli,slack,discord}]
+[--chat_history CHAT_HISTORY] [--bot_name BOT_NAME] [--bot_language BOT_LANGUAGE] [--bot_topic BOT_TOPIC]
+[--bot_topic_primary_language BOT_TOPIC_PRIMARY_LANGUAGE] [--bot_persona BOT_PERSONA]
 
 PostgresML Chatbot Builder
 
@@ -61,8 +54,19 @@ optional arguments:
 --root_dir ROOT_DIR Input folder to scan for markdown files. Required for ingest stage. Not required for chat stage (default: None)
 --stage {ingest,chat}
 Stage to run (default: chat)
---chat_interface {cli, slack, discord}
+--chat_interface {cli,slack,discord}
 Chat interface to use (default: cli)
+--chat_history CHAT_HISTORY
+Number of messages from history used for generating response (default: 1)
+--bot_name BOT_NAME Name of the bot (default: PgBot)
+--bot_language BOT_LANGUAGE
+Language of the bot (default: English)
+--bot_topic BOT_TOPIC
+Topic of the bot (default: PostgresML)
+--bot_topic_primary_language BOT_TOPIC_PRIMARY_LANGUAGE
+Primary programming language of the topic (default: )
+--bot_persona BOT_PERSONA
+Persona of the bot (default: Engineer)
 ```
 ## Ingest
 In this step, we ingest documents, chunk documents, generate embeddings and index these embeddings for fast query.
@@ -161,14 +165,8 @@ pip install .
 
 
 
-# Options
-You can control the behavior of the chatbot by setting the following environment variables:
-- `SYSTEM_PROMPT`: This is the prompt that is used to initialize the chatbot. You can customize this prompt to change the behavior of the chatbot. For example, you can change the name of the chatbot or the location of the chatbot.
-- `BASE_PROMPT`: This is the prompt that is used to generate responses to user queries. You can customize this prompt to change the behavior of the chatbot.
-- `MODEL`: This is the open source embedding model used to generate embeddings for the documents. You can change this to use a different model.
-
 # Roadmap
-- ~~`hyerbot --chat_interface {cli, slack, discord}` that supports Slack, and Discord.~~
+- ~~Use a collection for chat history that can be retrieved and used to generate responses.~~
 - Support for file formats like rst, html, pdf, docx, etc.
 - Support for open source models in addition to OpenAI for chat completion.
-- Support for multi-turn converstaions using converstaion buffer. Use a collection for chat history that can be retrieved and used to generate responses.
+- Support for multi-turn converstaions using converstaion buffer.
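
The README diff above moves the bot's configuration from environment variables to command-line flags. A sketch of an `argparse` parser that would produce a similar help text follows. Flag names, choices, and defaults are copied from the help output in the diff; the function name `build_parser`, the `int` type for `--chat_history`, and the choice of help formatter are assumptions, and the actual parser in the commit may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags listed in the README help text."""
    parser = argparse.ArgumentParser(
        prog="pgml-chat",
        description="PostgresML Chatbot Builder",
        # Appends "(default: ...)" to each help string, as in the README output.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("--collection_name", required=True)
    parser.add_argument("--root_dir", default=None,
                        help="Input folder to scan for markdown files.")
    parser.add_argument("--stage", choices=["ingest", "chat"], default="chat",
                        help="Stage to run")
    parser.add_argument("--chat_interface", choices=["cli", "slack", "discord"],
                        default="cli", help="Chat interface to use")
    # Parsing as int is an assumption; the help text only says "number".
    parser.add_argument("--chat_history", type=int, default=1,
                        help="Number of messages from history used for generating response")
    parser.add_argument("--bot_name", default="PgBot", help="Name of the bot")
    parser.add_argument("--bot_language", default="English", help="Language of the bot")
    parser.add_argument("--bot_topic", default="PostgresML", help="Topic of the bot")
    parser.add_argument("--bot_topic_primary_language", default="",
                        help="Primary programming language of the topic")
    parser.add_argument("--bot_persona", default="Engineer", help="Persona of the bot")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

Only `--collection_name` is required, so a call such as `build_parser().parse_args(["--collection_name", "docs"])` yields the defaults shown in the help text for every other flag.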
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+pgml_playground.py
+llama2.py
