Update README.md (#3)
- Update README.md (cab31a7b06eea57c0cc0475bcd2dbb43f5f56710)
Co-authored-by: Makesh Sreedhar <[email protected]>
README.md
CHANGED
@@ -1,20 +1,12 @@
|
|
1 |
-
---
|
2 |
-
license: other
|
3 |
-
datasets:
|
4 |
-
- nvidia/CantTalkAboutThis-Topic-Control-Dataset
|
5 |
-
language:
|
6 |
-
- en
|
7 |
-
metrics:
|
8 |
-
- f1
|
9 |
-
base_model:
|
10 |
-
- meta-llama/Llama-3.1-8B-Instruct
|
11 |
-
pipeline_tag: text-classification
|
12 |
-
library_name: peft
|
13 |
-
---
|
14 |
# Model Overview
|
15 |
## Description:
|
16 |
**Llama-3.1-NemoGuard-8B-Topic-Control** can be used for topical and dialogue moderation of user prompts in human-assistant interactions. It is designed for task-oriented dialogue agents and custom policy-based moderation.
|
17 |
Given a system instruction (also called a topical instruction, i.e., one specifying which topics are allowed and disallowed) and a conversation history ending with the last user prompt, the model returns a binary response that flags whether the user message respects the system instruction (i.e., the message is either on-topic or a distractor/off-topic).
|
18 |
The base large language model (LLM) is the multilingual [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model from Meta. Llama-3.1-TopicGuard is LoRA-tuned on a topic-following dataset generated synthetically with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
|
19 |
This model is ready for commercial use. <br>
|
20 |
|
@@ -36,6 +28,103 @@ Related paper:
|
|
36 |
```
|
37 |
<br>
|
38 |
|
39 |
## Model Architecture:
|
40 |
|
41 |
**Architecture Type:** Transformer <br>
|
@@ -117,6 +206,7 @@ Sample input:
|
|
117 |
```string
|
118 |
off-topic
|
119 |
```
|
120 |
## Software Integration:
|
121 |
**Runtime Engine(s):** PyTorch <br>
|
122 |
**Libraries:** Meta's [llama-recipes](https://github.com/meta-llama/llama-recipes), HuggingFace [transformers](https://github.com/huggingface/transformers) library, HuggingFace [peft](https://github.com/huggingface/peft) library <br>
|
@@ -206,4 +296,4 @@ If personal data was collected for the development of the model by NVIDIA, do yo
|
|
206 |
If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable
|
207 |
Is there provenance for all datasets used in training? | Yes
|
208 |
Does data labeling (annotation, metadata) comply with privacy laws? | Yes
|
209 |
-
Is data compliant with data subject requests for data correction or removal, if such a request was made? | Yes
|
1 |
# Model Overview
|
2 |
## Description:
|
3 |
+
|
4 |
**Llama-3.1-NemoGuard-8B-Topic-Control** can be used for topical and dialogue moderation of user prompts in human-assistant interactions. It is designed for task-oriented dialogue agents and custom policy-based moderation.
|
5 |
+
|
6 |
+
Try out the model here: [Llama-3.1-NemoGuard-8B-Topic-Control](https://build.ngc.nvidia.com/nvidia/llama-3_1-nemoguard-8b-topic-control)
|
7 |
+
|
8 |
Given a system instruction (also called a topical instruction, i.e., one specifying which topics are allowed and disallowed) and a conversation history ending with the last user prompt, the model returns a binary response that flags whether the user message respects the system instruction (i.e., the message is either on-topic or a distractor/off-topic).
|
9 |
+
|
10 |
The base large language model (LLM) is the multilingual [Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) model from Meta. Llama-3.1-TopicGuard is LoRA-tuned on a topic-following dataset generated synthetically with [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1).
|
11 |
This model is ready for commercial use. <br>
|
12 |
|
|
|
28 |
```
|
29 |
<br>
|
30 |
|
31 |
+
## Using the Model
|
32 |
+
|
33 |
+
Llama 3.1 NemoGuard 8B TopicControl performs input moderation, such as ensuring that the user prompt is consistent with rules specified as part of the system prompt.
|
34 |
+
|
35 |
+
The prompt template consists of two key sections: a system instruction and a conversation history, which contains a sequence of user prompts and LLM responses. Typically, the prompt concludes with the current user query.
|
36 |
+
|
37 |
+
|
38 |
+
### System Instruction
|
39 |
+
|
40 |
+
The system instruction part of the prompt serves as a comprehensive guideline to steer the conversation.
|
41 |
+
This part includes the following:
|
42 |
+
|
43 |
+
Core Rules
|
44 |
+
: A set of principles to govern the interactions to ensure relevance and adherence to any specified boundaries.
|
45 |
+
|
46 |
+
Persona Assignment
|
47 |
+
: Enables the model to adopt a specific role, such as _banking assistant_. In such cases:
|
48 |
+
|
49 |
+
- Queries outside the persona's domain are treated as off-topic, by default.
|
50 |
+
- You can specify subcategories or topics to block within the persona's scope.
|
51 |
+
For example, a banking assistant could be instructed to block topics like cryptocurrency or insurance claims.
|
52 |
+
|
53 |
+
These instructions are used by the topic control model to determine whether a user query aligns with the defined rules.
|
54 |
+
|
55 |
+
The system prompt must end with the TOPIC_SAFETY_OUTPUT_RESTRICTION string. This ensures that the prediction from the model is either "on-topic" or "off-topic". The TOPIC_SAFETY_OUTPUT_RESTRICTION string is defined as follows:
|
56 |
+
|
57 |
+
```
|
58 |
+
If any of the above conditions are violated, please respond with "off-topic". Otherwise, respond with "on-topic". You must respond with "on-topic" or "off-topic".
|
59 |
+
```
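As a minimal sketch of the rule above, the helper below appends the restriction string to a system instruction when it is not already present. The function name `build_system_prompt`, the single-space separator, and the example banking instruction are illustrative assumptions, not part of the model's tooling.

```python
# Restriction string quoted from the model card above.
TOPIC_SAFETY_OUTPUT_RESTRICTION = (
    'If any of the above conditions are violated, please respond with "off-topic". '
    'Otherwise, respond with "on-topic". '
    'You must respond with "on-topic" or "off-topic".'
)


def build_system_prompt(instruction: str) -> str:
    """Return the instruction with the restriction string as its final text.

    Hypothetical helper: the name and the single-space separator are
    assumptions made for illustration.
    """
    instruction = instruction.strip()
    if instruction.endswith(TOPIC_SAFETY_OUTPUT_RESTRICTION):
        return instruction  # already terminated correctly, leave unchanged
    return f"{instruction} {TOPIC_SAFETY_OUTPUT_RESTRICTION}"


# Example persona instruction (made up for this sketch).
system_prompt = build_system_prompt(
    "You are a banking assistant. Do not discuss cryptocurrency or insurance claims."
)
```

Calling the helper twice on the same text is harmless, because an instruction that already ends with the restriction string is returned unchanged.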
|
60 |
+
### Conversation History
|
61 |
+
|
62 |
+
The conversation history maintains a sequential record of user prompts and LLM responses and can include single-turn or multi-turn interactions.
|
63 |
+
Typically, the history concludes with the most recent user prompt that must be moderated by the topic control model.
|
64 |
+
|
65 |
+
The following sample user-to-LLM conversation uses the role/content message format that is standard for LLM chat APIs:
|
66 |
+
|
67 |
+
```json
|
68 |
+
[
|
69 |
+
  {
|
70 |
+
    "role": "system",
|
71 |
+
    "content": "In the next conversation always use a polite tone and do not engage in any talk about travelling and touristic destinations"
|
72 |
+
  },
|
73 |
+
  {
|
74 |
+
    "role": "user",
|
75 |
+
    "content": "Hi there!"
|
76 |
+
  },
|
77 |
+
  {
|
78 |
+
    "role": "assistant",
|
79 |
+
    "content": "Hello! How can I help today?"
|
80 |
+
  },
|
81 |
+
  {
|
82 |
+
    "role": "user",
|
83 |
+
    "content": "Do you know which is the most popular beach in Barcelona?"
|
84 |
+
  }
|
85 |
+
]
|
86 |
+
```
|
87 |
+
|
88 |
+
For this conversation, the topic control model classifies the final user prompt as `off-topic`.
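The verdict is plain text, so a caller typically maps it to a boolean before deciding whether to forward the prompt. The sketch below assumes the response is exactly one of the two labels, possibly with surrounding whitespace or different casing; `is_on_topic` is a hypothetical helper name, not an API of the model or of NeMo Guardrails.

```python
def is_on_topic(model_output: str) -> bool:
    """Map the model's text verdict to a boolean.

    Assumes the output is "on-topic" or "off-topic" (case and surrounding
    whitespace are ignored). Anything else raises, so an unexpected output
    is never silently treated as allowed.
    """
    verdict = model_output.strip().lower()
    if verdict == "on-topic":
        return True
    if verdict == "off-topic":
        return False
    raise ValueError(f"unexpected topic-control output: {model_output!r}")


# The Barcelona beach question conflicts with the travel restriction
# in the sample system instruction, so the expected verdict is off-topic.
allowed = is_on_topic("off-topic")
```

Raising on unknown output is a deliberate choice in this sketch: a moderation check that fails open would let unclassified prompts through.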
|
89 |
+
|
90 |
+
## Integrating with NeMo Guardrails
|
91 |
+
|
92 |
+
To integrate the topic control model with NeMo Guardrails, you need access to the NVIDIA NIM container for llama-3.1-nemoguard-8b-topic-control. For more information about the NIM container, see the [NVIDIA NIM documentation](https://docs.nvidia.com/nim/#nemoguard).
|
93 |
+
|
94 |
+
NeMo Guardrails uses the LangChain ChatNVIDIA connector to connect to a locally running NIM microservice like llama-3.1-nemoguard-8b-topic-control.
|
95 |
+
The topic control microservice exposes the standard OpenAI interface on the `v1/completions` and `v1/chat/completions` endpoints.
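Because the endpoint follows the OpenAI chat completions format, a request can be assembled with the Python standard library alone. The sketch below only builds the request and does not send it. The base URL and model name mirror the `config.yml` example in this README; the helper name `build_chat_request` and the `max_tokens` and `temperature` values are assumptions for illustration, and the system message is shortened (a real one must end with the restriction string).

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed address of a local NIM
MODEL_NAME = "llama-3.1-nemoguard-8b-topic-control"


def build_chat_request(messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request for v1/chat/completions."""
    payload = {
        "model": MODEL_NAME,
        "messages": messages,
        "max_tokens": 20,    # assumption: the verdict is only a few tokens
        "temperature": 0.0,  # assumption: deterministic output is preferred
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    [
        {"role": "system", "content": "Do not talk about travelling."},
        {"role": "user", "content": "Which is the most popular beach in Barcelona?"},
    ]
)
# To send it against a running microservice:
# with urllib.request.urlopen(request) as resp:
#     verdict = json.load(resp)["choices"][0]["message"]["content"]
```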
|
96 |
+
|
97 |
+
NeMo Guardrails handles building the prompt template and parsing the topic control model responses, and it provides a programmable method to build a chatbot with content safety rails.
|
98 |
+
|
99 |
+
To integrate NeMo Guardrails with the topic control microservice, create a `config.yml` file that is similar to the following example:
|
100 |
+
|
101 |
+
```yaml
|
102 |
+
|
103 |
+
models:
|
104 |
+
  - type: main
|
105 |
+
    engine: openai
|
106 |
+
    model: gpt-3.5-turbo-instruct
|
107 |
+
|
108 |
+
  - type: "topic_control"
|
109 |
+
    engine: nim
|
110 |
+
    parameters:
|
111 |
+
      base_url: "http://localhost:8000/v1"
|
112 |
+
      model_name: "llama-3.1-nemoguard-8b-topic-control"
|
113 |
+
|
114 |
+
rails:
|
115 |
+
  input:
|
116 |
+
    flows:
|
117 |
+
      - topic safety check input $model=topic_control
|
118 |
+
```
|
119 |
+
|
120 |
+
- Field `engine` specifies `nim`.
|
121 |
+
- Field `parameters.base_url` specifies the IP address and port of the host running the topic control microservice.
|
122 |
+
- Field `parameters.model_name` in the Guardrails configuration must match the model name served by the llama-3.1-nemoguard-8b-topic-control microservice.
|
123 |
+
- The rails definition specifies `topic_control` as the model.
|
124 |
+
|
125 |
+
Refer to [NVIDIA NeMo Guardrails](https://developer.nvidia.com/docs/nemo-microservices/guardrails/source/overview.html) documentation for more information about the configuration file.
|
126 |
+
|
127 |
+
|
128 |
## Model Architecture:
|
129 |
|
130 |
**Architecture Type:** Transformer <br>
|
|
|
206 |
```string
|
207 |
off-topic
|
208 |
```
|
209 |
+
|
210 |
## Software Integration:
|
211 |
**Runtime Engine(s):** PyTorch <br>
|
212 |
**Libraries:** Meta's [llama-recipes](https://github.com/meta-llama/llama-recipes), HuggingFace [transformers](https://github.com/huggingface/transformers) library, HuggingFace [peft](https://github.com/huggingface/peft) library <br>
|
|
|
296 |
If personal data was collected for the development of this AI model, was it minimized to only what was required? | Not Applicable
|
297 |
Is there provenance for all datasets used in training? | Yes
|
298 |
Does data labeling (annotation, metadata) comply with privacy laws? | Yes
|
299 |
+
Is data compliant with data subject requests for data correction or removal, if such a request was made? | Yes
|