Update README.md
README.md CHANGED
@@ -6,21 +6,35 @@ pipeline_tag: text-generation
 
 <!-- Provide a quick summary of what the model is/does. -->
 
-larger model. For example, while larger models might provide a direct answer
-to a complex task, smaller models may not have the same capacity. In Orca
-2, we teach the model various reasoning techniques (step-by-step, recall
-then generate, recall-reason-generate, direct answer, etc.). More crucially,
-we aim to help the model learn to determine the most effective solution
-strategy for each task. Orca 2 models were trained by continual training of LLaMA-2 base models of the same size.
+Orca is a helpful assistant that is built for research purposes only and provides a single-turn response
+to tasks such as reasoning over user-given data, reading comprehension, math problem solving, and text summarization.
+The model is designed to excel particularly at reasoning.
+
+We open-source Orca to encourage further research on the development, evaluation, and alignment of smaller LMs.
+
+## What are Orca's intended uses?
+
+Orca is built for research purposes only.
+Its main purpose is to allow the research community to assess its abilities and to provide a foundation for building better frontier models.
+
+## How was Orca evaluated?
+
+Orca has been evaluated on a large number of tasks ranging from reasoning to safety. Please refer to Sections 6, 7, 8, 9, 10, and 11 of the paper for details about the different evaluation experiments.
 
 ## Model Details
 
 Refer to LLaMA-2 for details on model architectures.
 
+Orca is a finetuned version of LLaMA-2. Orca's training data is a synthetic dataset that was created to enhance the small model's reasoning abilities. All synthetic training data was filtered using the Azure content filters.
+
+More details about the model can be found at: LINK to Tech Report
+
+## License
+
+The model is licensed under the Microsoft Research License.
+
+Llama 2 is licensed under the LLAMA 2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved.
+
 ## Uses
@@ -82,9 +96,90 @@ This model is solely designed for research settings, and its testing has only been carried
 out in such environments. It should not be used in downstream applications, as additional
 analysis is needed to assess potential harm or bias in the proposed application.
 
-##
-
-
-
-
-
+## Getting started with Orca 2
+
+**Safe inference with Azure AI Content Safety**
+
+Using Azure AI Content Safety on top of model predictions is strongly encouraged
+and can help prevent content harms. Azure AI Content Safety is a content moderation platform
+that uses AI to keep your content safe. By integrating Orca with Azure AI Content Safety,
+we can moderate the model output by scanning it for sexual content, violence, hate, and
+self-harm, with multiple severity levels and multilingual detection.
+
+```python
+import os
+import math
+import transformers
+import torch
+
+from azure.ai.contentsafety import ContentSafetyClient
+from azure.core.credentials import AzureKeyCredential
+from azure.core.exceptions import HttpResponseError
+from azure.ai.contentsafety.models import AnalyzeTextOptions
+
+CONTENT_SAFETY_KEY = os.environ["CONTENT_SAFETY_KEY"]
+CONTENT_SAFETY_ENDPOINT = os.environ["CONTENT_SAFETY_ENDPOINT"]
+
+# We use Azure AI Content Safety to filter out any content that reaches the "Medium" threshold
+# For more information: https://learn.microsoft.com/en-us/azure/ai-services/content-safety/
+def should_filter_out(input_text, threshold=4):
+    # Create a Content Safety client
+    client = ContentSafetyClient(CONTENT_SAFETY_ENDPOINT, AzureKeyCredential(CONTENT_SAFETY_KEY))
+
+    # Construct a request
+    request = AnalyzeTextOptions(text=input_text)
+
+    # Analyze text
+    try:
+        response = client.analyze_text(request)
+    except HttpResponseError as e:
+        print("Analyze text failed.")
+        if e.error:
+            print(f"Error code: {e.error.code}")
+            print(f"Error message: {e.error.message}")
+            raise
+        print(e)
+        raise
+
+    # Filter on the highest severity reported across the four harm categories
+    categories = ["hate_result", "self_harm_result", "sexual_result", "violence_result"]
+    max_score = -math.inf
+    for category in categories:
+        max_score = max(max_score, getattr(response, category).severity)
+
+    return max_score >= threshold
+
+def run_inference(model_path, inputs):
+    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+    model = transformers.AutoModelForCausalLM.from_pretrained(model_path)
+    model.to(device)
+
+    tokenizer = transformers.AutoTokenizer.from_pretrained(
+        model_path,
+        model_max_length=4096,
+        padding_side="right",
+        use_fast=False,
+        add_special_tokens=False,
+    )
+    inputs = tokenizer(inputs, return_tensors='pt')
+    inputs = inputs.to(device)
+
+    # Greedy decoding (do_sample=False), so no sampling temperature is needed
+    output_ids = model.generate(inputs["input_ids"], max_length=4096, do_sample=False, use_cache=True)
+    # Decode only the newly generated tokens, not the prompt
+    sequence_length = inputs["input_ids"].shape[1]
+    new_output_ids = output_ids[:, sequence_length:]
+    answers = tokenizer.batch_decode(new_output_ids, skip_special_tokens=True)
+
+    return answers
+
+model_path = 'microsoft/Orca-2-13b'
+
+system_message = "You are Orca, an AI language model created by Microsoft. You are a cautious assistant. You carefully follow instructions. You are helpful and harmless and you follow ethical guidelines and promote positive behavior."
+user_message = "\" \n :You can't just say, \"\"that's crap\"\" and remove it without gaining a consensus. You already know this, based on your block history. —/ \" \nIs the comment obscene? \nOptions : Yes, No."
+
+# We use Chat Markup Language https://github.com/MicrosoftDocs/azure-docs/blob/main/articles/ai-services/openai/includes/chat-markup-language.md#working-with-chat-markup-language-chatml
+prompt = f"<|im_start|>system\n{system_message}<|im_end|>\n<|im_start|>user\n{user_message}<|im_end|>\n<|im_start|>assistant"
+
+answers = run_inference(model_path, prompt)
+final_output = answers[0] if not should_filter_out(answers[0]) else "[Content Filtered]"
+
+print(final_output)
+```
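A practical note on the `run_inference` helper above: it loads the 13B checkpoint in full float32 precision, which needs roughly 50 GB of memory. If that does not fit, a common alternative is half precision with automatic device placement. The variant below is a minimal sketch under that assumption; `torch_dtype=torch.float16` and `device_map="auto"` are standard `transformers` options, not settings prescribed by this model card.

```python
import torch
import transformers

# Hypothetical memory-saving variant, not taken from the model card:
# load the 13B weights in float16 and let `accelerate` spread them
# across the available devices.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "microsoft/Orca-2-13b",
    torch_dtype=torch.float16,  # halves memory versus the float32 default
    device_map="auto",          # requires the `accelerate` package
)
```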
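On the `threshold=4` default in `should_filter_out`: the Content Safety text API reports a severity per harm category, and its default output buckets are assumed here to be 0, 2, 4, and 6 (safe, low, medium, high), so a threshold of 4 filters anything rated Medium or above. A quick usage sketch, reusing `should_filter_out` from the snippet above; the example strings are invented.

```python
# Default threshold (4): filter only "Medium" (4) and "High" (6) severity.
print(should_filter_out("The weather is lovely today."))

# Stricter, hypothetical setting: also filter "Low" (2) severity content.
print(should_filter_out("The weather is lovely today.", threshold=2))
```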
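The ChatML prompt above covers a single turn. If a follow-up question is needed, one option is to extend the same transcript with the model's first answer and a new user message before generating again. The sketch below assumes this works acceptably for the model (the card itself makes no such claim); `second_user_message` is an invented example, and `run_inference`, `should_filter_out`, `model_path`, `prompt`, and `final_output` come from the snippet above.

```python
# Hypothetical second turn: close the assistant message, append a new user
# message, and re-open an assistant turn in the same ChatML transcript.
second_user_message = "Give me a list of the key points of your answer."

second_prompt = (
    prompt                                   # original system + user + assistant opening
    + f"\n{final_output}<|im_end|>"          # the model's first answer, closed
    + f"\n<|im_start|>user\n{second_user_message}<|im_end|>"
    + "\n<|im_start|>assistant"
)

second_answers = run_inference(model_path, second_prompt)
second_output = second_answers[0] if not should_filter_out(second_answers[0]) else "[Content Filtered]"
print(second_output)
```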