DanielShaw98 commited on
Commit
76763e7
1 Parent(s): 6c80763

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +289 -23
README.md CHANGED
@@ -1,23 +1,289 @@
1
- ---
2
- base_model: unsloth/phi-3.5-mini-instruct-bnb-4bit
3
- language:
4
- - en
5
- license: apache-2.0
6
- tags:
7
- - text-generation-inference
8
- - transformers
9
- - unsloth
10
- - llama
11
- - trl
12
- - sft
13
- ---
14
-
15
- # Uploaded model
16
-
17
- - **Developed by:** DanielShaw98
18
- - **License:** apache-2.0
19
- - **Finetuned from model :** unsloth/phi-3.5-mini-instruct-bnb-4bit
20
-
21
- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
22
-
23
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ base_model:
3
+ - microsoft/Phi-3.5-mini-instruct
4
+ language:
5
+ - en
6
+ license: apache-2.0
7
+ tags:
8
+ - text-generation-inference
9
+ - mircosoft
10
+ - phi-3.5-mini-instruct
11
+ - law
12
+ - contracts
13
+ - mergers&acquisitions
14
+ ---
15
+
16
+ # Phi-3.5-Law
17
+
18
+ ## Model Summary
19
+
20
+ **Model Name:** Phi-3.5-Law
21
+ **Model Type:** Text Generation
22
+ **Hugging Face Model ID:** DanielShaw98/phi-3.5-law
23
+
24
+ **Phi-3.5-Law** is a specialised model for processing merger and acquisition (M&A) contracts. It is fine-tuned from the `microsoft/Phi-3.5-mini-instruct` model and designed to assist lawyers in identifying specific clauses within contracts. The model is built on the Phi-3 architecture, which focuses on high-quality, reasoning-dense data.
25
+
26
+ **Developed by:** DanielShaw98
27
+ **License:** Apache-2.0
28
+ **Finetuned from model:** `microsoft/Phi-3.5-mini-instruct`
29
+
30
+ ## Known Issues
31
+
32
+ ### Model Behavior in Different Environments
33
+
34
+ **Google Colab:** The model has been tested successfully on Google Colab, where it performs as expected and provides accurate results.
35
+
36
+ **Downloaded Environment:** When downloaded and tested locally, the model may produce gibberish or unexpected outputs. This issue may be due to differences in the runtime environment or configurations between Google Colab and local setups.
37
+
38
+ If you encounter issues with the model's performance in a local environment, please ensure that all dependencies are correctly installed and that the runtime environment is properly configured. If the problem persists, consider reaching out for support or checking for any updates or patches.
39
+
40
+ ## Intended Uses
41
+
42
+ ### Primary Use Cases
43
+
44
+ The Phi-3.5-Law model is intended for use in legal and contract analysis applications, where it helps in:
45
+
46
+ - Identifying specific clauses in M&A contracts
47
+
48
+ - Providing explanations and details about these clauses
49
+
50
+
51
+ The model is useful for legal professionals who need to quickly locate and understand relevant contract clauses.
52
+
53
+ ### Use Case Considerations
54
+
55
+ While the model aims to provide accurate results, it may occasionally produce inaccuracies. The AI tool should not be considered legal advice or a substitute for professional judgement. Users should carefully review and verify the results before making any legal decisions or actions.
56
+
57
+ ## Setup and Usage
58
+
59
+ ### Setting Up the Server
60
+
61
+ To run the model, you need a FastAPI server. Here’s an example script for setting up the server:
62
+
63
+ from fastapi import FastAPI, HTTPException, Depends
64
+
65
+ from pydantic import BaseModel
66
+
67
+ from transformers import pipeline
68
+
69
+ import torch
70
+
71
+ from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
72
+
73
+ import os
74
+
75
+ from dotenv import load_dotenv
76
+
77
+
78
+ # Load environment variables from .env file
79
+
80
+ load_dotenv()
81
+
82
+ app = FastAPI()
83
+
84
+ token_auth_scheme = HTTPBearer()
85
+
86
+
87
+ # Retrieve the Hugging Face token from environment variable
88
+
89
+ HUGGINGFACE_TOKEN = os.getenv("HUGGINGFACE_TOKEN")
90
+
91
+
92
+ # Initialize the pipeline
93
+
94
+ model_id = "DanielShaw98/phi-3.5-law"
95
+
96
+ model_pipeline = pipeline(
97
+
98
+ "text-generation",
99
+
100
+ model=model_id,
101
+
102
+ model_kwargs={"torch_dtype": torch.bfloat16},
103
+
104
+ device_map="auto" # Requires accelerate to work properly
105
+
106
+ )
107
+
108
+
109
+ class Message(BaseModel):
110
+
111
+ role: str
112
+
113
+ content: str
114
+
115
+
116
+ class RequestBody(BaseModel):
117
+
118
+ messages: list[Message]
119
+
120
+ max_new_tokens: int
121
+
122
+
123
+ # Function to verify token
124
+
125
+ async def verify_token(credentials: HTTPAuthorizationCredentials = Depends(token_auth_scheme)):
126
+
127
+ token = credentials.credentials
128
+
129
+ if token != HUGGINGFACE_TOKEN:
130
+
131
+ raise HTTPException(status_code=401, detail="Invalid or missing token")
132
+
133
+ return token
134
+
135
+
136
+ @app.post("/generate")
137
+
138
+ async def generate_text(request_body: RequestBody, token: str = Depends(verify_token)):
139
+
140
+ try:
141
+
142
+ outputs = model_pipeline(
143
+
144
+ [{"role": msg.role, "content": msg.content} for msg in request_body.messages],
145
+
146
+ max_new_tokens=request_body.max_new_tokens,
147
+
148
+ )
149
+
150
+ return {"generated_text": outputs[0]["generated_text"]}
151
+
152
+ except Exception as e:
153
+
154
+ raise HTTPException(status_code=500, detail=str(e))
155
+
156
+
157
+ if __name__ == "__main__":
158
+
159
+ import uvicorn
160
+
161
+ uvicorn.run(app, host="0.0.0.0", port=8000)
162
+
163
+
164
+
165
+ ### Making a Call to the Server
166
+
167
+ To make a call to the server and test the model, you can use the following JavaScript code (using example chunk from contract):
168
+
169
+ const axios = require('axios');
170
+
171
+ require('dotenv').config(); // Make sure to install dotenv
172
+
173
+
174
+
175
+ // Retrieve the token from environment variables
176
+
177
+ const HUGGINGFACE_TOKEN = process.env.HUGGINGFACE_TOKEN;
178
+
179
+
180
+
181
+ const requestBody = {
182
+
183
+ messages: [
184
+
185
+ { role: "system",
186
+
187
+ content: `You are Legal AI. Your job is to help lawyers by identifying specific clauses in merger and acquisition contracts.
188
+
189
+ Please identify the desired clauses and also provide an explanation for this choice based on the prompt.
190
+
191
+ If the requested clause cannot be found, please respond with 'nothing found.'
192
+
193
+ Otherwise please provide a response in the following JSON format:
194
+
195
+ {
196
+
197
+ "entries": [
198
+
199
+ {
200
+
201
+ "page": <page_number>,
202
+
203
+ "line_start": <clause_start_line_within_chunk>,
204
+
205
+ "line_end": <clause_end_line_within_chunk>,
206
+
207
+ "clause": <clause_text>,
208
+
209
+ "explanation": <explanation_text>
210
+
211
+ }
212
+
213
+ ]
214
+
215
+ }`
216
+
217
+ },
218
+
219
+ { role: "user",
220
+
221
+ content: `Prompt: Review the provided text and identify all clauses related to termination rights and conditions.
222
+
223
+ Return the exact start and end line numbers of the relevant clause within the chunk. If you cannot find anything relevant, please respond with 'nothing found.'
224
+
225
+ Please provide details in the following JSON format:
226
+
227
+ { "relevant_chunks_found": <number>, "entries": [ { "page": <page_number>,
228
+
229
+ "line_start": <clause_start_line_within_chunk>, "line_end": <clause_end_line_within_chunk>,
230
+
231
+ "clause": <clause_text>, "explanation": <explanation_text> } ]}\n\n
232
+
233
+ Chunk: Transaction but all the conditions therein have been satisfied or complied with, \nor confirmed no such clearance is required in accordance with the applicable \ncompetition legislation, or has not objected to the Transaction within the time \nperiod prescribed by law.
234
+
235
+ \n227876-4-1460-v9.0 \n- 30 - \n70-40688062 \n \nFor the purposes of clauses 4.1.10 to 4.1.12 (inclusive) only, "Transaction" shall \nbe limited to the part or parts of the Transaction required to be notified to the \nCommission, COFECE or the competent competition authority of Vietnam (as \nappropriate).
236
+
237
+ \nNo material breach \n4.1.13 no Purchaser Covenant Breach and no Purchaser Material Breach having \noccurred; and \n4.1.14 no Chrysaor Covenant Breach and no Chrysaor Material Breach having \noccurred. \n4.2 \nAny Regulatory Condition or Antitrust Condition may be waived at any time on or \nbefore 17.00 on the Longstop Date by written agreement of the Company and the \nPurchaser.
238
+
239
+ Any Chrysaor Material Breach may be waived at any time on or before \n17.00 on the Longstop Date by the Purchaser by notice in writing to the Company. Any \nPurchaser Material Breach may be waived at any time on or before 17.00 on the \nLongstop Date by the Company by notice in writing to the Purchaser.
240
+
241
+ \n4.3 \nIf, at any time, any party becomes aware of a fact, matter or circumstance that could \nreasonably be expected to prevent or delay the satisfaction of a Condition, it shall \ninform the others of the fact, matter or circumstance as soon as reasonably practicable.
242
+
243
+ \n4.4 \nIf a Condition has not been satisfied or (if capable of waiver) waived by 17.00 on the \nLongstop Date or becomes impossible to satisfy before that time, either the \nHarbour/Chrysaor Parties or the Purchaser may terminate this Agreement by notice in \nwriting to that effect to the other, save that the Harbour/Chrysaor Parties may only \nterminate this Agreement: (i) on the basis of the Whitewash Condition not having been \nsatisfied by 17.00 on the Longstop Date or having become impossible to satisfy before
244
+
245
+ \nthat time; and (ii) on the basis of the Circular Condition and/or the FCA Admission \nCondition not having been satisfied by 17.00 on the Longstop Date or having become \nimpossible to satisfy before that time, in each case, only if the Harbour/Chrysaor Parties \nhave complied with the relevant provisions of clause 5 and/or the Purchaser has not \ncomplied with the relevant provisions of clause 5. \n4.5\n\n Chunk Meta-Data:\nPage Start: 30\n Page End: 31\nLine Start: 1405\n Line End: 1445`
246
+
247
+ }
248
+
249
+ ],
250
+
251
+ max_new_tokens: 256
252
+
253
+ };
254
+
255
+
256
+
257
+ axios.post('http://localhost:8000/generate', requestBody, {
258
+
259
+ headers: {
260
+
261
+ 'Authorization': `Bearer ${HUGGINGFACE_TOKEN}`,
262
+
263
+ 'Content-Type': 'application/json'
264
+
265
+ }
266
+
267
+ })
268
+
269
+ .then(response => {
270
+
271
+ console.log('Response:', response.data);
272
+
273
+ })
274
+
275
+ .catch(error => {
276
+
277
+ console.error('Error:', error.response ? error.response.data : error.message);
278
+
279
+ });
280
+
281
+
282
+
283
+ ## Disclaimer
284
+
285
+ This AI tool is designed to assist lawyers in identifying specific clauses within contracts. While the AI strives to provide accurate and relevant results, it may occasionally make mistakes or produce inaccuracies. The information provided by this tool should not be considered legal advice or a substitute for professional judgment. Users are strongly encouraged to carefully review and verify the results before relying on them for any legal decisions or actions. The responsibility for the interpretation and application of the contract clauses remains solely with the user.
286
+
287
+ ## Repository
288
+
289
+ For creating datasets and further details, please visit the [repository for creating datasets](https://github.com/DanielShaw98/data-prep).