---
YAML tags: "coming soon"
---

# APEX-E3 Dec-Enc Function-Call Model (v0.011)

**Model Name:** [usmankhanic/apexe3-dec-enc-fn-v0011](https://huggingface.co./usmankhanic/apexe3-dec-enc-fn-v0011)  
**Architecture:** T5-small (encoder-decoder)  
**Purpose:** Convert natural language queries into a structured *function call* format with parameters—especially tuned for capital markets applications that demand private, agentic solutions.

---

## Overview

The **APEX-E3 Dec-Enc Function-Call Model (v0.011)** is a fine-tuned T5-small model specialised in generating *function call structures* from plain English queries. Rather than producing unstructured text, this model outputs instructions for specific function calls, including all relevant parameters.

With a special focus on capital market use cases, it was trained on queries that map directly to functions like **`selectStocks`**, **`run_backtest`**, and **`optimizer`**, ensuring precise extraction of parameters needed for advanced trading, backtesting, and portfolio optimization workflows.

This solution is **lightweight**, **highly performant**, and, thanks to our **novel training approach**, **easy to re-train** or adapt to new function definitions and parameter sets.

### Key Features

1. **Laser-Focused on Capital Markets**  
   - Model is pre-trained and fine-tuned to parse finance- and trading-centric instructions.  
   - Produces direct calls to your functions with minimal overhead.

2. **Private & Agentic**  
   - Ideal for organisations seeking on-premises or private cloud solutions where data control and agentic autonomy are paramount.

3. **Ultra-Easy Training Approach**  
   - Using a simple Python/Flask app and structured JSON, you can re-train or extend the model on your own custom function definitions in *minutes*, with no large-scale ML infrastructure required.  
   - A minimal amount of training data is enough to achieve high accuracy in mapping user queries to structured parameters.

4. **Lightweight, Fast Inference**  
   - Based on T5-small (~60M parameters), balancing performance with rapid inference, even in CPU-only setups.  
   - Perfect for real-time or near-real-time decision-making in capital markets.

---

## Model Details

- **Base Model**: [T5-small](https://huggingface.co./t5-small)  
- **Fine-Tuning**: Customised data mapping natural language to function calls, specifically in capital markets contexts.  
- **Parameter Count**: ~60M  
- **Tokenizer**: T5 SentencePiece tokenizer  

**Training Objective**  
To convert input prompts like:

```
"Your job is to pick the correct function name and produce key=value lines.

 Query: <USER_QUERY>

 Format:
 function_name=<NAME>
 param1=value
 param2=value
 ...
"
```

into **structured output** with the correct function name (e.g., `selectStocks`, `run_backtest`, or `optimizer`) and the corresponding parameters (`from=`, `to=`, `sector=`, etc.).
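The training-format prompt above can be assembled with a small helper; `build_prompt` below is an illustrative name, not part of the released model or scripts:

```python
def build_prompt(user_query: str) -> str:
    """Wrap a user query in the prompt template used during training."""
    return (
        "Your job is to pick the correct function name and produce key=value lines.\n\n"
        f"Query: {user_query}\n\n"
        "Format:\nfunction_name=<NAME>\nparam1=value\nparam2=value\n"
    )

prompt = build_prompt("Optimize a portfolio of [AAPL, TSLA, AMZN] for Q1-2023.")
print(prompt)
```

Keeping inference prompts byte-for-byte identical to this template tends to matter for small fine-tuned models, so a shared helper avoids drift between training and serving.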

---

## Intended Use Cases

This model was specifically trained to parse user requests and map them onto the following *function signatures*:

```jsonc
{
  "name": "selectStocks",
  "description": "Select stocks based on specified criteria",
  "params": {
    "criteria": "string",
    "from": "string",
    "to": "string",
    "sector": "array",
    "fundamental_factors": "array"
  }
},
{
  "name": "run_backtest",
  "description": "Run a backtest on a given asset",
  "params": {
    "assetId": "string",
    "from": "string",
    "to": "string",
    "buyCondition": "string",
    "sellCondition": "string",
    "startingCapital": "float",
    "fees": "float"
  }
},
{
  "name": "optimizer",
  "description": "Optimize a portfolio with a given objective function",
  "params": {
    "assetIdList": "array",
    "timeFrame": "string",
    "objectiveFunction": "string"
  }
}
```
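Before dispatching a generated call, the emitted parameters can be checked against these signatures. A minimal sketch follows; the `SIGNATURES` dict and `validate_params` helper are illustrative and mirror the JSON above, they are not shipped with the model:

```python
# Allowed parameter names per function, mirroring the signatures above.
SIGNATURES = {
    "selectStocks": {"criteria", "from", "to", "sector", "fundamental_factors"},
    "run_backtest": {"assetId", "from", "to", "buyCondition", "sellCondition",
                     "startingCapital", "fees"},
    "optimizer": {"assetIdList", "timeFrame", "objectiveFunction"},
}

def validate_params(function_name: str, params: dict) -> list:
    """Return the parameter names not defined by the function's signature.

    An empty list means the call looks structurally valid; an unknown
    function name is reported as itself.
    """
    allowed = SIGNATURES.get(function_name)
    if allowed is None:
        return [function_name]
    return [key for key in params if key not in allowed]

print(validate_params("optimizer", {"assetIdList": "AAPL, TSLA, AMZN",
                                    "timeFrame": "Q1-2023"}))
```

A check like this is a cheap guard against the occasional extraneous parameter noted in the Limitations section below.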

### Examples

1. **selectStocks**  
   - **User Query:** “Find high-growth technology stocks from 2022-01-01 to 2023-12-31, focusing on P/E ratio and dividend yield.”  
   - **Model Output** (possible):
     ```
     function_name=selectStocks
     criteria=high-growth
     from=2022-01-01
     to=2023-12-31
     sector=technology
     fundamental_factors=P/E ratio, dividend yield
     ```
   This structured output is ready for an internal function `selectStocks(...)`.

2. **run_backtest**  
   - **User Query:** “Run a backtest on asset AAPL from 2020 to 2022 with a simple buy condition of RSI<30 and sell condition of RSI>70, starting capital 100000, fees 0.1%.”  
   - **Model Output**:
     ```
     function_name=run_backtest
     assetId=AAPL
     from=2020
     to=2022
     buyCondition=RSI<30
     sellCondition=RSI>70
     startingCapital=100000
     fees=0.1
     ```

3. **optimizer**  
   - **User Query:** “Optimize a portfolio of [AAPL, TSLA, AMZN] using Markowitz strategy for Q1-2023.”  
   - **Model Output**:
     ```
     function_name=optimizer
     assetIdList=AAPL, TSLA, AMZN
     timeFrame=Q1-2023
     objectiveFunction=Markowitz
     ```

These examples showcase how the model turns free-form text into direct function invocations.  
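Outputs in this `key=value` format can be turned into a function name plus a parameter dict with a few lines of standard Python; the `parse_function_call` helper below is an illustrative sketch, not part of the released model:

```python
def parse_function_call(text: str):
    """Split key=value lines into (function_name, params dict)."""
    function_name = None
    params = {}
    for line in text.strip().splitlines():
        if "=" not in line:
            continue  # skip malformed lines rather than fail outright
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip()
        if key == "function_name":
            function_name = value
        else:
            params[key] = value
    return function_name, params

name, params = parse_function_call(
    "function_name=optimizer\n"
    "assetIdList=AAPL, TSLA, AMZN\n"
    "timeFrame=Q1-2023\n"
    "objectiveFunction=Markowitz\n"
)
print(name, params)
```

From here, a simple dispatch table (`{"optimizer": optimizer, ...}`) can route the parsed call to your actual implementation.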

---

## Innovative Training Approach

- **Single JSON File → Fine-Tuned Model**  
  With a minimal data set in JSON specifying `(query_text, correct_function_name, correct_params)`, the included script fine-tunes T5-small specifically for your business logic.  
- **Rapid Iteration**  
  A training run of just a few epochs is enough to adapt the model, enabling agile updates to match evolving requirements or new function signatures.  
- **Scalable**  
  Despite the model’s small size, you can extend the training data seamlessly. As new parameters or entirely new functions emerge, simply add them to the training set and run the script again.

In short, **you don’t need a huge ML pipeline**: just minimal code and data. This approach delivers strong performance with minimal overhead, empowering capital markets teams to create private, agentic solutions on their own infrastructure.
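As a rough illustration of the data-preparation step, a `(query_text, correct_function_name, correct_params)` record can be flattened into the source/target text pair a T5 fine-tuning loop expects. The record layout below is an assumption for illustration; the exact schema is defined by the included script:

```python
# Hypothetical training record; the actual JSON schema ships with the script.
record = {
    "query_text": "Run a backtest on AAPL from 2020 to 2022.",
    "correct_function_name": "run_backtest",
    "correct_params": {"assetId": "AAPL", "from": "2020", "to": "2022"},
}

def to_pair(rec: dict) -> tuple:
    """Build the (source, target) strings for one fine-tuning example."""
    source = (
        "Your job is to pick the correct function name and produce key=value lines.\n\n"
        f"Query: {rec['query_text']}\n\n"
        "Format:\nfunction_name=<NAME>\nparam1=value\nparam2=value\n"
    )
    target_lines = [f"function_name={rec['correct_function_name']}"]
    target_lines += [f"{k}={v}" for k, v in rec["correct_params"].items()]
    return source, "\n".join(target_lines)

source, target = to_pair(record)
print(target)
```

Because both sides are plain text, pairs like these drop straight into any seq2seq trainer after tokenization; no custom data loaders are needed.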

---

## Performance and Metrics

- **High Accuracy** on parameter extraction from real-world financial queries.  
- **Minimal Hallucination** for short, well-structured prompts.  
- **Smooth Generalization** to novel queries involving a mix of known parameters.  

Complex or ambiguous requests might require clarifying instructions or additional data. Nevertheless, in practice, our fine-tuned T5 has demonstrated strong consistency, making it a reliable component of capital market automation.

---

## Usage Example

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "usmankhanic/apexe3-dec-enc-fn-v0011"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Sample prompt aligns with training format
input_text = (
    "Your job is to pick the correct function name and produce key=value lines.\n\n"
    "Query: Select the top growth stocks in the healthcare sector from 2021 to 2023.\n\n"
    "Format:\nfunction_name=<NAME>\nparam1=value\nparam2=value\n"
)

inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_length=128,       # Increase if you expect longer output
    num_beams=4, 
    early_stopping=True
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```

**Expected Output** (example):
```
function_name=selectStocks
criteria=top growth
from=2021
to=2023
sector=healthcare
fundamental_factors=
```
*(Model might produce or omit certain factors depending on the prompt details.)*

---

## Limitations

1. **Domain-Specific Queries**  
   The model is specialized for capital market queries. In other domains, the output may be less reliable.
2. **Parameter Guesswork**  
   If a prompt is vague or references unknown fields, T5 may produce extraneous parameters or missing info.
3. **Context Window**  
   T5-small typically handles up to ~512 tokens effectively. Extremely long queries risk truncation.

---

## License & Citation

This model is provided under the [Apache-2.0 License](https://www.apache.org/licenses/LICENSE-2.0). For citation, please reference:
```
@misc{usmankhanic_apexe3_2025,
  title={APEX-E3 Dec-Enc Function-Call Model (v0.011)},
  author={Khan, Usman and Contributors},
  year={2025},
  howpublished={\url{https://huggingface.co./usmankhanic/apexe3-dec-enc-fn-v0011}},
}
```
If you find this model helpful, please consider giving it a ⭐ on [Hugging Face](https://huggingface.co./usmankhanic/apexe3-dec-enc-fn-v0011).

---

## Contact & Contributions

For questions, feedback, or contributions, feel free to open an [issue or pull request](https://huggingface.co./usmankhanic/apexe3-dec-enc-fn-v0011/discussions).  

**Empower your capital markets pipeline with fast, private, and agentic function-call modeling—made effortless by our novel training approach!**