MarioBarbeque
committed on
fix typo
README.md
CHANGED
@@ -27,7 +27,7 @@ The model weights of *CyberSolve LinAlg 1.2* are a further downstream checkpoint
 
 
 To construct **CyberSolve LinAlg 1.2**, the *FLAN-T5 large* model is fine-tuned using a custom PyTorch training loop optimized for multiple Nvidia A100 GPUs. We supervise a training of *FLAN-T5 large* on the *algebra__linear_1d* split of the Google DeepMind mathematics dataset, an open source
-dataset from Google DeepMind available through the 🤗 hub [deepmind/math_dataset](https://huggingface.co/datasets/deepmind/math_dataset). This large dataset consists of code generating mathematical problems and their solutions to a variety of tasks across unique mathematical disciplines.
+dataset from Google DeepMind available through the 🤗 hub at [deepmind/math_dataset](https://huggingface.co/datasets/deepmind/math_dataset). This large dataset consists of code generating mathematical problems and their solutions to a variety of tasks across unique mathematical disciplines.
 
 In this preliminary family of CyberSolve models, we are specifically interested in understanding the ability of neural models to solve non-trivial mathematical tasks. As such, the CyberSolve **LinAlg 1.x** family of models is trained on a set of 2M simpler, one-dimensional linear equations.
 We preprocessed the data and simulated the training on a smaller, downsampled set of the dataset before training for multiple epochs over the dataset's entirety. This model in particular has been trained for 2 additional epochs, limited only by funds, beyond the original *CyberSolve LinAlg 1.1* checkpoint.