MarioBarbeque commited on
Commit
0aae6f5
·
verified ·
1 Parent(s): 25281be
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -27,7 +27,7 @@ The model weights of *CyberSolve LinAlg 1.2* are a further downstream checkpoint
27
 
28
 
29
  To construct **CyberSolve LinAlg 1.2**, the *FLAN-T5 large* model is fined-tuned using a custom PyTorch training loop optimized for multiple Nvidia A100 GPUs. We supervise a training of *FLAN-T5 large* on the *algebra__linear_1d* split of the Google DeepMind mathematics dataset, an open source
30
- dateset from Google DeepMind available through the 🤗 hub [deepmind/math_dataset](https://huggingface.co/datasets/deepmind/math_dataset). This large dataset consists of code generating mathematical problems and their solutions to a variety of tasks across unique mathematical disciplines.
31
 
32
  In this preliminary family of CyberSolve models, we are specifically interested in understanding the ability of neural models to solve non-trivial mathematical tasks. As such, the CyberSolve **LinAlg 1.x** family of models are trained on a set of 2M simpler, one-dimension linear equations.
33
  We preprocessed the data and simulated the training on a smaller, downsampled set of the dataset before training for multiple epochs over the dataset's entirety. This model in particular has been trained for 2 additional epochs, limited only by funds, beyond the original *CyberSolve LinAlg 1.1* checkpoint.
 
27
 
28
 
29
  To construct **CyberSolve LinAlg 1.2**, the *FLAN-T5 large* model is fined-tuned using a custom PyTorch training loop optimized for multiple Nvidia A100 GPUs. We supervise a training of *FLAN-T5 large* on the *algebra__linear_1d* split of the Google DeepMind mathematics dataset, an open source
30
+ dateset from Google DeepMind available through the 🤗 hub at [deepmind/math_dataset](https://huggingface.co/datasets/deepmind/math_dataset). This large dataset consists of code generating mathematical problems and their solutions to a variety of tasks across unique mathematical disciplines.
31
 
32
  In this preliminary family of CyberSolve models, we are specifically interested in understanding the ability of neural models to solve non-trivial mathematical tasks. As such, the CyberSolve **LinAlg 1.x** family of models are trained on a set of 2M simpler, one-dimension linear equations.
33
  We preprocessed the data and simulated the training on a smaller, downsampled set of the dataset before training for multiple epochs over the dataset's entirety. This model in particular has been trained for 2 additional epochs, limited only by funds, beyond the original *CyberSolve LinAlg 1.1* checkpoint.