The faith score went from 8% to 54%. Expect more updates and further increases in the score. I also did the instruction fine-tuning myself before adding faith to the model, so some of the improvement may come from the fact that I started with the Llama 3.1 base model and not the instruct version.
Here are some comparisons with original Llama 3.1: