Coding performance of base model?

#11
by rombodawg - opened

Are you able to bench the fp16 model files on HumanEval?

I'd love to see how the coding performance is, especially considering it's a mixture-of-experts model, and those generally do well.

It hasn't been fine-tuned for any task, so it would probably make sense to wait until something like EVOL-Instruct, Chain of Code, or whatever is most current this week has been applied to the base model before doing a code eval.

@ricofix base coding performance is just as important as any other type of eval. Trust me, you want to bench it before it's fine-tuned.

HumanEval Score is about 32.9%.
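
For anyone wondering how a figure like this is usually derived: HumanEval results are commonly reported as pass@k, averaged over the benchmark's 164 problems. The thread doesn't say which k or which decoding settings were used here, so the sketch below is only an illustration of the standard unbiased pass@k estimator, not the actual setup behind the number above. The function names `pass_at_k` and `benchmark_score` and the per-problem counts are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the unit tests,
    k: number of attempts the metric allows.
    """
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over all problems; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Illustrative counts only (an assumption, not data from this thread):
# one sample per problem, 54 of 164 problems solved.
example = [(1, 1)] * 54 + [(1, 0)] * 110
print(f"pass@1 = {benchmark_score(example):.1%}")
```

With a single sample per problem, pass@1 reduces to the plain fraction of problems solved, which is why one-sample greedy runs are often quoted as a single percentage.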

Thanks @TechxGenus, you are the GOAT.
