Inquiry About Response Errors During eval

#13
by HaksooMoon - opened

Hello, first of all, thank you for releasing the model.

While evaluating the model's performance, certain characters/words appear repeatedly in the responses during eval. Someone else seems to have posted a similar issue in the discussions. The same problem appears to occur when the model is deployed on a GPU.

The environment I tested on is AWS SageMaker with an A100 40GB GPU. Is there a way to run the model without these issues in a GPU environment?

For reference, the tests were conducted with Transformers version 4.41.

Thank you!

LG AI Research org
edited Aug 19, 2024

Could you please provide the information below?
- input with special tokens:
[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Tell me your wish\n[|assistant|]
- output with special tokens:
As the EXAONE 3.0 model, my main purpose is to provide accurate and useful information to users. I was designed to answer various questions, solve problems, and support learning and research. If there is any other way I can help, please let me know![|endofturn|]
- generation configurations:
A100 40GB, torch==2.3.1, transformers==4.43.3, flash-attn==2.6.3, accelerate==0.33.0, greedy search (or top-k=?, top-p=?, ...), repetition penalty=1.0, ...
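For anyone reproducing the report above, the prompt-with-special-tokens layout can be assembled programmatically before passing it to the tokenizer. A minimal sketch following the example input above (the helper name is hypothetical, not part of the EXAONE API):

```python
# Build an EXAONE-style prompt from one system turn and one user turn,
# following the special-token layout shown in the example above.
# The assistant turn is left open so the model continues from there.

def build_exaone_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation with EXAONE special tokens."""
    return (
        f"[|system|]{system}[|endofturn|]\n"
        f"[|user|]{user}\n"
        f"[|assistant|]"
    )

prompt = build_exaone_prompt(
    "You are EXAONE model from LG AI Research, a helpful assistant.",
    "Tell me your wish",
)
print(prompt)
```

In practice, `tokenizer.apply_chat_template` in recent Transformers versions produces this layout from a list of chat messages, so a manual builder like this is mainly useful for verifying that the exact token sequence matches what the maintainers requested.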

The above issue has been resolved.
However, could you provide the performance details for HumanEval-x?

LG AI Research org

Unfortunately, the HumanEval-x benchmark was not evaluated.

Ok. Thank you!

HaksooMoon changed discussion status to closed
