"'RobertaFlashAttention2' object has no attribute 'q_lin'"
#1 opened by wwydmanski
Good morning!
The model works great as long as I don't try to use flash attention, but as soon as I add that line, it throws the error from the title:
Traceback (most recent call last):
File "/opt/conda/lib/python3.11/site-packages/runpod/serverless/modules/rp_job.py", line 164, in run_job
handler_return = handler(job)
File "/workspace/handler.py", line 44, in handler
scores = model(**features).logits.squeeze(-1)
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/sdadas/polish-reranker-roberta-v2/7296f335a1f0a6bc771df58e6c1319e8068fa583/modeling_roberta.py", line 1577, in forward
outputs = self.roberta(
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/root/.cache/huggingface/modules/transformers_modules/sdadas/polish-reranker-roberta-v2/7296f335a1f0a6bc771df58e6c1319e8068fa583/modeling_roberta.py", line 1214, in forward
encoder_outputs = self.encoder(
File "/root/.cache/huggingface/modules/transformers_modules/sdadas/polish-reranker-roberta-v2/7296f335a1f0a6bc771df58e6c1319e8068fa583/modeling_roberta.py", line 790, in forward
self_attention_outputs = self.attention(
File "/root/.cache/huggingface/modules/transformers_modules/sdadas/polish-reranker-roberta-v2/7296f335a1f0a6bc771df58e6c1319e8068fa583/modeling_roberta.py", line 717, in forward
self_outputs = self.self(
File "/root/.cache/huggingface/modules/transformers_modules/sdadas/polish-reranker-roberta-v2/7296f335a1f0a6bc771df58e6c1319e8068fa583/modeling_roberta.py", line 411, in forward
target_dtype = self.q_lin.weight.dtype
AttributeError: 'RobertaFlashAttention2' object has no attribute 'q_lin'
I'm using this build of flash attention:
https://github.com/Dao-AILab/flash-attention/releases/download/v2.6.3/flash_attn-2.6.3+cu123torch2.4cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
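For reference, a minimal sketch of the kind of loading code involved, assuming the standard transformers API; the exact from_pretrained call from handler.py is not shown in this thread, so the attn_implementation, torch_dtype, and trust_remote_code arguments below are illustrative assumptions rather than the original code:

```python
# Minimal sketch, assuming the standard transformers API; the exact loading
# code is not shown in this thread, so these arguments are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "sdadas/polish-reranker-roberta-v2"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,               # flash attention requires fp16/bf16
    attn_implementation="flash_attention_2",  # the option that triggers the error
    trust_remote_code=True,                   # model ships a custom modeling_roberta.py
).to("cuda").eval()

# Illustrative query/passage pair for the reranker
queries = ["ile osób mieszka w Warszawie?"]
passages = ["Warszawa ma około 1,8 miliona mieszkańców."]
features = tokenizer(queries, passages, padding=True, truncation=True,
                     return_tensors="pt").to("cuda")

with torch.no_grad():
    # same call as handler.py line 44 in the traceback above
    scores = model(**features).logits.squeeze(-1)
```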
I have version 2.6.3 installed directly from PyPI (https://pypi.org/project/flash-attn/#history) and this error does not occur for me.