Can you explain how you achieved the emotions? Was it the reference voice that had the emotion or was it in the text prompt?