Garbage output ?

#30

by danielus - opened Jul 24

Jul 24

Is it just me or is there something wrong with this model? I can't even translate simple sentences from Italian to English, he seems constantly hallucinating. Both the version on Groq, the local quantized q4_0 version, and the fp16 version via vLLM, this model can't follow the instructions and goes on its own

Here an example

And when he actually does manage to translate it, he mistranslates it.

The strange thing is that I'm searching around but I only find people talking about the amazing benchmarks and no one is complaining, so at this point I guess I'm the problem 😂😂😂

danielus

Jul 24

•

edited Jul 24

Another example just for completeness, with vLLM fp16 version running on a VM istance with L4 on Google Cloud:

The translation is really bad

FoxMulder45

Jul 24

Another example just for completeness, with vLLM fp16 version running on a VM istance with L4 on Google Cloud:

The translation is really bad
You tell it to forget everything it knows, it forgets languages :)

CHNtentes

Jul 24

I don't know Italian, but I tested English to Chinese translation. While it is obviously worse than Claude 3.5, it is not as bad as "garbage".

danielus

Jul 24

•

edited Jul 24

I don't know Italian, but I tested English to Chinese translation. While it is obviously worse than Claude 3.5, it is not as bad as "garbage".

Yes, I would understand if it is a little wrong, but the translation is just an example, it fails even in the simplest tasks such as a dialogue or a summary, the feeling is that it is a base model and not instruct, because sometimes it just goes off on its own in its reasoning and dialogues, I would say almost unusable, I would specify that the same prompt on models such as gem 9b or Mistral 7b work quite well in comparison

hrishbhdalal

Jul 24

why would you make it forget everything it knows haha? maybe try to just ask it to translate haha. do not lobotomize the poor fella

danielus

Jul 24

why would you make it forget everything it knows haha? maybe try to just ask it to translate haha. do not lobotomize the poor fella

hahahah this prompt works particularly well because often models add to the answer useless details and considerations, in fact works well even with model that are not specifically multilangual. (I admit I found this prompt lying around on huggingface and kept it because it worked particularly well)

AndyLuo1029

Jul 25

I use Llama3.1-8B-Instruct to do some multilingual translation task, it DO GENERATE GARBAGE OUTPUT like :

So sad can't fine-tune this model to be our newest evaluation model due to its instability.

jeeyah

15 days ago

I am having the same problem with non-quantized, default settings running on 48GB VRAM on runpod.com.

I asked it to generate a story idea given a setting of Los Angeles, 1943 in the style of Steven King.

It started out fine and then...
The central conflict of the story revolves around a series of seemingly unrelated murders, all of which take place in the dead of night, during the brutal blackouts that have become a necessary evil in the war-torn city. Hawk and Rachel team up to investigate a small crew of los angeles police department officers who park a black dodge and dodge the alleyway where there latest victim "Lola Lee craving iPad label consisting irridient floor and drink liquor ble lockcase"

A major twist in the plot comes when the unlikely pair stumble upon an oily former circus performer known only as Cazzo Vallance, whose uncanny likeness to the main witness - Roger Arthurll EMO ATT fix hairstyle Given Besar stop captain faculty investigating scoop es unknown positioning stabilization

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment