Review!
Since Llama-3 dropped I've messed with your models. All have had issues here or there, especially the Porpoise ones; what I mean is formatting problems, weird outputs, and not being super uncensored. This one, however, has been perfect thus far: it hasn't refused anything, hasn't asked for permission at every step of the way, and hasn't given me a peep of complaint. All in all, whatever you did here, you're on the right path. Also, it's quick as hell. Thanks!
You and me both. We've all seen those issues.
This model is an experimental gem by @jeiku; I'm sure they'll be happy to hear your positive feedback.
@Nitral-AI's Poppys are sweet as well; "teamwork makes the dream work".
I humbly offer my free time to upload my own quants, since I'm already making them anyway.
Make sure you are using the official presets provided with Poppy in the repo. Version 0.7 has been getting fairly positive results from those using it correctly.
However, I can agree SOVL_Llama3_8B is stellar in my own testing. <3 @jeiku
Also, I just played around with this model in IQ4_XS. When using the Poppy Porpoise samplers it goes all weird and spits out random things from the context word for word, but with the 3.1.0 preset it doesn't; I'm guessing it's down to the really high temperature (5).
When using the Poppy context and instruct presets with the 3.1.0 samplers, this model is nearing Solar-level understanding of character cards, like picking up that a character has a British accent, which the majority of Mistrals never did.
I still have to play around more though.
Thanks for the heads up. 5 is pretty high for temperature if there aren't other samplers bringing it down, like Quadratic Sampling. I'll keep the recommendation for the 3.1.0 preset for now.
What backend?
The preset uses Smooth Sampling and Min P; if you think the temp is too high, drop it to 3 or 1.
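For anyone wondering why a temperature of 5 can still behave when Min P is in the mix: Min P discards every token whose probability falls below a set fraction of the top token's probability, so even a very flat distribution only samples from plausible candidates. A minimal sketch of that math (illustrative only, not the actual backend code; the logits and cutoff here are made up):

```python
import math

def min_p_filter(logits, temperature=5.0, min_p=0.3):
    """Illustrative sketch: temperature scaling followed by Min P pruning.

    Min P keeps only tokens whose probability is at least
    min_p * (probability of the most likely token).
    """
    # Temperature scaling flattens the distribution.
    scaled = [l / temperature for l in logits]
    # Softmax (shifted by the max for numerical stability).
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Min P: prune tokens below a fraction of the top probability.
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    # Renormalize the survivors.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]

# Even at temperature 5, the two weakest tokens here get pruned
# before anything is sampled.
probs = min_p_filter([10.0, 8.0, 2.0, -3.0], temperature=5.0, min_p=0.3)
```

As I understand it, Smooth/Quadratic Sampling works toward the same end by reshaping the logits themselves, which is why the preset can get away with such a high temperature.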
https://github.com/Nexesenex/kobold.cpp/releases
1.63d
It uses less VRAM :3
I don't usually mess with samplers, so I kinda just bounce from preset to preset. I'll try lowering the temp.
Tested with this branch and it seems fine. Please report back when you try the alternative temps so I can adjust the presets to the best approximations for general users. (I'm trying to provide the easiest drag-and-drop solution for Llama-3 models at the moment.)
Here are some outputs so you can see for yourself. I've never excelled at English, so I can't pick up on the small details, but I'd call 5 pretty bad in comparison to the others. I like 3 though; the detail about the lantern casting light is a nice touch.
Temp: 5
(For context, he has the torch that never goes out not the user)
Temp: 3
Temp: 2.5
Temp: 2
Temp: 1.5
Temp: 1
Edit - Fix wording
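To make the comparison above concrete: temperature divides the logits before the softmax, so higher values flatten the distribution and hand probability mass to weaker tokens. A quick sketch with made-up logits for three candidate tokens (nothing model-specific):

```python
import math

def token_probs(logits, temperature):
    """Softmax over temperature-scaled logits."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 0.0]  # made-up scores for three candidate tokens
for t in (1.0, 2.5, 5.0):
    print(t, [round(p, 3) for p in token_probs(logits, t)])
```

At temperature 1 the top token dominates; at 5 the three probabilities sit much closer together, which lines up with the degradation visible in the samples above.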
This model seems to prefer formats like *actions* "utterance". Other formats like actions "utterance" and *actions utterance* don't perform as well.
@Bit101 This is intentional; that's the goal, and that's usually the most popular RP formatting for ST. Personally, I'm a big fan of it.
Yeah, I've always written them the way this model prefers. I just realized that tools like chargen, which help make cards, say it's bad to mix: use only novel style or only markdown, never a combo. I dunno, I like the combo better; it feels more natural to me.
I have a lot of characters that I've collected, so it's a bit annoying when I can only use two-thirds of them and that's "intentional".
Poppy is Nitral's line of models; mine were the Aura/Aurora models. Thanks for your feedback, happy to hear you like it!
I'm not having any trouble with *action* "utterance"; it's the only format I use, and I gave this model special training for it, but it should still do the other way fine. No experience with *action* utterance though, so maybe it doesn't work with that. Be sure that your card's example messages are formatted the same way the intro and text are, as you will have issues if not.
This is an issue that has plagued Llama-3 8Bs; Jeiku is trying to alleviate it, but it's not simple. Let's cross our fingers.
The *action* utterance format seems to be the bane of Llama-3 models, but I just gave in and switched to *action* "utterance" with a larger text size so I can read it.
From my experience, if the character uses *action* "utterance" you can respond in any format and the character won't break their formatting. That doesn't work when the character uses any other type of formatting, so I'd guess its strongest format is *action* "utterance".