I merged Aurelian with itself using mergekit, creating this EXTENDED LENGTH FRANKENSTEIN.
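A self-merge like this is just a mergekit passthrough merge that stacks duplicated layer ranges of the same model. Below is a minimal sketch of what such a recipe can look like; the model path and layer ranges are illustrative placeholders, not the actual DoubleGold config.

```python
# Minimal sketch of a passthrough self-merge with mergekit.
# The model path and layer ranges are placeholders for illustration only,
# NOT the actual recipe used for DoubleGold.
import subprocess

MERGE_CONFIG = """\
slices:
  - sources:
      - model: path/to/aurelian   # local path or HF repo of the base Aurelian model
        layer_range: [0, 50]
  - sources:
      - model: path/to/aurelian
        layer_range: [30, 80]     # overlapping ranges duplicate layers -> bigger model
merge_method: passthrough
dtype: float16
"""

with open("doublegold-sketch.yml", "w") as f:
    f.write(MERGE_CONFIG)

# mergekit-yaml is mergekit's standard CLI entry point.
subprocess.run(["mergekit-yaml", "doublegold-sketch.yml", "./merged-model"], check=True)
```

Passthrough does no weight averaging; it simply concatenates the chosen layer slices into a taller model.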
### Does it work?
Yes, at 17k it stays coherent, but starts to lose minor details of the story. Not sure how well it performs at 32k though. Quants have a significant impact on quality for this model: going from Q6_K to Q5_K caused a noticeable drop in quality.
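If you want to try it at the full context, here is a minimal sketch of loading a Q6_K GGUF quant with llama-cpp-python; the filename and settings are assumptions, not an official recommendation.

```python
from llama_cpp import Llama

# Hypothetical filename; point this at whichever Q6_K GGUF quant you actually have.
llm = Llama(
    model_path="DoubleGold-v0.1-123b-32k.Q6_K.gguf",
    n_ctx=32768,      # the merge targets 32k context
    n_gpu_layers=-1,  # offload all layers if VRAM allows
)

out = llm("Once upon a time,", max_tokens=64)
print(out["choices"][0]["text"])
```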
### Is it worth it?
Maybe? Depends? Do you hate mixtral? Do you have good hardware/patience? Do you need a somewhat smart model with 32k context?
### Known issues
GPTisms, GPTslop, fake words.
### Personal opinion
Dumber and more GPT-ish than Goliath, but compensates with context. Noticeably smarter and more neutral than current mixtral finetunes. Worth using until mixtral gets a proper finetune without toxic positivity or llama 3 comes out.
### Benchmarks
#### NeoEvalPlusN_benchmark
Test name | Aurelian | DoubleGold |
---|---|---|
B | 0 | 1 |
C | 1 | 2 |
D | 0 | 0 |
S | 1.25 | 4.5 |
P | 2 | 2.75 |
Total | 4.25 | 10.25 |
+75% in size, +141% in meme benchmark performance!!!
#### Politiscales test
name | whacky | left/right |
---|---|---|
ChuckMcSneed/DoubleGold-v0.1-123b-32k | 1.332327071 | 2.481283157 |