
Cognitive Computations (community)

AI & ML interests: Supervised Fine-Tuning, DPO, and unalignment
Recent Activity
- poor performance for DeepSeek-V3-AWQ (2 comments) · #9 opened 2 days ago by fridayl
- The V3-AWQ model's response seems not as expected (12 comments) · #8 opened 4 days ago by juxing
- Can't get 48 TPS on 8x H800 (1 comment) · #21 opened 6 days ago by Light4Bear
- Pipeline Parallelism (1 comment) · #20 opened 6 days ago by leo98xh
- 8x A100 out of memory (1 comment) · #19 opened 6 days ago by Jaren
- requests get stuck when sending long prompts (already solved, but still don't know why) (1 comment) · #18 opened 7 days ago by uv0xab
- Significant Speed Drop with Increasing Input Length on H800 GPUs (2 comments) · #17 opened 7 days ago by wangkkk956
- Docker start with vLLM failed (official vLLM Docker image 0.7.3) (1 comment) · #7 opened 7 days ago by kuliev-vitaly
- When I use vLLM v0.7.2 to deploy R1 AWQ, I get empty content (13 comments) · #10 opened 15 days ago by bupalinyu
- Why "MLA is not supported with awq_marlin quantization. Disabling MLA." with 32x 4090 (4 nodes / vLLM 0.7.2)? (3 comments) · #14 opened 8 days ago by FightLLM
- When I run the command, it did not work (via vLLM 0.7.3) (2 comments) · #16 opened 8 days ago by xueshuai
- skips the thinking process (11 comments) · #5 opened 20 days ago by muzizon
- Can anyone run this model with the SGLang framework? (2 comments) · #13 opened 8 days ago by muziyongshixin
- GPTQ Support (2 comments) · #1 opened about 2 months ago by warlock-edward
- vLLM support for A100 (17 comments) · #2 opened about 2 months ago by HuggingLianWang
- Code used to convert this / could you do v3 base? (1 comment) · #3 opened about 1 month ago by deltanym
- What calibration dataset do you use when applying AWQ? (2 comments) · #5 opened 15 days ago by HandH1998
- Deployment framework (27 comments) · #2 opened about 1 month ago by xro7
- MLA is not supported with moe_wna16 quantization. Disabling MLA. (5 comments) · #7 opened 17 days ago by AMOSE
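Most of the deployment threads above revolve around serving the AWQ-quantized DeepSeek checkpoints with vLLM on a multi-GPU node. As a rough sketch only (the exact flags, parallel sizes, and context length below are assumptions to adapt to your hardware, not taken from the threads), a typical launch looks like:

```shell
# Sketch: serve the AWQ checkpoint with vLLM on a single 8-GPU node.
# Tensor-parallel size and max context length are assumed values; several
# threads above hit OOM or throughput issues precisely at these settings,
# so tune them for your GPUs (e.g. H800 vs A100 memory).
vllm serve cognitivecomputations/DeepSeek-V3-AWQ \
  --tensor-parallel-size 8 \
  --max-model-len 16384 \
  --trust-remote-code
```

vLLM picks up the AWQ quantization from the checkpoint config, so an explicit quantization flag is usually unnecessary; the "MLA is not supported with awq_marlin quantization" message in the threads is emitted at startup when the chosen quantization kernel cannot use MLA attention.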