
Cognitive Computations (community)

AI & ML interests: Supervised Fine-Tuning, DPO, and unalignment
Recent Activity
- poor performance for DeepSeek-V3-AWQ (2 comments) · #9 opened 2 days ago by fridayl
- The V3-AWQ model's response seems not as expected (12 comments) · #8 opened 4 days ago by juxing
- Can't get 48 TPS on 8x H800 (1 comment) · #21 opened 6 days ago by Light4Bear
- Pipeline Parallelism (1 comment) · #20 opened 6 days ago by leo98xh
- 8x A100 out of memory (1 comment) · #19 opened 6 days ago by Jaren
- requests get stuck when sending long prompts (already solved, but still don't know why) (1 comment) · #18 opened 7 days ago by uv0xab
- Significant Speed Drop with Increasing Input Length on H800 GPUs (2 comments) · #17 opened 7 days ago by wangkkk956
- Docker start with vLLM failed (official vLLM Docker image 0.7.3) (1 comment) · #7 opened 7 days ago by kuliev-vitaly
- When I use vLLM v0.7.2 to deploy R1 AWQ, I get empty content (13 comments) · #10 opened 15 days ago by bupalinyu
- Why "MLA is not supported with awq_marlin quantization. Disabling MLA." with 32x 4090 (4 nodes / vLLM 0.7.2)? (3 comments) · #14 opened 8 days ago by FightLLM
- When I run the command, it did not work (via vLLM 0.7.3) (2 comments) · #16 opened 8 days ago by xueshuai
- skips the thinking process (11 comments) · #5 opened 20 days ago by muzizon
- Can anyone run this model with the SGLang framework? (2 comments) · #13 opened 8 days ago by muziyongshixin
- GPTQ Support (2 comments) · #1 opened about 2 months ago by warlock-edward
- vLLM support for A100 (17 comments) · #2 opened about 2 months ago by HuggingLianWang
- Code used to convert this / could you do v3 base? (1 comment) · #3 opened about 1 month ago by deltanym
- What calibration dataset do you use when applying AWQ? (2 comments) · #5 opened 15 days ago by HandH1998
- Deployment framework (27 comments) · #2 opened about 1 month ago by xro7
- MLA is not supported with moe_wna16 quantization. Disabling MLA. (5 comments) · #7 opened 17 days ago by AMOSE
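Most of the deployment threads above revolve around serving the AWQ-quantized DeepSeek checkpoints with vLLM on a multi-GPU node. As a rough sketch only (the exact flags, parallel sizes, and context length below are assumptions to adapt to your hardware, not taken from the threads), a typical launch looks like:

```shell
# Sketch: serve the AWQ checkpoint with vLLM on a single 8-GPU node.
# Tensor-parallel size and max context length are assumed values; several
# threads above hit OOM or throughput issues precisely at these settings,
# so tune them for your GPUs (e.g. H800 vs A100 memory).
vllm serve cognitivecomputations/DeepSeek-V3-AWQ \
  --tensor-parallel-size 8 \
  --max-model-len 16384 \
  --trust-remote-code
```

vLLM picks up the AWQ quantization from the checkpoint config, so an explicit quantization flag is usually unnecessary; the "MLA is not supported with awq_marlin quantization" message in the threads is emitted at startup when the chosen quantization kernel cannot use MLA attention.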