Haili TIAN
haili-tian
Recent Activity
deepseek-ai/DeepSeek-R1: Lite version for DeepSeek-R1? (new activity, 29 days ago)
deepseek-ai/DeepSeek-R1-Distill-Llama-70B: weight files naming is not regular rule (new activity, about 1 month ago)
deepseek-ai/DeepSeek-R1-Distill-Qwen-32B: weight files naming is not regular rule (new activity, about 1 month ago)
haili-tian's activity
Lite version for DeepSeek-R1?
1
#137 opened 29 days ago by haili-tian
weight files naming is not regular rule
#13 opened about 1 month ago by haili-tian
weight files naming is not regular rule
#29 opened about 1 month ago by haili-tian
bos_token_id is defined incorrectly
1
#28 opened about 1 month ago by haili-tian
System Prompt
18
#2 opened about 2 months ago by Wanfq
What temp are these expected to be used at?
2
#6 opened about 2 months ago by rombodawg
running on local machine
7
#19 opened about 1 month ago by saidavanam
System Prompt
13
#2 opened about 2 months ago by Wanfq
Can not use HF transformers for inference?
#11 opened 4 months ago by haili-tian
max_window_layers is 70?
2
#1 opened 6 months ago by haili-tian
sliding_window is null?
1
#84 opened 6 months ago by haili-tian
Qwen1.5 series, I choose Qwen1.5-32B
#3 opened 10 months ago by haili-tian
Qwen1.5-32B?
#4 opened 10 months ago by haili-tian