Spaces: Qwen/qwen1.5-MoE-A2.7B-Chat-demo

How is the inference so fast in this free hardware space?

by mahiatlinux - opened Apr 7

Apr 7

How is the inference so fast in this free hardware space?

Apr 7

Because that's the advantage of this architecture.
You're really only using about 2.7B parameters to generate each token.
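
To make the "only about 2.7B used per token" point concrete, here is a rough sketch of top-k expert routing in a mixture-of-experts layer. It is a toy illustration, not the actual Qwen1.5-MoE code: the sizes (`DIM`, `NUM_EXPERTS`, `TOP_K`), the function names, and the plain linear "experts" are all assumptions made up for this example.

```python
import math
import random

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    m = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, router, k):
    """Run one token vector through a toy MoE layer; only k experts are evaluated."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    routes = top_k_route(logits, k)
    out = [0.0] * len(x)
    for idx, weight in routes:
        # Each hypothetical expert is just a DIM x DIM linear map here.
        y = [sum(w * xi for w, xi in zip(row, x)) for row in experts[idx]]
        out = [o + weight * yi for o, yi in zip(out, y)]
    return out, [idx for idx, _ in routes]

random.seed(0)
DIM, NUM_EXPERTS, TOP_K = 4, 8, 2  # toy sizes, not the real model's configuration
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
router = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

token = [random.gauss(0, 1) for _ in range(DIM)]
output, used = moe_forward(token, experts, router, TOP_K)

total_params = NUM_EXPERTS * DIM * DIM   # all expert weights stored in memory
active_params = TOP_K * DIM * DIM        # expert weights actually multiplied for this token
print(f"experts used: {used}")
print(f"active expert params per token: {active_params} of {total_params}")
```

In this toy setup every expert's weights still have to sit in memory, but each token only pays the compute cost of `TOP_K` of them, which is the general reason a sparse model can generate faster than a dense model of the same total size.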

Qwen org Apr 18

Haha, it uses an API service; it's not actually running on this free hardware Space.

jklj077 changed discussion status to closed Apr 18
