How is the inference so fast in this free hardware space?
#1 by mahiatlinux - opened
How is the inference so fast in this free hardware space?
Because that's the advantage of this architecture.
You're really only using about 2.7B parameters to generate each token.
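To put a rough number on the point above, here is a back-of-the-envelope sketch. It assumes the common rule of thumb of about 2 FLOPs per active parameter per token for a forward pass. The 14B dense model is a made-up comparison point, not a figure from this thread, and `flops_per_token` is just an illustrative helper.

```python
def flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost: about 2 FLOPs per active parameter per token."""
    return 2.0 * active_params


ACTIVE_PARAMS = 2.7e9        # figure quoted in the reply above
HYPOTHETICAL_DENSE = 14e9    # assumed dense model size, for comparison only

active_cost = flops_per_token(ACTIVE_PARAMS)
dense_cost = flops_per_token(HYPOTHETICAL_DENSE)
speedup = dense_cost / active_cost

print(f"{active_cost:.2e} vs {dense_cost:.2e} FLOPs per token")
print(f"about {speedup:.1f}x less compute per token")
```

This only estimates compute per token. Real latency also depends on memory bandwidth, batching, and where the model is actually served.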
Haha, it uses an API service; it's not actually running in this free hardware space.
jklj077 changed discussion status to closed