How to do batch reasoning? #15
by stu-lupeng

How am I going to do batch reasoning with this model?
Hey @stu-lupeng - could you clarify what you mean by 'batch reasoning'? As in 'batched inference'? The `Translator` class provided in the seamless_communication repository only supports batch size = 1: inference/translator
You can use a lower-level API from the seamless_communication repo if you want to explore running batched inference. The `predict` function looks like it has the workings of a function that could generalise to being batched: inference/translator/predict

I would use this as a starting point! In the meantime, a sequential loop over `predict` is sketched below.
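Until proper batching exists, one low-effort workaround (not true batching, just batch-like ergonomics) is to loop over your inputs with the existing `predict` API. A minimal sketch, assuming the v1 `seamless_communication` package layout and the medium model/vocoder card names - swap in whatever you are actually running:

```python
import torch
from seamless_communication.models.inference import Translator

# Model and vocoder card names are assumptions - use the ones you run.
translator = Translator(
    "seamlessM4T_medium",
    "vocoder_36langs",
    torch.device("cuda:0"),
)

texts = [
    "Hello, how are you?",
    "The weather is lovely today.",
]

# predict() only handles one input at a time, so this is a sequential
# loop rather than true batched decoding.
translations = []
for text in texts:
    result = translator.predict(text, "t2tt", tgt_lang="fra", src_lang="eng")
    # The first element of the returned tuple is the translated text;
    # the exact return shape has varied across repo versions.
    translations.append(str(result[0]))

print(translations)
```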
For reference, the `transformers` integration of Seamless M4T is well underway with @ylacombe. This will support batched inference out of the box.
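For a preview of what that could look like: batched text-to-text translation would follow the standard `transformers` processor + `generate` pattern. This is a sketch under the assumption that the integration exposes a `SeamlessM4TForTextToText` class and a checkpoint named `facebook/hf-seamless-m4t-medium` (both names are assumptions until the integration lands):

```python
from transformers import AutoProcessor, SeamlessM4TForTextToText

# Assumed checkpoint name for the transformers port.
model_id = "facebook/hf-seamless-m4t-medium"
processor = AutoProcessor.from_pretrained(model_id)
model = SeamlessM4TForTextToText.from_pretrained(model_id)

texts = [
    "Hello, how are you?",
    "The weather is lovely today.",
]

# Pad to the longest sequence so the whole list runs as one batch.
inputs = processor(text=texts, src_lang="eng", padding=True, return_tensors="pt")

# generate() accepts the full padded batch in a single call.
output_tokens = model.generate(**inputs, tgt_lang="fra")

print(processor.batch_decode(output_tokens, skip_special_tokens=True))
```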