use in web browser #3
opened by ciekawy
ok, I managed to run everything with transformers v3 branch and latest onnxruntime-web
thanks to https://github.com/microsoft/onnxruntime/issues/20876
However, I noticed that the WASM backend is up to 2x faster than WebGPU on an Apple M3 with enough RAM (using the quantized model, measuring single extractor calls).
I would recommend setting the dtype to fp16 or q4, for example: `await pipeline('feature-extraction', 'Xenova/bge-m3', { dtype: 'fp16' })`.
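To reproduce the WASM-vs-WebGPU comparison above, a minimal timing sketch could look like the following. It assumes transformers.js v3 (published as `@huggingface/transformers`) and its `device`/`dtype` pipeline options; the specific dtypes chosen per backend here are illustrative, and the model is downloaded on first use, so this needs to run in a browser context with network access.

```javascript
// Sketch: time a single feature-extraction call on each backend.
// Assumes transformers.js v3; model weights are fetched on first use.
import { pipeline } from '@huggingface/transformers';

async function timeBackend(device, dtype) {
  // Create the extractor pinned to one backend and quantization level.
  const extractor = await pipeline('feature-extraction', 'Xenova/bge-m3', {
    device, // 'wasm' or 'webgpu'
    dtype,  // e.g. 'q8', 'q4', 'fp16'
  });

  const start = performance.now();
  const output = await extractor('Hello world', { pooling: 'cls', normalize: true });
  const ms = performance.now() - start;

  console.log(`${device}/${dtype}: ${ms.toFixed(1)} ms, dims=${output.dims}`);
  return ms;
}

const wasmMs = await timeBackend('wasm', 'q8');
const webgpuMs = await timeBackend('webgpu', 'fp16');
console.log(`wasm/webgpu ratio: ${(webgpuMs / wasmMs).toFixed(2)}`);
```

Note that this measures a single call, as in the report above; first-call times include session warm-up, so averaging over several calls after a warm-up run would give a fairer comparison.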