patch inference on CPU & Windows + Update README snippets

#2
by tomaarsen HF staff - opened

Hello!

Pull Request overview

Details

Regarding the reference_compile config change: as long as that option is set in the config, parts of the model are always compiled, even if the user does not have triton installed (a core requirement for compilation) or is running on CPU (which isn't compatible with compilation). Removing the option from the config avoids this forced compilation.
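As a rough illustration of the reasoning above, here is a minimal sketch of how a caller might decide for themselves whether compilation is viable before loading the model. The helper names `triton_installed` and `resolve_reference_compile` are hypothetical and not part of this PR or of any library; the only conditions checked are the two mentioned above (triton availability and running on CPU).

```python
import importlib.util
from typing import Optional


def triton_installed() -> bool:
    """Return True if the `triton` package can be located in this environment."""
    return importlib.util.find_spec("triton") is not None


def resolve_reference_compile(device_type: str, has_triton: Optional[bool] = None) -> bool:
    """Hypothetical helper: decide whether model compilation should be enabled.

    Compilation is only considered viable when the model runs on a CUDA
    device and triton is available. On CPU, or without triton, it is disabled.
    """
    if has_triton is None:
        # Fall back to probing the current environment.
        has_triton = triton_installed()
    return device_type == "cuda" and has_triton


if __name__ == "__main__":
    # On a CPU-only machine this prints False regardless of triton.
    print(resolve_reference_compile("cpu"))
```

The returned value could then be supplied as reference_compile when loading the model, assuming the loading call forwards that keyword to the model config. With the option removed from the config, as this PR does, such an override should only be needed by users who want to force a particular behaviour.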

  • Tom Aarsen
tomaarsen changed pull request status to open
thenlper changed pull request status to merged
