Running • 1.79k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Domino: Eliminating Communication in LLM Training via Generic Tensor Slicing and Overlapping Paper • 2409.15241 • Published Sep 23, 2024 • 1
Scaling Laws for Floating Point Quantization Training Paper • 2501.02423 • Published Jan 5 • 26
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 256
Post: Native tensor parallel has landed in transformers!!! https://github.com/huggingface/transformers/pull/34184 Thanks a lot to the torch team for their support! Contributions are welcome to support more models!