Inference/Serving
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters (arXiv 2311.03285, published Nov 6, 2023)
Training/Parallelism
- Ultra-Long Sequence Distributed Transformer (arXiv 2311.02382, published Nov 4, 2023)
- Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster (arXiv 2311.08263, published Nov 14, 2023)