- Selective Attention Improves Transformer
  Paper • 2410.02703 • Published • 23
- Differential Transformer
  Paper • 2410.05258 • Published • 165
- TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
  Paper • 2410.05076 • Published • 6
- SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs
  Paper • 2410.13276 • Published • 24
Peter
Tempo14
AI & ML interests
None yet
Organizations
None yet
Collections
55
- XGen-7B Technical Report
  Paper • 2309.03450 • Published • 8
- FLM-101B: An Open LLM and How to Train It with $100K Budget
  Paper • 2309.03852 • Published • 43
- Robotic Table Tennis: A Case Study into a High Speed Learning System
  Paper • 2309.03315 • Published • 6
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
Models
5
Datasets
None public yet