FlashQLA
High-performance linear attention kernel library built on TileLang.
FlashQLA is open source and targets GDN chunked prefill, with fused forward and backward optimizations and performance tuning for NVIDIA Hopper GPUs.