Pull requests: flashinfer-ai/flashinfer

Pull requests list

Added the cudnn backend Ragged KV Cache wrapper
#2352 opened Jan 14, 2026 by Anerudhan
3 tasks done
Optimize quantization function in large problem size
#2343 opened Jan 13, 2026 by Shunkangz
5 tasks done
Fused MoE non gated Relu2 NVFP4 & FP8
#2304 opened Jan 7, 2026 by amitz-nv Draft
5 tasks
feat: Add FP8/NVFP4 quant fusion for MNNVL Allreduce
#2263 opened Dec 24, 2025 by timlee0212
5 tasks done
bugfix: skip CUTLASS kernel generation when AOT cache exists
#2248 opened Dec 19, 2025 by yongwww
3 of 5 tasks
fix: Handle zeros in Mistral Large 3 MoE inference
#2238 opened Dec 18, 2025 by dbari
9 tasks done
misc: support checks unit test tracking
#2224 opened Dec 16, 2025 by jimmyzho
5 tasks
refactor: update fa3 codebase [part 2]
#2192 opened Dec 9, 2025 by yzh119
4 of 5 tasks
Add CUDA graph buffers for persistent attention
#2185 opened Dec 7, 2025 by Edenzzzz
5 tasks
feat: add sink to flashinfer decode
#2087 opened Nov 13, 2025 by djmmoss
Blockwise GEMM with all reduce overlapping
#2007 opened Oct 30, 2025 by Amir-19 Draft
5 tasks