Pull requests: flashinfer-ai/flashinfer
Added the cudnn backend Ragged KV Cache wrapper (#2352) · opened Jan 14, 2026 by Anerudhan · 3 tasks done
feat: add per-request generator support for sampling kernels (#2345) · opened Jan 13, 2026 by yzh119
Optimize quantization function for large problem sizes (#2343) · opened Jan 13, 2026 by Shunkangz · 5 tasks done
feat: add LSE return support to TRT LLM attention kernels (#2332) · opened Jan 11, 2026 by yzh119
feat: expose swizzled_input_sf parameter for CUTLASS fused MOE (#2330) · opened Jan 11, 2026 by yzh119
[wip] feat: add bias support to TGV and CUTLASS BF16 GEMM (#2329) · opened Jan 11, 2026 by yzh119
feat: add batch_invariant option to trtllm decode functions (#2321) · opened Jan 9, 2026 by yzh119
[Perf][Feature] Add SM103-specific schedulers for NVFP4 CUTLASS kernels (#2303) · opened Jan 7, 2026 by LopezCastroRoberto
chore: Update XFails Report (#2287) · labels: automated, maintenance, testing · opened Jan 5, 2026 by flashinfer-bot
feat: Add FP8/NVFP4 quant fusion for MNNVL Allreduce (#2263) · opened Dec 24, 2025 by timlee0212 · 5 tasks done
chore: add __all__ exports to Python modules and document missing APIs (#2251) · opened Dec 20, 2025 by yzh119 · 5 tasks
bugfix: skip CUTLASS kernel generation when AOT cache exists (#2248) · opened Dec 19, 2025 by yongwww · 3 of 5 tasks
fix: Handle zeros in Mistral Large 3 MoE inference (#2238) · opened Dec 18, 2025 by dbari · 9 tasks done
refactor: pull trtllm-gen batch-gemm/gemm headers from artifactory; update tma descriptor shape init (#2235) · opened Dec 17, 2025 by jimmyzho · 5 tasks
Fix: Add mask_indptr conversion in BatchPrefillWithPagedKVCacheWrapper.plan() (#2201) · opened Dec 11, 2025 by Dutch-voyage · 5 tasks
Add CUDA graph buffers for persistent attention (#2185) · opened Dec 7, 2025 by Edenzzzz · 5 tasks
[Flashinfer-Bench integration] HF end-to-end inference (#2151) · opened Nov 30, 2025 by sfc-gh-goliaro · Draft · 5 tasks
Refactor flashinfer/__init__.py so that applications can selectively pack submodules without modifying __init__.py (#2027) · opened Nov 3, 2025 by bangshengtang · 5 tasks done