Pull requests: flashinfer-ai/flashinfer
Added the cudnn backend Ragged KV Cache wrapper (#2352) · opened Jan 14, 2026 by Anerudhan · 3 tasks done
feat: add per-request generator support for sampling kernels (#2345) · opened Jan 13, 2026 by yzh119
Optimize quantization function for large problem sizes (#2343) · opened Jan 13, 2026 by Shunkangz · 5 tasks done
feat: add LSE return support to TRT LLM attention kernels (#2332) · opened Jan 11, 2026 by yzh119
feat: expose swizzled_input_sf parameter for CUTLASS fused MOE (#2330) · opened Jan 11, 2026 by yzh119
[wip] feat: add bias support to TGV and CUTLASS BF16 GEMM (#2329) · opened Jan 11, 2026 by yzh119
feat: add batch_invariant option to trtllm decode functions (#2321) · opened Jan 9, 2026 by yzh119
[Perf][Feature] Add SM103-specific schedulers for NVFP4 CUTLASS kernels (#2303) · opened Jan 7, 2026 by LopezCastroRoberto
chore: Update XFails Report (#2287) · labels: automated, maintenance, testing · opened Jan 5, 2026 by flashinfer-bot
feat: Add FP8/NVFP4 quant fusion for MNNVL Allreduce (#2263) · opened Dec 24, 2025 by timlee0212 · 5 tasks done
chore: add __all__ exports to Python modules and document missing APIs (#2251) · opened Dec 20, 2025 by yzh119 · 5 tasks
bugfix: skip CUTLASS kernel generation when AOT cache exists (#2248) · opened Dec 19, 2025 by yongwww · 3 of 5 tasks
fix: Handle zeros in Mistral Large 3 MoE inference (#2238) · opened Dec 18, 2025 by dbari · 9 tasks done
refactor: pull trtllm-gen batch-gemm/gemm headers from artifactory; update tma descriptor shape init (#2235) · opened Dec 17, 2025 by jimmyzho · 5 tasks
Fix: Add mask_indptr conversion in BatchPrefillWithPagedKVCacheWrapper.plan() (#2201) · opened Dec 11, 2025 by Dutch-voyage · 5 tasks
Add CUDA graph buffers for persistent attention (#2185) · opened Dec 7, 2025 by Edenzzzz · 5 tasks
[Flashinfer-Bench integration] HF end-to-end inference (#2151) · opened Nov 30, 2025 by sfc-gh-goliaro · Draft · 5 tasks
Refactor flashinfer/__init__.py so that applications can selectively pack submodules without modifying __init__.py (#2027) · opened Nov 3, 2025 by bangshengtang · 5 tasks done