Issue/932: add paged attention, paged caching, paged attention prefill operator referencing nvidia implementation #939

Open
spike-zhu wants to merge 3 commits into main from issue/932
Conversation

@spike-zhu
Contributor

Python test screenshots
paged attention:
image

paged_attention_prefill:
image

paged_caching:
image
