Enable optimized dynamic quantization on aarch64 #126687

jondea · 2024-05-20T15:37:14Z

oneDNN+ACL has optimized kernels for s8s8 matmul, so on aarch64 the input is quantized to signed int8. Behaviour on all other platforms is unchanged. This change depends on intel/ideep#313 being merged, and needs oneDNN 3.5 for the optimized kernels. It speeds up dynamic quantized linear by ~10x.
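For readers unfamiliar with the terminology, here is a minimal pure-Python sketch of what an s8s8 dynamic quantized linear computes: activations are quantized to signed int8 at runtime, weights are signed int8, and the dot product accumulates in integers before rescaling. It is an illustration only, not the code in this PR and not the oneDNN or ACL implementation; all function names are hypothetical, and the asymmetric-activation, symmetric-weight scheme is an assumption.

```python
# Illustrative sketch only: hypothetical helper names, not PyTorch/oneDNN APIs.

def quantize_activation_s8(values):
    """Affine-quantize floats to signed int8 using a range measured at runtime."""
    qmin, qmax = -128, 127
    lo = min(min(values), 0.0)  # the range must include zero
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid a zero scale for all-zero input
    zero_point = max(qmin, min(qmax, round(qmin - lo / scale)))
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def quantize_weight_s8(row):
    """Symmetric signed int8 quantization of one weight row (zero point is 0)."""
    scale = max(abs(w) for w in row) / 127 or 1.0
    return [max(-127, min(127, round(w / scale))) for w in row], scale


def dynamic_linear_s8s8(x, weight):
    """Compute y = W x with int8 inputs and weights and integer accumulation."""
    qx, sx, zx = quantize_activation_s8(x)
    out = []
    for row in weight:
        qw, sw = quantize_weight_s8(row)
        acc = sum((a - zx) * b for a, b in zip(qx, qw))  # integer dot product
        out.append(acc * sx * sw)  # rescale back to float
    return out


x = [0.5, -1.0, 2.0]
weight = [[1.0, 0.5, -0.25], [0.0, 2.0, 1.0]]
print(dynamic_linear_s8s8(x, weight))  # close to the float result [-0.5, 0.0]
```

The quantized result differs from the float result only by a small rounding error; the speedup claimed in this PR comes from running the integer dot product in optimized kernels, which this sketch does not attempt to model.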

Also, do you have a policy on copyright headers? Arm's usual policy when contributing to open source projects is to include a copyright header on any file which is modified. Would this be acceptable? If not, is there somewhere else suitable to note copyright?

cc @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10

pytorch-bot · 2024-05-20T15:37:17Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126687

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9469bae with merge base afda668:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

linux-foundation-easycla · 2024-05-20T15:37:18Z

The committers listed above are authorized under a signed CLA.

✅ login: jondea / name: Jonathan Deakin (9469bae)

jgong5

LGTM

jondea requested review from jerryzh168, salilsdesai, kimishpatel, digantdesai and jianyuh as code owners May 20, 2024 15:37

pytorch-bot bot added module: cpu CPU specific problem (e.g., perf, algorithm) release notes: quantization release notes category labels May 20, 2024

pytorchbot added the open source label May 20, 2024

drisspg added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label May 20, 2024

jgong5 approved these changes May 30, 2024
