cpu: aarch64: conv: move f32 jit_sve_1x1 convolution down the list #4380
The jit_sve_1x1 convolution is slower than ACL for f32, so this PR moves the f32 jit_sve_1x1 implementations down the impl list.
This also fixes the regressions in 1x1 convolutions on c8g machines seen in this run:
https://github.com/uxlfoundation/oneDNN/actions/runs/19658857462
No regressions were observed on c7g or c8g machines with 16 threads after this change.
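
For readers unfamiliar with why list order matters, here is a minimal sketch of first-match dispatch over an ordered implementation list. It is not the oneDNN code: `ConvProblem`, `ImplEntry`, `pick_impl`, and the two list builders are hypothetical names made up for illustration, and the applicability checks are simplified assumptions. It also assumes the ACL entry ends up ahead of jit_sve_1x1 after the move.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified convolution problem descriptor.
struct ConvProblem {
    bool is_f32; // true when src/wei/dst are all f32
    int kh, kw;  // kernel height and width
};

// Hypothetical list entry: a name plus a predicate telling whether the
// implementation can handle a given problem.
struct ImplEntry {
    std::string name;
    std::function<bool(const ConvProblem &)> supports;
};

// Assumption: the 1x1 JIT kernel only accepts f32 problems with a 1x1 kernel.
bool supports_f32_1x1(const ConvProblem &p) {
    return p.is_f32 && p.kh == 1 && p.kw == 1;
}

// Assumption: the ACL-backed entry accepts any f32 problem.
bool supports_any_f32(const ConvProblem &p) {
    return p.is_f32;
}

// Walk the list in order and return the first entry that accepts the problem.
// Earlier entries therefore have higher priority.
std::string pick_impl(const std::vector<ImplEntry> &list, const ConvProblem &p) {
    for (const auto &entry : list)
        if (entry.supports(p)) return entry.name;
    return "ref"; // fallback when nothing else applies
}

// Ordering before the change: the 1x1 JIT kernel is tried first.
std::vector<ImplEntry> impl_list_before() {
    return {{"jit_sve_1x1", supports_f32_1x1}, {"acl", supports_any_f32}};
}

// Ordering after the change (assumed): ACL is tried first.
std::vector<ImplEntry> impl_list_after() {
    return {{"acl", supports_any_f32}, {"jit_sve_1x1", supports_f32_1x1}};
}
```

With an f32 1x1 problem such as `ConvProblem{true, 1, 1}`, `pick_impl(impl_list_before(), ...)` selects `jit_sve_1x1`, while `pick_impl(impl_list_after(), ...)` selects `acl`. The same kernels stay available; only the priority changes.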
c7g 16 threads speed changes:

conv --mode=P --max-ms-per-prb=300 --conv ic256ih14oc1024oh14kh1ph0n"resnet50-v1.5:conv10"
conv --mode=P --max-ms-per-prb=300 --conv ic512ih7oc2048oh7kh1ph0n"resnet50-v1.5:conv14"
conv --mode=P --max-ms-per-prb=300 --conv ic72ih56oc80oh56kh1ph0n"generated-tails-conv:1"
conv --mode=P --max-ms-per-prb=300 --conv ic256ih56iw56oc512oh28ow56kh1kw1sh2sw1ph0pw0n"generated-strided-conv:1"
conv --mode=P --max-ms-per-prb=300 --conv ic72ih56iw54oc80oh56ow18kh1kw1sh1sw3ph0pw0n"generated-strided-conv:7"
conv --mode=P --max-ms-per-prb=300 --conv ic140ih28iw28oc128oh14ow28kh1kw1sh2sw1ph0pw0n"generated-strided-conv:9"
conv --mode=P --max-ms-per-prb=300 --conv ic193ih36iw54oc162oh36ow18kh1kw1sh1sw3ph0pw0n"generated-strided-conv:10"
conv --mode=P --max-ms-per-prb=300 --conv ic542ih32iw48oc124oh8ow48kh1kw1sh4sw1ph0pw0n"generated-strided-conv:12"

c8g 16 threads speed changes:

conv --mode=P --max-ms-per-prb=300 --conv ic256ih14oc1024oh14kh1ph0n"resnet50-v1.5:conv10"
conv --mode=P --max-ms-per-prb=300 --conv ic512ih7oc2048oh7kh1ph0n"resnet50-v1.5:conv14"
conv --mode=P --max-ms-per-prb=300 --conv ic72ih56oc80oh56kh1ph0n"generated-tails-conv:1"
conv --mode=P --max-ms-per-prb=300 --conv ic256ih56iw56oc512oh28ow56kh1kw1sh2sw1ph0pw0n"generated-strided-conv:1"
conv --mode=P --max-ms-per-prb=300 --conv ic72ih56iw54oc80oh56ow18kh1kw1sh1sw3ph0pw0n"generated-strided-conv:7"
conv --mode=P --max-ms-per-prb=300 --conv ic140ih28iw28oc128oh14ow28kh1kw1sh2sw1ph0pw0n"generated-strided-conv:9"
conv --mode=P --max-ms-per-prb=300 --conv ic193ih36iw54oc162oh36ow18kh1kw1sh1sw3ph0pw0n"generated-strided-conv:10"
conv --mode=P --max-ms-per-prb=300 --conv ic542ih32iw48oc124oh8ow48kh1kw1sh4sw1ph0pw0n"generated-strided-conv:12"