

author	Tamar Christina <tamar.christina@arm.com>	2025-11-19 14:27:55 +0000
committer	Tamar Christina <tamar.christina@arm.com>	2025-11-19 14:27:55 +0000
commit	3027010d8bcc854eb43425cb1da573ff7345a5ac (patch)
tree	21098af4cc868406a83ef9882f5ca4f48abd45a4 /libjava/java
parent	a3e97daf1f7452d060d2e5e4eb2fea7717343f18 (diff)

AArch64: expand extractions of Adv.SIMD registers from SVE as separate insn.

For this example using the Adv.SIMD/SVE Bridge

    #include <arm_neon.h>
    #include <arm_neon_sve_bridge.h>
    #include <stdint.h>

    svint16_t sub_neon_i16_sve_bridged(svint8_t a, svint8_t b) {
        return svset_neonq_s16(svundef_s16(),
                vsubq_s16(vmovl_high_s8(svget_neonq(a)),
                          vmovl_high_s8(svget_neonq(b))));
    }

we generate:

    sub_neon_i16_sve_bridged(__SVInt8_t, __SVInt8_t):
            sxtl2   v0.8h, v0.16b
            ssubw2  v0.8h, v0.8h, v1.16b
            ret

instead of just

    sub_neon_i16_sve_bridged(__SVInt8_t, __SVInt8_t):
            ssubl2  v0.8h, v0.16b, v1.16b
            ret

Commit g:abf865732a7313cf79ffa325faed3467ed28d8b8 added a framework to fold
uses of intrinsics combined with lo/hi extractions into the appropriate
lowpart or highpart instructions.

However, this doesn't trigger here because the Adv.SIMD from SVE extraction
code for vmovl_high_s8(svget_neonq(a)) does not have one argument as a
constant, and only folding 2 insns into 1 is supported, not 3 into 1.

The above generates the following RTL:

    (insn 7 4 8 2 (set (reg:V8QI 103 [ _6 ])
            (vec_select:V8QI (subreg:V16QI (reg/v:VNx16QI 109 [ a ]) 0)
                (parallel:V16QI [
                        (const_int 8 [0x8])
                        (const_int 9 [0x9])
                        (const_int 10 [0xa])
                        (const_int 11 [0xb])
                        (const_int 12 [0xc])
                        (const_int 13 [0xd])
                        (const_int 14 [0xe])
                        (const_int 15 [0xf])
                    ]))) "":3174:43 -1
         (nil))

Since the SVE and the Adv.SIMD modes are tieable, this is a valid instruction
to make. However, it's suboptimal in that we can't fold it into the existing
instruction patterns. Eventually early-ra will split off the SVE reg from the
patterns, but by then we're past combine and the insn foldings, so we miss all
the optimizations.

This patch introduces vec_extract optabs for 128-bit and 64-bit Adv.SIMD
vector extraction from SVE registers and emits an explicit separate
instruction for the subregs. This gives combine and RTL folding the
opportunity to form the combined instructions, and if they don't, we arrive
at the same RTL after early-ra.

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (vec_extract<mode><v128>,
	vec_extract<mode><v64>): New.
	* config/aarch64/iterators.md (V64, v64): New.
	* config/aarch64/predicates.md (const0_to_1_operand): New.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/simd/fold_to_highpart_6.c: Update codegen.
	* gcc.target/aarch64/sve/fold_to_highpart_1.c: New test.
	* gcc.target/aarch64/sve/fold_to_highpart_2.c: New test.
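
The equivalence that the fold relies on can be checked with a portable scalar
model. The sketch below is not part of the patch and does not use the real
intrinsics: the model_* and check_sequences_agree names are hypothetical
helpers that only mimic, lane by lane, what sxtl2, ssubw2 and ssubl2 compute
on the high half of a 128-bit vector. It illustrates that the two-instruction
sequence and the single ssubl2 give the same result.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of sxtl2: sign-extend the high 8 bytes
   of a 16-byte vector to eight 16-bit lanes.  */
static void model_sxtl2(int16_t out[8], const int8_t in[16])
{
    for (int i = 0; i < 8; i++)
        out[i] = (int16_t)in[8 + i];
}

/* Hypothetical scalar model of ssubw2: subtract the sign-extended
   high 8 bytes of NARROW from the 16-bit lanes of WIDE.  */
static void model_ssubw2(int16_t out[8], const int16_t wide[8],
                         const int8_t narrow[16])
{
    for (int i = 0; i < 8; i++)
        out[i] = (int16_t)(wide[i] - narrow[8 + i]);
}

/* Hypothetical scalar model of ssubl2: widen the high 8 bytes of
   both inputs and subtract them in one step.  */
static void model_ssubl2(int16_t out[8], const int8_t a[16],
                         const int8_t b[16])
{
    for (int i = 0; i < 8; i++)
        out[i] = (int16_t)(a[8 + i] - b[8 + i]);
}

/* Return 1 if sxtl2 + ssubw2 matches ssubl2 for these inputs.  */
static int sequences_agree(const int8_t a[16], const int8_t b[16])
{
    int16_t widened[8], two_insn[8], one_insn[8];

    model_sxtl2(widened, a);
    model_ssubw2(two_insn, widened, b);
    model_ssubl2(one_insn, a, b);

    for (int i = 0; i < 8; i++)
        if (two_insn[i] != one_insn[i])
            return 0;
    return 1;
}

/* Convenience wrapper that aborts if the two sequences disagree.  */
static void check_sequences_agree(const int8_t a[16], const int8_t b[16])
{
    assert(sequences_agree(a, b));
}
```

Because every int8_t difference fits in int16_t, the widening subtract cannot
overflow, so the model needs no saturation handling.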

Diffstat (limited to 'libjava/java')

0 files changed, 0 insertions, 0 deletions

