aarch64: add Multi-vector 8-bit floating-point multiply-add long
author Claudio Bantaloukas <claudio.bantaloukas@arm.com>
Wed, 24 Dec 2025 11:41:26 +0000 (11:41 +0000)
committer Claudio Bantaloukas <claudio.bantaloukas@arm.com>
Wed, 24 Dec 2025 11:49:26 +0000 (11:49 +0000)
commit 8da567fce3e3f89c63098280cb376f980f206906
tree e8667fa2ac11e15a38d934c8560916bae616bf09
parent 954a53dff6b128f0d351fa8d9f2f676acc2467da

This patch adds support for the following intrinsics when sme-f8f16 is enabled:
  * svmla_lane_za16[_mf8]_vg2x1_fpm
  * svmla_lane_za16[_mf8]_vg2x2_fpm
  * svmla_lane_za16[_mf8]_vg2x4_fpm
  * svmla_za16[_mf8]_vg2x1_fpm
  * svmla[_single]_za16[_mf8]_vg2x2_fpm
  * svmla[_single]_za16[_mf8]_vg2x4_fpm
  * svmla_za16[_mf8]_vg2x2_fpm
  * svmla_za16[_mf8]_vg2x4_fpm

It also adds support for the following intrinsics when sme-f8f32 is enabled:
  * svmla_lane_za32[_mf8]_vg4x1_fpm
  * svmla_lane_za32[_mf8]_vg4x2_fpm
  * svmla_lane_za32[_mf8]_vg4x4_fpm
  * svmla_za32[_mf8]_vg4x1_fpm
  * svmla[_single]_za32[_mf8]_vg4x2_fpm
  * svmla[_single]_za32[_mf8]_vg4x4_fpm
  * svmla_za32[_mf8]_vg4x2_fpm
  * svmla_za32[_mf8]_vg4x4_fpm

Asm tests for the 32-bit versions follow the blueprint set in
mla_lane_za32_u8_vg4x1.c, mla_za32_u8_vg4x1.c and similar tests.
The 16-bit versions follow the same patterns, apart from differences in the
allowed offsets.

gcc:
* config/aarch64/aarch64-sme.md
(@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x24:mode>): Add
new define_insn.
(*aarch64_sme_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
*aarch64_sme_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
@aarch64_sme_<optab><SME_ZA_F8F16_32:mode><VNx16QI_ONLY:mode>,
*aarch64_sme_<optab><VNx8HI_ONLY:mode><VNx16QI_ONLY:mode>_plus,
*aarch64_sme_<optab><VNx4SI_ONLY:mode><VNx16QI_ONLY:mode>_plus,
@aarch64_sme_single_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x24:mode>,
*aarch64_sme_single_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
*aarch64_sme_single_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x24:mode>_plus,
@aarch64_sme_lane_<optab><SME_ZA_F8F16_32:mode><SME_ZA_FP8_x124:mode>,
*aarch64_sme_lane_<optab><VNx8HI_ONLY:mode><SME_ZA_FP8_x124:mode>,
*aarch64_sme_lane_<optab><VNx4SI_ONLY:mode><SME_ZA_FP8_x124:mode>):
Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc
(struct binary_za_slice_lane_base): Support fpm argument.
(struct binary_za_slice_opt_single_base): Likewise.
* config/aarch64/aarch64-sve-builtins-sme.cc (svmla_za): Extend for fp8.
(svmla_lane_za): Likewise.
* config/aarch64/aarch64-sve-builtins-sme.def (svmla_lane): Add new
DEF_SME_ZA_FUNCTION_GS_FPM entries.
(svmla): Likewise.
* config/aarch64/iterators.md (SME_ZA_F8F16_32): Add new mode iterator.
(SME_ZA_FP8_x24, SME_ZA_FP8_x124): Likewise.
(UNSPEC_SME_FMLAL): Add new unspec.
(za16_offset_range): Add new mode_attr.
(za16_32_long): Likewise.
(za16_32_last_offset): Likewise.
(SME_FP8_TERNARY_SLICE): Add new iterator.
(optab): Add entry for UNSPEC_SME_FMLAL.

gcc/testsuite:

* gcc.target/aarch64/sme2/acle-asm/test_sme2_acle.h (TEST_ZA_X1,
TEST_ZA_XN, TEST_ZA_SINGLE, TEST_ZA_SINGLE_Z15, TEST_ZA_LANE,
TEST_ZA_LANE_Z15): Add fpm0 parameter.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_1.c: Add
tests for variants accepting fpm.
* gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_1.c:
Likewise.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x4.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x1.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x2.c: New test.
* gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x4.c: New test.
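
The twelve new test files listed above follow a regular naming pattern: an
optional "lane_" part, the ZA element width (za16 or za32), the mf8 type
suffix, and a vgNxM part in which za16 pairs with vg2 and za32 with vg4.
As a minimal sketch of that pattern, the hypothetical helper below rebuilds
a file name from those three choices. The function name and its parameters
are assumptions made for illustration only; nothing like it is part of this
patch or of GCC.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of GCC: write the name of one of the new
   asm tests into BUF, e.g. "mla_lane_za16_mf8_vg2x1.c".
   LANE is nonzero for the svmla_lane variants, ZA_BITS is 16 or 32 and
   NVECTORS is 1, 2 or 4.  Returns the snprintf result, or -1 for a
   combination that does not appear in the list of new tests.  */
int
mla_mf8_test_name (char *buf, size_t size, int lane, int za_bits,
                   int nvectors)
{
  assert (buf != NULL && size > 0);

  if (za_bits != 16 && za_bits != 32)
    return -1;
  if (nvectors != 1 && nvectors != 2 && nvectors != 4)
    return -1;

  /* In the listed names, za16 pairs with vg2 and za32 with vg4.  */
  int group = za_bits / 8;

  return snprintf (buf, size, "mla_%sza%d_mf8_vg%dx%d.c",
                   lane ? "lane_" : "", za_bits, group, nvectors);
}
```

Calling mla_mf8_test_name (buf, sizeof buf, 1, 16, 1) fills buf with
"mla_lane_za16_mf8_vg2x1.c", and passing 0, 32, 4 as the last three
arguments gives "mla_za32_mf8_vg4x4.c".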
20 files changed:
gcc/config/aarch64/aarch64-sme.md
gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
gcc/config/aarch64/aarch64-sve-builtins-sme.cc
gcc/config/aarch64/aarch64-sve-builtins-sme.def
gcc/config/aarch64/iterators.md
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za16_mf8_vg2x4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_lane_za32_mf8_vg4x4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za16_mf8_vg2x4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x1.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x2.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/mla_za32_mf8_vg4x4.c [new file with mode: 0644]
gcc/testsuite/gcc.target/aarch64/sme2/acle-asm/test_sme2_acle.h
gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_slice_lane_1.c
gcc/testsuite/gcc.target/aarch64/sve/acle/general-c/binary_za_slice_opt_single_1.c