[dev.simd] cmd/compile: zero only low 128-bit of X15
author Cherry Mui <cherryyz@google.com>
Mon, 8 Dec 2025 17:14:24 +0000 (12:14 -0500)
committer Cherry Mui <cherryyz@google.com>
Mon, 8 Dec 2025 22:10:09 +0000 (14:10 -0800)
commit f38e968abafde345fa470cb14d55b6f092af569f
tree 6534497ff523ce1abc3faa77ed6819a3fc2b31ec
parent 144cf17d2c444a530d7c08c5870dc8e70bec2c72

Zeroing the upper part of X15 may make the CPU think the register
is "dirty" and slow down SSE operations. For now, do not zero the
upper part, and construct a zero value on the fly when a 256- or
512-bit zero value is needed. VZEROUPPER may work better than
explicitly zeroing X15, but that needs to be evaluated.
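
As a rough illustration of the approach described above (not code from this change), the following toy model treats a 512-bit vector register as eight 64-bit lanes. The names vecReg, zeroLow128, and zeroOnTheFly are hypothetical and exist only for this sketch; the real change is in the compiler back end and the runtime assembly files listed below.

```go
package main

import "fmt"

// vecReg is a toy model of a 512-bit vector register as eight 64-bit lanes.
// It is illustrative only and is not a type from the Go compiler or runtime.
type vecReg [8]uint64

// zeroLow128 clears only lanes 0 and 1 (the low 128 bits) and leaves the
// upper lanes untouched, modeling "zero only low 128-bit of X15".
func zeroLow128(r *vecReg) {
	r[0], r[1] = 0, 0
}

// zeroOnTheFly builds a fresh zero value of the requested width in bits
// instead of relying on the upper part of the register being zero.
func zeroOnTheFly(bits int) []uint64 {
	return make([]uint64, bits/64)
}

func main() {
	x15 := vecReg{1, 2, 3, 4, 5, 6, 7, 8}
	zeroLow128(&x15)
	fmt.Println(x15)               // low two lanes are zero, the rest unchanged
	fmt.Println(zeroOnTheFly(256)) // four zero lanes, built on demand
}
```

The point of the model is that code needing a wider zero never reads the upper lanes of the register, so their contents do not matter.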

Long term, we probably want to move more things from SSE to AVX.

This essentially undoes CL 698237 and CL 698238, except that X15
is still used for 128-bit zeroing for SIMD.

Change-Id: I1564e6332c4c57f9721397c92c7c734c5497534c
Reviewed-on: https://go-review.googlesource.com/c/go/+/728240
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: David Chase <drchase@google.com>
12 files changed:
src/cmd/compile/internal/amd64/ssa.go
src/cmd/compile/internal/ssa/_gen/AMD64Ops.go
src/cmd/compile/internal/ssa/opGen.go
src/runtime/asm_amd64.s
src/runtime/race_amd64.s
src/runtime/sys_darwin_amd64.s
src/runtime/sys_dragonfly_amd64.s
src/runtime/sys_freebsd_amd64.s
src/runtime/sys_linux_amd64.s
src/runtime/sys_netbsd_amd64.s
src/runtime/sys_openbsd_amd64.s
src/runtime/sys_windows_amd64.s