Optimise AdvSIMD acoshf by vectorising the special case.
For values greater than 0x1p64, scale the input down first.
This is because the output will overflow with inputs greater than
or equal to this value as there is a squaring operation in the
algorithm.
To scale, do:
2acosh(sqrt[(x+1)/2])
Because:
acosh(x) = 1/2acosh(2x^2 - 1) for x>=1.
Apply opposite operations in opposite order for x, and you get:
acosh(x) = 2acosh(sqrt[(x+1)/2]).
R.Throughput difference on V2 with GCC@15:
30-49% improvement in special cases.
2% regression in fast pass.