VBROADCAST—Broadcast Floating-Point Data

Opcode/Instruction Op/En 64/32-bit Mode CPUID Feature Flag Description

VEX.128.66.0F38.W0 18 /r

VBROADCASTSS xmm1, m32

RM V/V AVX Broadcast single-precision floating-point element in mem to four locations in xmm1.

VEX.256.66.0F38.W0 18 /r

VBROADCASTSS ymm1, m32

RM V/V AVX Broadcast single-precision floating-point element in mem to eight locations in ymm1.

VEX.256.66.0F38.W0 19 /r

VBROADCASTSD ymm1, m64

RM V/V AVX Broadcast double-precision floating-point element in mem to four locations in ymm1.

VEX.256.66.0F38.W0 1A /r

VBROADCASTF128 ymm1, m128

RM V/V AVX Broadcast 128 bits of floating-point data in mem to low and high 128-bits in ymm1.

VEX.128.66.0F38.W0 18 /r

VBROADCASTSS xmm1, xmm2

RM V/V AVX2 Broadcast the low single-precision floating-point element in the source operand to four locations in xmm1.

VEX.256.66.0F38.W0 18 /r

VBROADCASTSS ymm1, xmm2

RM V/V AVX2 Broadcast low single-precision floating-point element in the source operand to eight locations in ymm1.

VEX.256.66.0F38.W0 19 /r

VBROADCASTSD ymm1, xmm2

RM V/V AVX2 Broadcast low double-precision floating-point element in the source operand to four locations in ymm1.

Instruction Operand Encoding

Op/En Operand 1 Operand 2 Operand 3 Operand 4
RM ModRM:reg (w) ModRM:r/m (r) NA NA
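
As an illustration of how the opcode column and the RM operand encoding combine, the following C sketch assembles the bytes of a memory-source form. It is a simplified helper, not an assembler. The function name encode_vbroadcast_mem is hypothetical, and the sketch assumes the three-byte VEX prefix, a destination register numbered 0 to 7, and a plain [base] addressing mode whose base register needs neither a SIB byte nor a displacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: emit "VBROADCASTxx dest, [base]" into out[5].
 * opcode : 0x18 (VBROADCASTSS), 0x19 (VBROADCASTSD) or 0x1A (VBROADCASTF128)
 * vex_l  : 0 for the 128-bit form, 1 for the 256-bit form
 * dest   : destination register number, 0..7 (placed in ModRM:reg)
 * base   : base register number, 0..7, excluding 4 (needs SIB) and 5 (needs disp)
 * Returns the number of bytes written. */
size_t encode_vbroadcast_mem(uint8_t out[5], uint8_t opcode, unsigned vex_l,
                             unsigned dest, unsigned base)
{
    assert(opcode == 0x18 || opcode == 0x19 || opcode == 0x1A);
    assert(vex_l <= 1 && dest < 8 && base < 8 && base != 4 && base != 5);

    out[0] = 0xC4;                          /* three-byte VEX escape */
    out[1] = 0xE0 | 0x02;                   /* R=X=B=0 (stored inverted), map 0F38 */
    out[2] = (uint8_t)((0u << 7)            /* W = 0 */
                     | (0xFu << 3)          /* vvvv = 1111b (no second source) */
                     | (vex_l << 2)         /* L selects 128-bit or 256-bit */
                     | 0x1u);               /* pp = 01b, the 66 prefix */
    out[3] = opcode;
    out[4] = (uint8_t)((dest << 3) | base); /* ModRM: mod=00, reg=dest, r/m=base */
    return 5;
}
```

Under these assumptions, VBROADCASTSS ymm0, [rax] (dest = 0, base = 0, vex_l = 1) comes out as C4 E2 7D 18 00.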

Description

Load floating-point values from the source operand (second operand) and broadcast to all elements of the destination operand (first operand).

VBROADCASTSD and VBROADCASTF128 are only supported as 256-bit wide versions. VBROADCASTSS is supported in both 128-bit and 256-bit wide versions.

Memory and register source operand syntax support of 256-bit instructions depends on the processor’s enumeration of CPUID.1:ECX.AVX[bit 28] and CPUID.(EAX=07H, ECX=0H):EBX.AVX2[bit 5]. As the CPUID Feature Flag column of the opcode table shows, the memory-source forms require AVX and the register-source forms require AVX2.
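
The feature-flag column of the opcode table suggests a check along the following lines. This is a sketch that only decodes the two CPUID bits named above from register values the caller has already obtained by executing CPUID. The function names are hypothetical, and a complete AVX check would also confirm operating-system support for YMM state, which this sketch omits.

```c
#include <stdbool.h>
#include <stdint.h>

/* leaf1_ecx: ECX returned by CPUID with EAX=01H.
 * leaf7_ebx: EBX returned by CPUID with EAX=07H, ECX=0H. */

/* Memory-source forms (m32/m64/m128) are listed under the AVX flag. */
bool vbroadcast_mem_forms_enumerated(uint32_t leaf1_ecx)
{
    return ((leaf1_ecx >> 28) & 1u) != 0;        /* CPUID.1:ECX.AVX[bit 28] */
}

/* Register-source forms (xmm2 source) are listed under the AVX2 flag. */
bool vbroadcast_reg_forms_enumerated(uint32_t leaf1_ecx, uint32_t leaf7_ebx)
{
    bool avx  = ((leaf1_ecx >> 28) & 1u) != 0;
    bool avx2 = ((leaf7_ebx >> 5) & 1u) != 0;    /* CPUID.(EAX=07H,ECX=0H):EBX.AVX2[bit 5] */
    return avx && avx2;
}
```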

Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b; otherwise the instruction will #UD. An attempt to execute VBROADCASTSD or VBROADCASTF128 encoded with VEX.L = 0 will cause a #UD exception. An attempt to execute any VBROADCAST* instruction with VEX.W = 1 will also cause #UD.
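
The three rules in the note above can be expressed as a small predicate. The following C sketch is illustrative only: the function name is hypothetical, the fields are assumed to have been extracted from the VEX prefix already (with vex_vvvv holding the raw 4-bit field as encoded), and no other source of #UD is modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoder-side check: returns true if the given VEX fields
 * would raise #UD for opcode 18H, 19H or 1AH under the rules in the note. */
bool vbroadcast_vex_fields_ud(uint8_t opcode, unsigned vex_l, unsigned vex_w,
                              unsigned vex_vvvv)
{
    if (vex_vvvv != 0xF)      /* vvvv is reserved and must be 1111b */
        return true;
    if (vex_w == 1)           /* W = 1 is invalid for every VBROADCAST* form */
        return true;
    if (vex_l == 0 && (opcode == 0x19 || opcode == 0x1A))
        return true;          /* VBROADCASTSD and VBROADCASTF128 are 256-bit only */
    return false;
}
```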

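[Figure: the m32 source element X0 is copied to all eight doubleword positions of DEST.]

Figure 4-27. VBROADCASTSS Operation (VEX.256 encoded version)

[Figure: the m32 source element X0 is copied to the four low doubleword positions of DEST; the upper bits are zeroed.]

Figure 4-28. VBROADCASTSS Operation (128-bit version)

[Figure: the m64 source element X0 is copied to all four quadword positions of DEST.]

Figure 4-29. VBROADCASTSD Operation

[Figure: the m128 source block X0 is copied to the low and high 128 bits of DEST.]

Figure 4-30. VBROADCASTF128 Operation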

Operation

VBROADCASTSS (128 bit version)

temp ← SRC[31:0]
DEST[31:0] ← temp
DEST[63:32] ← temp
DEST[95:64] ← temp
DEST[127:96] ← temp
DEST[VLMAX-1:128] ← 0

VBROADCASTSS (VEX.256 encoded version)

temp ← SRC[31:0]
DEST[31:0] ← temp
DEST[63:32] ← temp
DEST[95:64] ← temp
DEST[127:96] ← temp
DEST[159:128] ← temp
DEST[191:160] ← temp
DEST[223:192] ← temp
DEST[255:224] ← temp
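
The two VBROADCASTSS listings above can be mirrored by a portable C model. This is a sketch for reasoning about the result, not a description of the hardware: the function name is hypothetical, the register is represented as a 32-byte array, and VLMAX is assumed to be 256 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VLMAX_BYTES 32   /* assumption: VLMAX = 256 bits */

/* Model of VBROADCASTSS. dest is the full 256-bit register image and
 * vl_bits is 128 or 256. The element is copied bitwise, so NaN payloads
 * and signed zeros pass through unchanged. */
void model_vbroadcastss(uint8_t dest[VLMAX_BYTES], const void *src, unsigned vl_bits)
{
    assert(vl_bits == 128 || vl_bits == 256);

    uint8_t temp[4];
    memcpy(temp, src, 4);                        /* temp <- SRC[31:0] */
    for (unsigned i = 0; i < vl_bits / 32; i++)
        memcpy(dest + 4 * i, temp, 4);           /* DEST[32i+31:32i] <- temp */
    if (vl_bits == 128)
        memset(dest + 16, 0, VLMAX_BYTES - 16);  /* DEST[VLMAX-1:128] <- 0 */
}
```

Reading the source into temp before writing mirrors the pseudocode and keeps the model correct even if src points into dest.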

VBROADCASTSD (VEX.256 encoded version)

temp ← SRC[63:0]
DEST[63:0] ← temp
DEST[127:64] ← temp
DEST[191:128] ← temp
DEST[255:192] ← temp

VBROADCASTF128

temp ← SRC[127:0]
DEST[127:0] ← temp
DEST[VLMAX-1:128] ← temp
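
The VBROADCASTSD and VBROADCASTF128 listings can be modeled the same way. The following C sketch uses hypothetical function names and represents the 256-bit destination as a 32-byte array; it assumes VLMAX is 256 bits, so the upper range in the VBROADCASTF128 listing is exactly the high 128 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Model of the 256-bit VBROADCASTSD: four copies of a 64-bit element. */
void model_vbroadcastsd(uint8_t dest[32], const void *src)
{
    assert(dest != NULL && src != NULL);
    uint8_t temp[8];
    memcpy(temp, src, 8);                /* temp <- SRC[63:0] */
    for (unsigned i = 0; i < 4; i++)
        memcpy(dest + 8 * i, temp, 8);   /* DEST[64i+63:64i] <- temp */
}

/* Model of VBROADCASTF128: one 128-bit block copied to both halves. */
void model_vbroadcastf128(uint8_t dest[32], const void *src)
{
    assert(dest != NULL && src != NULL);
    uint8_t temp[16];
    memcpy(temp, src, 16);               /* temp <- SRC[127:0] */
    memcpy(dest, temp, 16);              /* DEST[127:0] <- temp */
    memcpy(dest + 16, temp, 16);         /* DEST[255:128] <- temp */
}
```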

Intel C/C++ Compiler Intrinsic Equivalent

VBROADCASTSS:

__m128 _mm_broadcast_ss(float *a);

VBROADCASTSS:

__m256 _mm256_broadcast_ss(float *a);

VBROADCASTSD:

__m256d _mm256_broadcast_sd(double *a);

VBROADCASTF128:

__m256 _mm256_broadcast_ps(__m128 * a);

VBROADCASTF128:

__m256d _mm256_broadcast_pd(__m128d * a);
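
As a usage illustration for the intrinsics listed above, the following C sketch fills an eight-element array with copies of one float. The helper name splat8 is hypothetical. The sketch assumes a compiler that provides immintrin.h and defines __AVX__ when AVX code generation is enabled; when that macro is absent it falls back to a scalar loop, so the AVX path is only one of the two possible builds.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper: fill out[0..7] with copies of *p. */
void splat8(const float *p, float out[8])
{
    assert(p != NULL && out != NULL);
#if defined(__AVX__)
    __m256 v = _mm256_broadcast_ss(p);   /* maps to the VBROADCASTSS ymm, m32 form */
    _mm256_storeu_ps(out, v);            /* unaligned 256-bit store */
#else
    for (int i = 0; i < 8; i++)          /* portable fallback without AVX */
        out[i] = *p;
#endif
}
```

The compiler is free to select a different instruction sequence for the intrinsic, so the generated code should be inspected if the exact encoding matters.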

Flags Affected

None.

Other Exceptions

See Exceptions Type 6; additionally

#UD

If VEX.L = 0 for VBROADCASTSD,

If VEX.L = 0 for VBROADCASTF128,

If VEX.W = 1.