Commit 04db0fb

Rollup merge of #151346 - folkertdev:simd-splat, r=workingjubilee
add `simd_splat` intrinsic

Add `simd_splat` which lowers to the LLVM canonical splat sequence.

```llvm
insertelement <N x elem> poison, elem %x, i32 0
shufflevector <N x elem> v0, <N x elem> poison, <N x i32> zeroinitializer
```

Right now we try to fake it using one of

```rust
fn splat(x: u32) -> u32x8 {
    u32x8::from_array([x; 8])
}
```

or (in `stdarch`)

```rust
fn splat(value: $elem_type) -> $name {
    #[derive(Copy, Clone)]
    #[repr(simd)]
    struct JustOne([$elem_type; 1]);
    let one = JustOne([value]);
    // SAFETY: 0 is always in-bounds because we're shuffling
    // a simd type with exactly one element.
    unsafe { simd_shuffle!(one, one, [0; $len]) }
}
```

Both of these can confuse the LLVM optimizer, producing sub-par code. Some examples:

- rust-lang/rust#60637
- rust-lang/rust#137407
- rust-lang/rust#122623
- rust-lang/rust#97804

---

As far as I can tell there is no way to provide a fallback implementation for this intrinsic, because there is no `const` way of evaluating the number of elements (there might be issues beyond that, too). So, I added implementations for all 4 backends.

Both GCC and const-eval appear to have some issues with simd vectors containing pointers. I have a workaround for GCC, but haven't yet been able to make const-eval work. See the comments below.

Currently this just adds the intrinsic, it does not actually use it anywhere yet.
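To make the two-instruction sequence above concrete, here is a minimal scalar sketch in plain Rust that models it on fixed-size arrays. It is only an illustration and not the compiler's or LLVM's implementation: the helper names `insert_lane0`, `shuffle`, and `splat_model` are hypothetical, and the `filler` parameter stands in for LLVM's `poison` lanes, which have no direct equivalent in safe Rust.

```rust
/// Model of `insertelement ... i32 0`: put `x` in lane 0.
/// `filler` stands in for the poison lanes (an assumption of this sketch).
fn insert_lane0<T: Copy, const N: usize>(x: T, filler: T) -> [T; N] {
    let mut v = [filler; N];
    if let Some(first) = v.first_mut() {
        *first = x;
    }
    v
}

/// Model of a single-source `shufflevector`: result[i] = src[mask[i]].
fn shuffle<T: Copy, const N: usize>(src: [T; N], mask: [usize; N]) -> [T; N] {
    core::array::from_fn(|i| src[mask[i]])
}

/// Splat = insert into lane 0, then shuffle with an all-zero mask
/// (the `zeroinitializer` mask in the LLVM sequence).
fn splat_model<T: Copy, const N: usize>(x: T, filler: T) -> [T; N] {
    let v0 = insert_lane0::<T, N>(x, filler);
    shuffle(v0, [0; N])
}
```

Because every mask index is 0, only lane 0 of the intermediate vector is ever read, so the value chosen for the other lanes does not affect the result.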
2 parents fe61cd6 + ef9002c commit 04db0fb

1 file changed

Lines changed: 25 additions & 0 deletions

src/intrinsics/simd.rs

@@ -348,6 +348,31 @@ pub(super) fn codegen_simd_intrinsic_call<'tcx>(
             ret.write_cvalue(fx, ret_lane);
         }
 
+        sym::simd_splat => {
+            intrinsic_args!(fx, args => (value); intrinsic);
+
+            if !ret.layout().ty.is_simd() {
+                report_simd_type_validation_error(fx, intrinsic, span, ret.layout().ty);
+                return;
+            }
+            let (lane_count, lane_ty) = ret.layout().ty.simd_size_and_type(fx.tcx);
+
+            if value.layout().ty != lane_ty {
+                fx.tcx.dcx().span_fatal(
+                    span,
+                    format!(
+                        "[simd_splat] expected element type {lane_ty:?}, got {got:?}",
+                        got = value.layout().ty
+                    ),
+                );
+            }
+
+            for i in 0..lane_count {
+                let ret_lane = ret.place_lane(fx, i.into());
+                ret_lane.write_cvalue(fx, value);
+            }
+        }
+
         sym::simd_neg
         | sym::simd_bswap
         | sym::simd_bitreverse
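
The new match arm validates the return type, checks that the scalar's type matches the lane type, and then writes the value into each lane in turn. A rough standalone model of that control flow might look like the following. The `Ty` and `Scalar` enums and the `splat_per_lane` function are hypothetical stand-ins invented for this sketch and are not rustc types; both failure paths are modelled as `Err` here, whereas the real code reports a validation error in one case and calls `span_fatal` in the other.

```rust
/// Hypothetical stand-in for a type in the compiler's type system.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    U32,
    F32,
    Simd { lanes: usize, elem: Box<Ty> },
}

/// Hypothetical stand-in for a scalar value being splatted.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scalar {
    U32(u32),
    F32(f32),
}

impl Scalar {
    fn ty(&self) -> Ty {
        match self {
            Scalar::U32(_) => Ty::U32,
            Scalar::F32(_) => Ty::F32,
        }
    }
}

/// Mirrors the three steps of the match arm: return type must be SIMD,
/// the value type must equal the lane type, then every lane gets the value.
fn splat_per_lane(ret_ty: &Ty, value: Scalar) -> Result<Vec<Scalar>, String> {
    let Ty::Simd { lanes, elem } = ret_ty else {
        return Err(format!("return type {ret_ty:?} is not a SIMD type"));
    };
    if value.ty() != **elem {
        return Err(format!(
            "[simd_splat] expected element type {:?}, got {:?}",
            elem,
            value.ty()
        ));
    }
    let mut ret = Vec::with_capacity(*lanes);
    for _lane in 0..*lanes {
        // One write per lane, like `ret_lane.write_cvalue(fx, value)`.
        ret.push(value);
    }
    Ok(ret)
}
```

In this model, splatting `Scalar::U32(9)` into a four-lane `u32` vector type yields four copies of the value, while passing an `f32` scalar or a non-SIMD return type produces an error.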
