use crate::simd::{
    LaneCount, Mask, MaskElement, SupportedLaneCount, Swizzle,
    cmp::SimdPartialOrd,
    num::SimdUint,
    ptr::{SimdConstPtr, SimdMutPtr},
};

/// A SIMD vector with the shape of `[T; N]` but the operations of `T`.
///
/// `Simd<T, N>` supports the operators (+, *, etc.) that `T` does in "elementwise" fashion.
/// These take the element at each index from the left-hand side and right-hand side,
/// perform the operation, then return the result in the same index in a vector of equal size.
/// However, `Simd` differs from normal iteration and normal arrays:
/// - `Simd<T, N>` executes `N` operations in a single step with no `break`s
/// - `Simd<T, N>` can have an alignment greater than `T`, for better mechanical sympathy
///
/// By always imposing these constraints on `Simd`, it is easier to compile elementwise operations
/// into machine instructions that can themselves be executed in parallel.
///
/// ```rust
/// # #![feature(portable_simd)]
/// # use core::simd::{Simd};
/// # use core::array;
/// let a: [i32; 4] = [-2, 0, 2, 4];
/// let b = [10, 9, 8, 7];
/// let sum = array::from_fn(|i| a[i] + b[i]);
/// let prod = array::from_fn(|i| a[i] * b[i]);
///
/// // `Simd<T, N>` implements `From<[T; N]>`
/// let (v, w) = (Simd::from(a), Simd::from(b));
/// // Which means arrays implement `Into<Simd<T, N>>`.
/// assert_eq!(v + w, sum.into());
/// assert_eq!(v * w, prod.into());
/// ```
///
/// `Simd` with integer elements treats operators as wrapping, as if `T` were [`Wrapping<T>`].
/// Thus, `Simd` does not implement `wrapping_add`, because that is the default behavior.
/// This means there is no warning on overflows, even in "debug" builds.
/// For most applications where `Simd` is appropriate, it is "not a bug" to wrap,
/// and even "debug builds" are unlikely to tolerate the loss of performance.
/// You may want to consider using explicitly checked arithmetic if such is required.
/// Division by zero on integers still causes a panic, so
/// you may want to consider using `f32` or `f64` if that is unacceptable.
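///
/// A minimal sketch of the wrapping behavior described above, assuming `u8` lanes
/// (the lane type and values are chosen here only for illustration):
///
/// ```rust
/// # #![feature(portable_simd)]
/// # use core::simd::Simd;
/// let v = Simd::<u8, 4>::from_array([255, 1, 128, 0]);
/// let w = Simd::<u8, 4>::splat(1);
/// // The first lane wraps from 255 to 0 instead of panicking, even in debug builds.
/// assert_eq!(v + w, Simd::from_array([0, 2, 129, 1]));
/// ```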
///
/// [`Wrapping<T>`]: core::num::Wrapping
///
/// # Layout
/// `Simd<T, N>` has a layout similar to `[T; N]` (identical "shapes"), with a greater alignment.
/// `[T; N]` is aligned to `T`, but `Simd<T, N>` will have an alignment based on both `T` and `N`.
/// Thus it is sound to [`transmute`] `Simd<T, N>` to `[T; N]` and should optimize to "zero cost",
/// but the reverse transmutation may require a copy the compiler cannot simply elide.
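///
/// A short sketch of these layout claims, assuming a four-lane `f32` vector
/// (an example shape chosen here, for which no padding is expected):
///
/// ```rust
/// # #![feature(portable_simd)]
/// # use core::simd::Simd;
/// use core::mem::{align_of, size_of, transmute};
///
/// // Same "shape" as the array, with an alignment that is at least as strict.
/// assert_eq!(size_of::<Simd<f32, 4>>(), size_of::<[f32; 4]>());
/// assert!(align_of::<Simd<f32, 4>>() >= align_of::<[f32; 4]>());
///
/// let v = Simd::<f32, 4>::from_array([1.0, 2.0, 3.0, 4.0]);
/// // SAFETY: both types hold four `f32`s and have the same size.
/// let a: [f32; 4] = unsafe { transmute(v) };
/// assert_eq!(a, [1.0, 2.0, 3.0, 4.0]);
/// ```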
///
/// # ABI "Features"
/// Due to Rust's safety guarantees, `Simd<T, N>` is currently passed and returned via memory,
/// not SIMD registers, except as an optimization. Using `#[inline]` on functions that accept
/// `Simd<T, N>` or return it is recommended, at the cost of code generation time, as
/// inlining SIMD-using functions can omit a large function prolog or epilog and thus
/// improve both speed and code size. The need for this may be corrected in the future.
///
/// Using `#[inline(always)]` still requires additional care.
///
/// # Safe SIMD with Unsafe Rust
///
/// Operations with `Simd` are typically safe, but there are many reasons to want to combine SIMD with `unsafe` code.
/// Care must be taken to respect differences between `Simd` and other types it may be transformed into or derived from.
/// In particular, the layout of `Simd<T, N>` may be similar to `[T; N]`, and may allow some transmutations,
/// but references to `[T; N]` are not interchangeable with those to `Simd<T, N>`.
/// Thus, when using `unsafe` Rust to read and write `Simd<T, N>` through [raw pointers], it is a good idea to first try with
/// [`read_unaligned`] and [`write_unaligned`]. This is because:
/// - [`read`] and [`write`] require full alignment (in this case, `Simd<T, N>`'s alignment)
/// - `Simd<T, N>` is often read from or written to [`[T]`](slice) and other types aligned to `T`
/// - combining these actions violates the `unsafe` contract and explodes the program into
///   a puff of **undefined behavior**
/// - the compiler can implicitly adjust layouts to make unaligned reads or writes fully aligned
///   if it sees the optimization
/// - most contemporary processors with "aligned" and "unaligned" read and write instructions
///   exhibit no performance difference if the "unaligned" variant is aligned at runtime
///
/// Fewer obligations mean unaligned reads and writes are less likely to make the program unsound,
/// and may be just as fast as stricter alternatives.
/// When trying to guarantee alignment, [`[T]::as_simd`][as_simd] is an option for
/// converting `[T]` to `[Simd<T, N>]`, and allows soundly operating on an aligned SIMD body,
/// but it may cost more time when handling the scalar head and tail.
/// If these are not enough, it is ideal to design data structures to be already aligned
/// to `align_of::<Simd<T, N>>()` before using `unsafe` Rust to read or write.
/// Other ways to compensate for these facts, like materializing `Simd` to or from an array first,
/// are handled by safe methods like [`Simd::from_array`] and [`Simd::from_slice`].
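///
/// A minimal sketch of the unaligned approach, assuming a plain `[f32; 5]` source
/// (the array and the offset are illustrative only):
///
/// ```rust
/// # #![feature(portable_simd)]
/// # use core::simd::Simd;
/// let data = [1.0_f32, 2.0, 3.0, 4.0, 5.0];
/// // `data[1..]` is only guaranteed to be aligned to `f32`, not to `Simd<f32, 4>`.
/// let ptr = data[1..].as_ptr().cast::<Simd<f32, 4>>();
/// // SAFETY: `ptr` covers four initialized `f32`s inside `data`,
/// // and `read_unaligned` does not require `Simd<f32, 4>`'s alignment.
/// let v = unsafe { ptr.read_unaligned() };
/// assert_eq!(v.to_array(), [2.0, 3.0, 4.0, 5.0]);
/// ```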
///
/// [`transmute`]: core::mem::transmute
/// [raw pointers]: pointer
/// [`read_unaligned`]: pointer::read_unaligned
/// [`write_unaligned`]: pointer::write_unaligned
/// [`read`]: pointer::read
/// [`write`]: pointer::write
/// [as_simd]: slice::as_simd
//
// NOTE: Accessing the inner array directly in any way (e.g. by using the `.0` field syntax) or
// directly constructing an instance of the type (i.e. `let vector = Simd(array)`) should be
// avoided, as it will likely become illegal on `#[repr(simd)]` structs in the future. It also
// causes rustc to emit illegal LLVM IR in some cases.
#[repr(simd, packed)]
pub struct Simd<T, const N: usize>([T; N])
where
    LaneCount<N>: SupportedLaneCount,
    T: SimdElement;

impl<T, const N: usize> Simd<T, N>
where
    LaneCount<N>: SupportedLaneCount,
    T: SimdElement,
{
    /// Number of elements in this vector.
    pub const LEN: usize = N;

    /// Returns the number of elements in this SIMD vector.
    ///
    /// # Examples
    ///
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::u32x4;
    /// let v = u32x4::splat(0);
    /// assert_eq!(v.len(), 4);
    /// ```
    #[inline]
    #[allow(clippy::len_without_is_empty)]
    pub const fn len(&self) -> usize {
        Self::LEN
    }

    /// Constructs a new SIMD vector with all elements set to the given value.
    ///
    /// # Examples
    ///
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::u32x4;
    /// let v = u32x4::splat(8);
    /// assert_eq!(v.as_array(), &[8, 8, 8, 8]);
    /// ```
    #[inline]
    #[rustc_const_unstable(feature = "portable_simd", issue = "86656")]
    pub const fn splat(value: T) -> Self {
        const fn splat_const<T, const N: usize>(value: T) -> Simd<T, N>
        where
            T: SimdElement,
            LaneCount<N>: SupportedLaneCount,
        {
            Simd::from_array([value; N])
        }

        fn splat_rt<T, const N: usize>(value: T) -> Simd<T, N>
        where
            T: SimdElement,
            LaneCount<N>: SupportedLaneCount,
        {
            // This is preferred over `[value; N]`, since it's explicitly a splat:
            // https://github.com/rust-lang/rust/issues/97804
            struct Splat;
            impl<const N: usize> Swizzle<N> for Splat {
                const INDEX: [usize; N] = [0; N];
            }

            Splat::swizzle::<T, 1>(Simd::<T, 1>::from([value]))
        }

        core::intrinsics::const_eval_select((value,), splat_const, splat_rt)
    }

    /// Returns an array reference containing the entire SIMD vector.
    ///
    /// # Examples
    ///
    /// ```
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, u64x4};
    /// let v: u64x4 = Simd::from_array([0, 1, 2, 3]);
    /// assert_eq!(v.as_array(), &[0, 1, 2, 3]);
    /// ```
    #[inline]
    pub const fn as_array(&self) -> &[T; N] {
        // SAFETY: `Simd<T, N>` is just an overaligned `[T; N]` with
        // potential padding at the end, so pointer casting to a
        // `&[T; N]` is safe.
        //
        // NOTE: This deliberately doesn't just use `&self.0`, see the comment
        // on the struct definition for details.
        unsafe { &*(self as *const Self as *const [T; N]) }
    }

    /// Returns a mutable array reference containing the entire SIMD vector.
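    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a `u64x4` vector chosen for illustration:
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, u64x4};
    /// let mut v: u64x4 = Simd::from_array([0, 1, 2, 3]);
    /// // Writes through the array reference are visible in the vector.
    /// v.as_mut_array()[2] = 7;
    /// assert_eq!(v.as_array(), &[0, 1, 7, 3]);
    /// ```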
    #[inline]
    pub fn as_mut_array(&mut self) -> &mut [T; N] {
        // SAFETY: `Simd<T, N>` is just an overaligned `[T; N]` with
        // potential padding at the end, so pointer casting to a
        // `&mut [T; N]` is safe.
        //
        // NOTE: This deliberately doesn't just use `&mut self.0`, see the comment
        // on the struct definition for details.
        unsafe { &mut *(self as *mut Self as *mut [T; N]) }
    }

    /// Loads a vector from an array of `T`.
    ///
    /// This function is necessary since `repr(simd)` has padding for non-power-of-2 vectors (at the time of writing).
    /// With padding, `read_unaligned` will read past the end of an array of N elements.
    ///
    /// # Safety
    /// Reading `ptr` must be safe, as if by `<*const [T; N]>::read`.
    #[inline]
    const unsafe fn load(ptr: *const [T; N]) -> Self {
        // There are potentially simpler ways to write this function, but this should result in
        // LLVM `load <N x T>`

        let mut tmp = core::mem::MaybeUninit::<Self>::uninit();
        // SAFETY: `Simd<T, N>` always contains `N` elements of type `T`.  It may have padding
        // which does not need to be initialized.  The safety of reading `ptr` is ensured by the
        // caller.
        unsafe {
            core::ptr::copy_nonoverlapping(ptr, tmp.as_mut_ptr().cast(), 1);
            tmp.assume_init()
        }
    }

    /// Store a vector to an array of `T`.
    ///
    /// See `load` as to why this function is necessary.
    ///
    /// # Safety
    /// Writing to `ptr` must be safe, as if by `<*mut [T; N]>::write`.
    #[inline]
    const unsafe fn store(self, ptr: *mut [T; N]) {
        // There are potentially simpler ways to write this function, but this should result in
        // LLVM `store <N x T>`

        // Creating a temporary helps LLVM turn the memcpy into a store.
        let tmp = self;
        // SAFETY: `Simd<T, N>` always contains `N` elements of type `T`.  The safety of writing
        // `ptr` is ensured by the caller.
        unsafe { core::ptr::copy_nonoverlapping(tmp.as_array(), ptr, 1) }
    }

    /// Converts an array to a SIMD vector.
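    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming `i32` lanes chosen for illustration:
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::Simd;
    /// let v = Simd::<i32, 4>::from_array([1, -2, 3, -4]);
    /// assert_eq!(v.as_array(), &[1, -2, 3, -4]);
    /// ```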
    #[inline]
    pub const fn from_array(array: [T; N]) -> Self {
        // SAFETY: `&array` is safe to read.
        //
        // FIXME: We currently use a pointer load instead of `transmute_copy` because `repr(simd)`
        // results in padding for non-power-of-2 vectors (so vectors are larger than arrays).
        //
        // NOTE: This deliberately doesn't just use `Self(array)`, see the comment
        // on the struct definition for details.
        unsafe { Self::load(&array) }
    }

    /// Converts a SIMD vector to an array.
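    ///
    /// # Examples
    ///
    /// A minimal sketch of a round trip through `from_array`, assuming `f32` lanes
    /// chosen for illustration:
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::Simd;
    /// let array = [1.5_f32, 2.5, 3.5, 4.5];
    /// let v = Simd::from_array(array);
    /// // Converting back yields the original elements in the same order.
    /// assert_eq!(v.to_array(), array);
    /// ```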
    #[inline]
    pub const fn to_array(self) -> [T; N] {
        let mut tmp = core::mem::MaybeUninit::uninit();
        // SAFETY: writing to `tmp` is safe and initializes it.
        //
        // FIXME: We currently use a pointer store instead of `transmute_copy` because `repr(simd)`
        // results in padding for non-power-of-2 vectors (so vectors are larger than arrays).
        //
        // NOTE: This deliberately doesn't just use `self.0`, see the comment
        // on the struct definition for details.
        unsafe {
            self.store(tmp.as_mut_ptr());
            tmp.assume_init()
        }
    }

    /// Converts a slice to a SIMD vector containing `slice[..N]`.
    ///
    /// # Panics
    ///
    /// Panics if the slice's length is less than the vector's `Simd::LEN`.
    /// Use `load_or_default` for an alternative that does not panic.
    ///
    /// # Example
    ///
    /// ```
    /// # #![feature(portable_simd)]
    /// # use core::simd::u32x4;
    /// let source = vec![1, 2, 3, 4, 5, 6];
    /// let v = u32x4::from_slice(&source);
    /// assert_eq!(v.as_array(), &[1, 2, 3, 4]);
    /// ```
    #[must_use]
    #[inline]
    #[track_caller]
    pub const fn from_slice(slice: &[T]) -> Self {
        assert!(
            slice.len() >= Self::LEN,
            "slice length must be at least the number of elements"
        );
        // SAFETY: We just checked that the slice contains
        // at least `N` elements.
        unsafe { Self::load(slice.as_ptr().cast()) }
    }

    /// Writes a SIMD vector to the first `N` elements of a slice.
    ///
    /// # Panics
    ///
    /// Panics if the slice's length is less than the vector's `Simd::LEN`.
    ///
    /// # Example
    ///
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::u32x4;
    /// let mut dest = vec![0; 6];
    /// let v = u32x4::from_array([1, 2, 3, 4]);
    /// v.copy_to_slice(&mut dest);
    /// assert_eq!(&dest, &[1, 2, 3, 4, 0, 0]);
    /// ```
    #[inline]
    #[track_caller]
    pub fn copy_to_slice(self, slice: &mut [T]) {
        assert!(
            slice.len() >= Self::LEN,
            "slice length must be at least the number of elements"
        );
        // SAFETY: We just checked that the slice contains
        // at least `N` elements.
        unsafe { self.store(slice.as_mut_ptr().cast()) }
    }

    /// Reads contiguous elements from `slice`. Elements are read so long as they're in-bounds for
    /// the `slice`. Otherwise, the default value for the element type is returned.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::Simd;
    /// let vec: Vec<i32> = vec![10, 11];
    ///
    /// let result = Simd::<i32, 4>::load_or_default(&vec);
    /// assert_eq!(result, Simd::from_array([10, 11, 0, 0]));
    /// ```
    #[must_use]
    #[inline]
    pub fn load_or_default(slice: &[T]) -> Self
    where
        T: Default,
    {
        Self::load_or(slice, Default::default())
    }

    /// Reads contiguous elements from `slice`. Elements are read so long as they're in-bounds for
    /// the `slice`. Otherwise, the corresponding value from `or` is passed through.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::Simd;
    /// let vec: Vec<i32> = vec![10, 11];
    /// let or = Simd::from_array([-5, -4, -3, -2]);
    ///
    /// let result = Simd::load_or(&vec, or);
    /// assert_eq!(result, Simd::from_array([10, 11, -3, -2]));
    /// ```
    #[must_use]
    #[inline]
    pub fn load_or(slice: &[T], or: Self) -> Self {
        Self::load_select(slice, Mask::splat(true), or)
    }

    /// Reads contiguous elements from `slice`. Each element is read from memory if its
    /// corresponding element in `enable` is `true`.
    ///
    /// When the element is disabled or out of bounds for the slice, that memory location
    /// is not accessed and the default value for the element type is returned.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::{Simd, Mask};
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let enable = Mask::from_array([true, true, false, true]);
    ///
    /// let result = Simd::<i32, 4>::load_select_or_default(&vec, enable);
    /// assert_eq!(result, Simd::from_array([10, 11, 0, 13]));
    /// ```
    #[must_use]
    #[inline]
    pub fn load_select_or_default(slice: &[T], enable: Mask<<T as SimdElement>::Mask, N>) -> Self
    where
        T: Default,
    {
        Self::load_select(slice, enable, Default::default())
    }

    /// Reads contiguous elements from `slice`. Each element is read from memory if its
    /// corresponding element in `enable` is `true`.
    ///
    /// When the element is disabled or out of bounds for the slice, that memory location
    /// is not accessed and the corresponding value from `or` is passed through.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::{Simd, Mask};
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let enable = Mask::from_array([true, true, false, true]);
    /// let or = Simd::from_array([-5, -4, -3, -2]);
    ///
    /// let result = Simd::load_select(&vec, enable, or);
    /// assert_eq!(result, Simd::from_array([10, 11, -3, 13]));
    /// ```
    #[must_use]
    #[inline]
    pub fn load_select(
        slice: &[T],
        mut enable: Mask<<T as SimdElement>::Mask, N>,
        or: Self,
    ) -> Self {
        enable &= mask_up_to(slice.len());
        // SAFETY: We performed the bounds check by updating the mask. &[T] is properly aligned to
        // the element.
        unsafe { Self::load_select_ptr(slice.as_ptr(), enable, or) }
    }

    /// Reads contiguous elements from `slice`. Each element is read from memory if its
    /// corresponding element in `enable` is `true`.
    ///
    /// When the element is disabled, that memory location is not accessed and the corresponding
    /// value from `or` is passed through.
    ///
    /// # Safety
    /// Enabled loads must not exceed the length of `slice`.
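    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a four-element source so that every enabled lane is
    /// in bounds (the values are illustrative only):
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, Mask};
    /// let vec: Vec<i32> = vec![10, 11, 12, 13];
    /// let enable = Mask::from_array([true, false, true, true]);
    /// let or = Simd::from_array([-5, -4, -3, -2]);
    ///
    /// // SAFETY: the enabled lanes (0, 2 and 3) are all within `vec`, which has four elements.
    /// let result = unsafe { Simd::load_select_unchecked(&vec, enable, or) };
    /// assert_eq!(result, Simd::from_array([10, -4, 12, 13]));
    /// ```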
    #[must_use]
    #[inline]
    pub unsafe fn load_select_unchecked(
        slice: &[T],
        enable: Mask<<T as SimdElement>::Mask, N>,
        or: Self,
    ) -> Self {
        let ptr = slice.as_ptr();
        // SAFETY: The safety of reading elements from `slice` is ensured by the caller.
        unsafe { Self::load_select_ptr(ptr, enable, or) }
    }

    /// Reads contiguous elements starting at `ptr`. Each element is read from memory if its
    /// corresponding element in `enable` is `true`.
    ///
    /// When the element is disabled, that memory location is not accessed and the corresponding
    /// value from `or` is passed through.
    ///
    /// # Safety
    /// Enabled `ptr` elements must be safe to read as if by `core::ptr::read`.
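    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a four-element `i32` array as the source
    /// (the values are illustrative only):
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, Mask};
    /// let values: [i32; 4] = [10, 11, 12, 13];
    /// let enable = Mask::from_array([true, true, false, true]);
    /// let or = Simd::from_array([-5, -4, -3, -2]);
    ///
    /// // SAFETY: all four lanes starting at `values.as_ptr()` are valid to read,
    /// // so every enabled lane is too.
    /// let result = unsafe { Simd::load_select_ptr(values.as_ptr(), enable, or) };
    /// assert_eq!(result, Simd::from_array([10, 11, -3, 13]));
    /// ```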
    #[must_use]
    #[inline]
    pub unsafe fn load_select_ptr(
        ptr: *const T,
        enable: Mask<<T as SimdElement>::Mask, N>,
        or: Self,
    ) -> Self {
        // SAFETY: The safety of reading elements through `ptr` is ensured by the caller.
        unsafe { core::intrinsics::simd::simd_masked_load(enable.to_int(), ptr, or) }
    }

    /// Reads from potentially discontiguous indices in `slice` to construct a SIMD vector.
    /// If an index is out-of-bounds, the element is instead selected from the `or` vector.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # use core::simd::Simd;
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let idxs = Simd::from_array([9, 3, 0, 5]);  // Note the index that is out-of-bounds
    /// let alt = Simd::from_array([-5, -4, -3, -2]);
    ///
    /// let result = Simd::gather_or(&vec, idxs, alt);
    /// assert_eq!(result, Simd::from_array([-5, 13, 10, 15]));
    /// ```
    #[must_use]
    #[inline]
    pub fn gather_or(slice: &[T], idxs: Simd<usize, N>, or: Self) -> Self {
        Self::gather_select(slice, Mask::splat(true), idxs, or)
    }

    /// Reads from indices in `slice` to construct a SIMD vector.
    /// If an index is out-of-bounds, the element is set to the default given by `T: Default`.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # use core::simd::Simd;
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let idxs = Simd::from_array([9, 3, 0, 5]);  // Note the index that is out-of-bounds
    ///
    /// let result = Simd::gather_or_default(&vec, idxs);
    /// assert_eq!(result, Simd::from_array([0, 13, 10, 15]));
    /// ```
    #[must_use]
    #[inline]
    pub fn gather_or_default(slice: &[T], idxs: Simd<usize, N>) -> Self
    where
        T: Default,
    {
        Self::gather_or(slice, idxs, Self::splat(T::default()))
    }

    /// Reads from indices in `slice` to construct a SIMD vector.
    /// The mask `enable`s all `true` indices and disables all `false` indices.
    /// If an index is disabled or is out-of-bounds, the element is selected from the `or` vector.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, Mask};
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let idxs = Simd::from_array([9, 3, 0, 5]); // Includes an out-of-bounds index
    /// let alt = Simd::from_array([-5, -4, -3, -2]);
    /// let enable = Mask::from_array([true, true, true, false]); // Includes a masked element
    ///
    /// let result = Simd::gather_select(&vec, enable, idxs, alt);
    /// assert_eq!(result, Simd::from_array([-5, 13, 10, -2]));
    /// ```
    #[must_use]
    #[inline]
    pub fn gather_select(
        slice: &[T],
        enable: Mask<isize, N>,
        idxs: Simd<usize, N>,
        or: Self,
    ) -> Self {
        let enable: Mask<isize, N> = enable & idxs.simd_lt(Simd::splat(slice.len()));
        // Safety: We have masked-off out-of-bounds indices.
        unsafe { Self::gather_select_unchecked(slice, enable, idxs, or) }
    }

    /// Reads from indices in `slice` to construct a SIMD vector.
    /// The mask `enable`s all `true` indices and disables all `false` indices.
    /// If an index is disabled, the element is selected from the `or` vector.
    ///
    /// # Safety
    ///
    /// Calling this function with an `enable`d out-of-bounds index is *[undefined behavior]*
    /// even if the resulting value is not used.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::{Simd, cmp::SimdPartialOrd, Mask};
    /// let vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
    /// let idxs = Simd::from_array([9, 3, 0, 5]); // Includes an out-of-bounds index
    /// let alt = Simd::from_array([-5, -4, -3, -2]);
    /// let enable = Mask::from_array([true, true, true, false]); // Includes a masked element
    /// // If this mask was used to gather, it would be unsound. Let's fix that.
    /// let enable = enable & idxs.simd_lt(Simd::splat(vec.len()));
    ///
    /// // The out-of-bounds index has been masked, so it's safe to gather now.
    /// let result = unsafe { Simd::gather_select_unchecked(&vec, enable, idxs, alt) };
    /// assert_eq!(result, Simd::from_array([-5, 13, 10, -2]));
    /// ```
    /// [undefined behavior]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
    #[must_use]
    #[inline]
    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
    pub unsafe fn gather_select_unchecked(
        slice: &[T],
        enable: Mask<isize, N>,
        idxs: Simd<usize, N>,
        or: Self,
    ) -> Self {
        let base_ptr = Simd::<*const T, N>::splat(slice.as_ptr());
        // Ferris forgive me, I have done pointer arithmetic here.
        let ptrs = base_ptr.wrapping_add(idxs);
        // Safety: The caller is responsible for determining the indices are okay to read
        unsafe { Self::gather_select_ptr(ptrs, enable, or) }
    }

    /// Reads elementwise from pointers into a SIMD vector.
    ///
    /// # Safety
    ///
    /// Each read must satisfy the same conditions as [`core::ptr::read`].
    ///
    /// # Example
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::prelude::*;
    /// let values = [6, 2, 4, 9];
    /// let offsets = Simd::from_array([1, 0, 0, 3]);
    /// let source = Simd::splat(values.as_ptr()).wrapping_add(offsets);
    /// let gathered = unsafe { Simd::gather_ptr(source) };
    /// assert_eq!(gathered, Simd::from_array([2, 6, 6, 9]));
    /// ```
    #[must_use]
    #[inline]
    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
    pub unsafe fn gather_ptr(source: Simd<*const T, N>) -> Self
    where
        T: Default,
    {
        // TODO: add an intrinsic that doesn't use a passthru vector, and remove the T: Default bound
        // Safety: The caller is responsible for upholding all invariants
        unsafe { Self::gather_select_ptr(source, Mask::splat(true), Self::default()) }
    }

    /// Conditionally read elementwise from pointers into a SIMD vector.
    /// The mask `enable`s all `true` pointers and disables all `false` pointers.
    /// If a pointer is disabled, the element is selected from the `or` vector,
    /// and no read is performed.
    ///
    /// # Safety
    ///
    /// Enabled elements must satisfy the same conditions as [`core::ptr::read`].
    ///
    /// # Example
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::prelude::*;
    /// let values = [6, 2, 4, 9];
    /// let enable = Mask::from_array([true, true, false, true]);
    /// let offsets = Simd::from_array([1, 0, 0, 3]);
    /// let source = Simd::splat(values.as_ptr()).wrapping_add(offsets);
    /// let gathered = unsafe { Simd::gather_select_ptr(source, enable, Simd::splat(0)) };
    /// assert_eq!(gathered, Simd::from_array([2, 6, 0, 9]));
    /// ```
    #[must_use]
    #[inline]
    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
    pub unsafe fn gather_select_ptr(
        source: Simd<*const T, N>,
        enable: Mask<isize, N>,
        or: Self,
    ) -> Self {
        // Safety: The caller is responsible for upholding all invariants
        unsafe { core::intrinsics::simd::simd_gather(or, source, enable.to_int()) }
    }

    /// Conditionally write contiguous elements to `slice`. The `enable` mask controls
    /// which elements are written, as long as they're in-bounds of the `slice`.
    /// If the element is disabled or out of bounds, no memory access to that location
    /// is made.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::{Simd, Mask};
    /// let mut arr = [0i32; 4];
    /// let write = Simd::from_array([-5, -4, -3, -2]);
    /// let enable = Mask::from_array([false, true, true, true]);
    ///
    /// write.store_select(&mut arr[..3], enable);
    /// assert_eq!(arr, [0, -4, -3, 0]);
    /// ```
    #[inline]
    pub fn store_select(self, slice: &mut [T], mut enable: Mask<<T as SimdElement>::Mask, N>) {
        enable &= mask_up_to(slice.len());
        // SAFETY: We performed the bounds check by updating the mask. &[T] is properly aligned to
        // the element.
        unsafe { self.store_select_ptr(slice.as_mut_ptr(), enable) }
    }

    /// Conditionally write contiguous elements to `slice`. The `enable` mask controls
    /// which elements are written.
    ///
    /// # Safety
    ///
    /// Every enabled element must be in bounds for the `slice`.
    ///
    /// # Examples
    /// ```
    /// # #![feature(portable_simd)]
    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
    /// # use simd::{Simd, Mask};
    /// let mut arr = [0i32; 4];
    /// let write = Simd::from_array([-5, -4, -3, -2]);
    /// let enable = Mask::from_array([false, true, true, true]);
    ///
    /// unsafe { write.store_select_unchecked(&mut arr, enable) };
    /// assert_eq!(arr, [0, -4, -3, -2]);
    /// ```
    #[inline]
    pub unsafe fn store_select_unchecked(
        self,
        slice: &mut [T],
        enable: Mask<<T as SimdElement>::Mask, N>,
    ) {
        let ptr = slice.as_mut_ptr();
        // SAFETY: The safety of writing elements in `slice` is ensured by the caller.
        unsafe { self.store_select_ptr(ptr, enable) }
    }

    /// Conditionally write contiguous elements starting from `ptr`.
    /// The `enable` mask controls which elements are written.
    /// When disabled, the memory location corresponding to that element is not accessed.
    ///
    /// # Safety
    ///
    /// Memory addresses for each element are calculated with [`pointer::wrapping_offset`], and
    /// each enabled element must satisfy the same conditions as [`core::ptr::write`].
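    ///
    /// # Examples
    ///
    /// A minimal sketch, assuming a four-element `i32` array as the destination
    /// (the values are illustrative only):
    ///
    /// ```rust
    /// # #![feature(portable_simd)]
    /// # use core::simd::{Simd, Mask};
    /// let mut arr = [0i32; 4];
    /// let write = Simd::from_array([-5, -4, -3, -2]);
    /// let enable = Mask::from_array([false, true, true, true]);
    ///
    /// // SAFETY: all four lanes starting at `arr.as_mut_ptr()` are valid to write,
    /// // so every enabled lane is too.
    /// unsafe { write.store_select_ptr(arr.as_mut_ptr(), enable) };
    /// // Lane 0 is disabled, so `arr[0]` keeps its original value.
    /// assert_eq!(arr, [0, -4, -3, -2]);
    /// ```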
723    #[inline]
724    pub unsafe fn store_select_ptr(self, ptr: *mut T, enable: Mask<<T as SimdElement>::Mask, N>) {
725        // SAFETY: The safety of writing elements through `ptr` is ensured by the caller.
726        unsafe { core::intrinsics::simd::simd_masked_store(enable.to_int(), ptr, self) }
727    }
728
729    /// Writes the values in a SIMD vector to potentially discontiguous indices in `slice`.
730    /// If an index is out-of-bounds, the write is suppressed without panicking.
731    /// If two elements in the scattered vector would write to the same index,
732    /// only the last element is guaranteed to actually be written.
733    ///
734    /// # Examples
735    /// ```
736    /// # #![feature(portable_simd)]
737    /// # use core::simd::Simd;
738    /// let mut vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
739    /// let idxs = Simd::from_array([9, 3, 0, 0]); // Note the duplicate index.
740    /// let vals = Simd::from_array([-27, 82, -41, 124]);
741    ///
742    /// vals.scatter(&mut vec, idxs); // two logical writes means the last wins.
743    /// assert_eq!(vec, vec![124, 11, 12, 82, 14, 15, 16, 17, 18]);
744    /// ```
745    #[inline]
746    pub fn scatter(self, slice: &mut [T], idxs: Simd<usize, N>) {
747        self.scatter_select(slice, Mask::splat(true), idxs)
748    }
749
750    /// Writes values from a SIMD vector to multiple potentially discontiguous indices in `slice`.
751    /// The mask `enable`s all `true` indices and disables all `false` indices.
752    /// If an enabled index is out-of-bounds, the write is suppressed without panicking.
753    /// If two enabled elements in the scattered vector would write to the same index,
754    /// only the last element is guaranteed to actually be written.
755    ///
756    /// # Examples
757    /// ```
758    /// # #![feature(portable_simd)]
759    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
760    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
761    /// # use simd::{Simd, Mask};
762    /// let mut vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
763    /// let idxs = Simd::from_array([9, 3, 0, 0]); // Includes an out-of-bounds index
764    /// let vals = Simd::from_array([-27, 82, -41, 124]);
765    /// let enable = Mask::from_array([true, true, true, false]); // Includes a masked element
766    ///
767    /// vals.scatter_select(&mut vec, enable, idxs); // The last write is masked, thus omitted.
768    /// assert_eq!(vec, vec![-41, 11, 12, 82, 14, 15, 16, 17, 18]);
769    /// ```
770    #[inline]
771    pub fn scatter_select(self, slice: &mut [T], enable: Mask<isize, N>, idxs: Simd<usize, N>) {
772        let enable: Mask<isize, N> = enable & idxs.simd_lt(Simd::splat(slice.len()));
773        // Safety: We have masked-off out-of-bounds indices.
774        unsafe { self.scatter_select_unchecked(slice, enable, idxs) }
775    }
776
777    /// Writes values from a SIMD vector to multiple potentially discontiguous indices in `slice`.
778    /// The mask `enable`s all `true` indices and disables all `false` indices.
779    /// If two enabled elements in the scattered vector would write to the same index,
780    /// only the last element is guaranteed to actually be written.
781    ///
782    /// # Safety
783    ///
784    /// Calling this function with an enabled out-of-bounds index is *[undefined behavior]*,
785    /// and may lead to memory corruption.
786    ///
787    /// # Examples
788    /// ```
789    /// # #![feature(portable_simd)]
790    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
791    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
792    /// # use simd::{Simd, cmp::SimdPartialOrd, Mask};
793    /// let mut vec: Vec<i32> = vec![10, 11, 12, 13, 14, 15, 16, 17, 18];
794    /// let idxs = Simd::from_array([9, 3, 0, 0]);
795    /// let vals = Simd::from_array([-27, 82, -41, 124]);
796    /// let enable = Mask::from_array([true, true, true, false]); // Masks the final index
797    /// // If this mask was used to scatter, it would be unsound. Let's fix that.
798    /// let enable = enable & idxs.simd_lt(Simd::splat(vec.len()));
799    ///
800    /// // We have masked the OOB index, so it's safe to scatter now.
801    /// unsafe { vals.scatter_select_unchecked(&mut vec, enable, idxs); }
802    /// // The second write to index 0 was masked, thus omitted.
803    /// assert_eq!(vec, vec![-41, 11, 12, 82, 14, 15, 16, 17, 18]);
804    /// ```
805    /// [undefined behavior]: https://doc.rust-lang.org/reference/behavior-considered-undefined.html
806    #[inline]
807    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
808    pub unsafe fn scatter_select_unchecked(
809        self,
810        slice: &mut [T],
811        enable: Mask<isize, N>,
812        idxs: Simd<usize, N>,
813    ) {
814        // Safety: This block works with *mut T derived from &mut 'a [T],
815        // which means it is delicate in Rust's borrowing model, circa 2021:
816        // &mut 'a [T] asserts uniqueness, so deriving &'a [T] invalidates live *mut Ts!
817        // Even though this block is largely safe methods, it must be exactly this way
818        // to prevent invalidating the raw ptrs while they're live.
819        // Thus, entering this block requires every value it uses to already be ready:
820        // 0. idxs we want to write to, which are used to construct the mask.
821        // 1. enable, which depends on an initial &'a [T] and the idxs.
822        // 2. actual values to scatter (self).
823        // 3. &mut [T] which will become our base ptr.
824        unsafe {
825            // Now Entering ☢️ *mut T Zone
826            let base_ptr = Simd::<*mut T, N>::splat(slice.as_mut_ptr());
827            // Ferris forgive me, I have done pointer arithmetic here.
828            let ptrs = base_ptr.wrapping_add(idxs);
829            // The ptrs have been bounds-masked to prevent memory-unsafe writes insha'allah
830            self.scatter_select_ptr(ptrs, enable);
831            // Cleared ☢️ *mut T Zone
832        }
833    }
834
835    /// Writes the elements of a SIMD vector elementwise through the pointers in `dest`.
836    ///
837    /// # Safety
838    ///
839    /// Each write must satisfy the same conditions as [`core::ptr::write`].
840    ///
841    /// # Example
842    /// ```
843    /// # #![feature(portable_simd)]
844    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
845    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
846    /// # use simd::{Simd, ptr::SimdMutPtr};
847    /// let mut values = [0; 4];
848    /// let offset = Simd::from_array([3, 2, 1, 0]);
849    /// let ptrs = Simd::splat(values.as_mut_ptr()).wrapping_add(offset);
850    /// unsafe { Simd::from_array([6, 3, 5, 7]).scatter_ptr(ptrs); }
851    /// assert_eq!(values, [7, 5, 3, 6]);
852    /// ```
853    #[inline]
854    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
855    pub unsafe fn scatter_ptr(self, dest: Simd<*mut T, N>) {
856        // Safety: The caller is responsible for upholding all invariants
857        unsafe { self.scatter_select_ptr(dest, Mask::splat(true)) }
858    }
859
860    /// Conditionally writes the elements of a SIMD vector elementwise through the pointers in `dest`.
861    /// The mask `enable`s all `true` pointers and disables all `false` pointers.
862    /// If a pointer is disabled, the write to its pointee is skipped.
863    ///
864    /// # Safety
865    ///
866    /// Enabled pointers must satisfy the same conditions as [`core::ptr::write`].
867    ///
868    /// # Example
869    /// ```
870    /// # #![feature(portable_simd)]
871    /// # #[cfg(feature = "as_crate")] use core_simd::simd;
872    /// # #[cfg(not(feature = "as_crate"))] use core::simd;
873    /// # use simd::{Mask, Simd, ptr::SimdMutPtr};
874    /// let mut values = [0; 4];
875    /// let offset = Simd::from_array([3, 2, 1, 0]);
876    /// let ptrs = Simd::splat(values.as_mut_ptr()).wrapping_add(offset);
877    /// let enable = Mask::from_array([true, true, false, false]);
878    /// unsafe { Simd::from_array([6, 3, 5, 7]).scatter_select_ptr(ptrs, enable); }
879    /// assert_eq!(values, [0, 0, 3, 6]);
880    /// ```
881    #[inline]
882    #[cfg_attr(miri, track_caller)] // even without panics, this helps for Miri backtraces
883    pub unsafe fn scatter_select_ptr(self, dest: Simd<*mut T, N>, enable: Mask<isize, N>) {
884        // Safety: The caller is responsible for upholding all invariants
885        unsafe { core::intrinsics::simd::simd_scatter(self, dest, enable.to_int()) }
886    }
887}
888
889impl<T, const N: usize> Copy for Simd<T, N>
890where
891    LaneCount<N>: SupportedLaneCount,
892    T: SimdElement,
893{
894}
895
896impl<T, const N: usize> Clone for Simd<T, N>
897where
898    LaneCount<N>: SupportedLaneCount,
899    T: SimdElement,
900{
901    #[inline]
902    fn clone(&self) -> Self {
903        *self
904    }
905}
906
907impl<T, const N: usize> Default for Simd<T, N>
908where
909    LaneCount<N>: SupportedLaneCount,
910    T: SimdElement + Default,
911{
912    #[inline]
913    fn default() -> Self {
914        Self::splat(T::default())
915    }
916}
917
918impl<T, const N: usize> PartialEq for Simd<T, N>
919where
920    LaneCount<N>: SupportedLaneCount,
921    T: SimdElement + PartialEq,
922{
923    #[inline]
924    fn eq(&self, other: &Self) -> bool {
925        // Safety: All SIMD vectors are SimdPartialEq, and the comparison produces a valid mask.
926        let mask = unsafe {
927            let tfvec: Simd<<T as SimdElement>::Mask, N> =
928                core::intrinsics::simd::simd_eq(*self, *other);
929            Mask::from_int_unchecked(tfvec)
930        };
931
932        // Two vectors are equal if all elements are equal when compared elementwise
933        mask.all()
934    }
935
936    #[allow(clippy::partialeq_ne_impl)]
937    #[inline]
938    fn ne(&self, other: &Self) -> bool {
939        // Safety: All SIMD vectors are SimdPartialEq, and the comparison produces a valid mask.
940        let mask = unsafe {
941            let tfvec: Simd<<T as SimdElement>::Mask, N> =
942                core::intrinsics::simd::simd_ne(*self, *other);
943            Mask::from_int_unchecked(tfvec)
944        };
945
946        // Two vectors are non-equal if any elements are non-equal when compared elementwise
947        mask.any()
948    }
949}
950
951/// Lexicographic order. For the SIMD elementwise minimum and maximum, use `simd_min` and `simd_max` instead.
952impl<T, const N: usize> PartialOrd for Simd<T, N>
953where
954    LaneCount<N>: SupportedLaneCount,
955    T: SimdElement + PartialOrd,
956{
957    #[inline]
958    fn partial_cmp(&self, other: &Self) -> Option<core::cmp::Ordering> {
959        // TODO use SIMD equality
960        self.to_array().partial_cmp(other.as_ref())
961    }
962}
963
964impl<T, const N: usize> Eq for Simd<T, N>
965where
966    LaneCount<N>: SupportedLaneCount,
967    T: SimdElement + Eq,
968{
969}
970
971/// Lexicographic order. For the SIMD elementwise minimum and maximum, use `simd_min` and `simd_max` instead.
972impl<T, const N: usize> Ord for Simd<T, N>
973where
974    LaneCount<N>: SupportedLaneCount,
975    T: SimdElement + Ord,
976{
977    #[inline]
978    fn cmp(&self, other: &Self) -> core::cmp::Ordering {
979        // TODO use SIMD equality
980        self.to_array().cmp(other.as_ref())
981    }
982}
983
984impl<T, const N: usize> core::hash::Hash for Simd<T, N>
985where
986    LaneCount<N>: SupportedLaneCount,
987    T: SimdElement + core::hash::Hash,
988{
989    #[inline]
990    fn hash<H>(&self, state: &mut H)
991    where
992        H: core::hash::Hasher,
993    {
994        self.as_array().hash(state)
995    }
996}
997
998// array references
999impl<T, const N: usize> AsRef<[T; N]> for Simd<T, N>
1000where
1001    LaneCount<N>: SupportedLaneCount,
1002    T: SimdElement,
1003{
1004    #[inline]
1005    fn as_ref(&self) -> &[T; N] {
1006        self.as_array()
1007    }
1008}
1009
1010impl<T, const N: usize> AsMut<[T; N]> for Simd<T, N>
1011where
1012    LaneCount<N>: SupportedLaneCount,
1013    T: SimdElement,
1014{
1015    #[inline]
1016    fn as_mut(&mut self) -> &mut [T; N] {
1017        self.as_mut_array()
1018    }
1019}
1020
1021// slice references
1022impl<T, const N: usize> AsRef<[T]> for Simd<T, N>
1023where
1024    LaneCount<N>: SupportedLaneCount,
1025    T: SimdElement,
1026{
1027    #[inline]
1028    fn as_ref(&self) -> &[T] {
1029        self.as_array()
1030    }
1031}
1032
1033impl<T, const N: usize> AsMut<[T]> for Simd<T, N>
1034where
1035    LaneCount<N>: SupportedLaneCount,
1036    T: SimdElement,
1037{
1038    #[inline]
1039    fn as_mut(&mut self) -> &mut [T] {
1040        self.as_mut_array()
1041    }
1042}
1043
1044// vector/array conversion
1045impl<T, const N: usize> From<[T; N]> for Simd<T, N>
1046where
1047    LaneCount<N>: SupportedLaneCount,
1048    T: SimdElement,
1049{
1050    #[inline]
1051    fn from(array: [T; N]) -> Self {
1052        Self::from_array(array)
1053    }
1054}
1055
1056impl<T, const N: usize> From<Simd<T, N>> for [T; N]
1057where
1058    LaneCount<N>: SupportedLaneCount,
1059    T: SimdElement,
1060{
1061    #[inline]
1062    fn from(vector: Simd<T, N>) -> Self {
1063        vector.to_array()
1064    }
1065}
1066
1067impl<T, const N: usize> TryFrom<&[T]> for Simd<T, N>
1068where
1069    LaneCount<N>: SupportedLaneCount,
1070    T: SimdElement,
1071{
1072    type Error = core::array::TryFromSliceError;
1073
1074    #[inline]
1075    fn try_from(slice: &[T]) -> Result<Self, core::array::TryFromSliceError> {
1076        Ok(Self::from_array(slice.try_into()?))
1077    }
1078}
1079
1080impl<T, const N: usize> TryFrom<&mut [T]> for Simd<T, N>
1081where
1082    LaneCount<N>: SupportedLaneCount,
1083    T: SimdElement,
1084{
1085    type Error = core::array::TryFromSliceError;
1086
1087    #[inline]
1088    fn try_from(slice: &mut [T]) -> Result<Self, core::array::TryFromSliceError> {
1089        Ok(Self::from_array(slice.try_into()?))
1090    }
1091}
1092
1093mod sealed {
1094    pub trait Sealed {}
1095}
1096use sealed::Sealed;
1097
1098/// Marker trait for types that may be used as SIMD vector elements.
1099///
1100/// # Safety
1101/// This trait, when implemented, asserts the compiler can monomorphize
1102/// `#[repr(simd)]` structs with the marked type as an element.
1103/// Strictly, it is valid to impl if the vector will not be miscompiled.
1104/// Practically, it is user-unfriendly to impl it if the vector won't compile,
1105/// even when no soundness guarantees are broken by allowing the user to try.
1106pub unsafe trait SimdElement: Sealed + Copy {
1107    /// The mask element type corresponding to this element type.
1108    type Mask: MaskElement;
1109}
1110
1111impl Sealed for u8 {}
1112
1113// Safety: u8 is a valid SIMD element type, and is supported by this API
1114unsafe impl SimdElement for u8 {
1115    type Mask = i8;
1116}
1117
1118impl Sealed for u16 {}
1119
1120// Safety: u16 is a valid SIMD element type, and is supported by this API
1121unsafe impl SimdElement for u16 {
1122    type Mask = i16;
1123}
1124
1125impl Sealed for u32 {}
1126
1127// Safety: u32 is a valid SIMD element type, and is supported by this API
1128unsafe impl SimdElement for u32 {
1129    type Mask = i32;
1130}
1131
1132impl Sealed for u64 {}
1133
1134// Safety: u64 is a valid SIMD element type, and is supported by this API
1135unsafe impl SimdElement for u64 {
1136    type Mask = i64;
1137}
1138
1139impl Sealed for usize {}
1140
1141// Safety: usize is a valid SIMD element type, and is supported by this API
1142unsafe impl SimdElement for usize {
1143    type Mask = isize;
1144}
1145
1146impl Sealed for i8 {}
1147
1148// Safety: i8 is a valid SIMD element type, and is supported by this API
1149unsafe impl SimdElement for i8 {
1150    type Mask = i8;
1151}
1152
1153impl Sealed for i16 {}
1154
1155// Safety: i16 is a valid SIMD element type, and is supported by this API
1156unsafe impl SimdElement for i16 {
1157    type Mask = i16;
1158}
1159
1160impl Sealed for i32 {}
1161
1162// Safety: i32 is a valid SIMD element type, and is supported by this API
1163unsafe impl SimdElement for i32 {
1164    type Mask = i32;
1165}
1166
1167impl Sealed for i64 {}
1168
1169// Safety: i64 is a valid SIMD element type, and is supported by this API
1170unsafe impl SimdElement for i64 {
1171    type Mask = i64;
1172}
1173
1174impl Sealed for isize {}
1175
1176// Safety: isize is a valid SIMD element type, and is supported by this API
1177unsafe impl SimdElement for isize {
1178    type Mask = isize;
1179}
1180
1181impl Sealed for f32 {}
1182
1183// Safety: f32 is a valid SIMD element type, and is supported by this API
1184unsafe impl SimdElement for f32 {
1185    type Mask = i32;
1186}
1187
1188impl Sealed for f64 {}
1189
1190// Safety: f64 is a valid SIMD element type, and is supported by this API
1191unsafe impl SimdElement for f64 {
1192    type Mask = i64;
1193}
1194
1195impl<T> Sealed for *const T {}
1196
1197// Safety: (thin) const pointers are valid SIMD element types, and are supported by this API
1198//
1199// Fat pointers may be supported in the future.
1200unsafe impl<T> SimdElement for *const T
1201where
1202    T: core::ptr::Pointee<Metadata = ()>,
1203{
1204    type Mask = isize;
1205}
1206
1207impl<T> Sealed for *mut T {}
1208
1209// Safety: (thin) mut pointers are valid SIMD element types, and are supported by this API
1210//
1211// Fat pointers may be supported in the future.
1212unsafe impl<T> SimdElement for *mut T
1213where
1214    T: core::ptr::Pointee<Metadata = ()>,
1215{
1216    type Mask = isize;
1217}
1218
1219#[inline]
1220fn lane_indices<const N: usize>() -> Simd<usize, N>
1221where
1222    LaneCount<N>: SupportedLaneCount,
1223{
1224    #![allow(clippy::needless_range_loop)]
1225    let mut index = [0; N];
1226    for i in 0..N {
1227        index[i] = i;
1228    }
1229    Simd::from_array(index)
1230}
1231
1232#[inline]
1233fn mask_up_to<M, const N: usize>(len: usize) -> Mask<M, N>
1234where
1235    LaneCount<N>: SupportedLaneCount,
1236    M: MaskElement,
1237{
1238    let index = lane_indices::<N>();
1239    let max_value: u64 = M::max_unsigned();
1240    macro_rules! case {
1241        ($ty:ty) => {
1242            if N < <$ty>::MAX as usize && max_value as $ty as u64 == max_value {
1243                return index.cast().simd_lt(Simd::splat(len.min(N) as $ty)).cast();
1244            }
1245        };
1246    }
1247    case!(u8);
1248    case!(u16);
1249    case!(u32);
1250    case!(u64);
1251    index.simd_lt(Simd::splat(len)).cast()
1252}
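
The scatter documentation earlier in this file describes three rules: disabled lanes are skipped, out-of-bounds indices are suppressed without panicking, and when two enabled lanes target the same index the last one wins. The private `mask_up_to` helper, in turn, enables lane `i` exactly when `i < len`. As a minimal sketch of those semantics only, the following scalar models run on stable Rust. The names `scatter_select_model` and `mask_up_to_model` are hypothetical and not part of `core::simd`; plain arrays and `bool`s stand in for `Simd` and `Mask`, and the real implementation lowers to intrinsics rather than a loop.

```rust
/// Hypothetical scalar model of `Simd::scatter_select` (not part of `core::simd`).
/// Lanes are visited in ascending order, so a later lane overwrites an earlier
/// one that targets the same index.
fn scatter_select_model<T: Copy, const N: usize>(
    vals: [T; N],
    slice: &mut [T],
    enable: [bool; N],
    idxs: [usize; N],
) {
    for lane in 0..N {
        // A lane writes only if it is enabled and its index is in bounds;
        // anything else is silently skipped, never a panic.
        if enable[lane] && idxs[lane] < slice.len() {
            slice[idxs[lane]] = vals[lane];
        }
    }
}

/// Hypothetical scalar model of the private `mask_up_to` helper:
/// lane `i` is enabled exactly when `i < len`.
fn mask_up_to_model<const N: usize>(len: usize) -> [bool; N] {
    core::array::from_fn(|i| i < len)
}
```

Calling `scatter_select_model` with the values, indices, and mask from the `scatter_select` doc example reproduces the result asserted there, and `mask_up_to_model::<4>(2)` enables only the first two lanes.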