4 releases (breaking)

0.4.0 Jan 14, 2025
0.3.0 Oct 28, 2024
0.2.0 Aug 27, 2024
0.1.1 Jul 19, 2024

#124 in Algorithms

5,976 downloads per month
Used in 21 crates (4 directly)

MIT/Apache

1MB
28K SLoC

CubeCL Linear Algebra Library.

The crate contains common linear algebra algorithms.

Algorithms

  • Tiling 2D Matrix Multiplication.

    The kernel is flexible and runs on almost any hardware.

  • Cooperative Matrix Multiplication.

    The kernel uses Automatic Mixed Precision (AMP) to leverage cooperative matrix-multiply-and-accumulate instructions. For f32 tensors, the inputs are cast to f16, but the accumulation is still performed in f32. This may cause a small loss of precision in exchange for much faster execution.
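The two ideas above can be illustrated on the CPU with plain Rust. This is a minimal sketch, not the crate's GPU kernels or its API: every function name here is hypothetical, matrices are assumed square and row-major, and f16 inputs are only approximated by rounding the f32 mantissa to 10 bits (the real f16 exponent range and tie-breaking rule are ignored).

```rust
/// Reference implementation: C = A * B for row-major n x n matrices.
fn matmul_naive(a: &[f32], b: &[f32], n: usize) -> Vec<f32> {
    let mut c = vec![0.0f32; n * n];
    for i in 0..n {
        for j in 0..n {
            let mut acc = 0.0f32;
            for k in 0..n {
                acc += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}

/// Tiled variant: the i, j and k ranges are split into blocks of size `tile`,
/// so each step only touches a small sub-matrix of A, B and C.
/// Hypothetical helper, not part of the crate.
fn matmul_tiled(a: &[f32], b: &[f32], n: usize, tile: usize) -> Vec<f32> {
    assert!(tile > 0, "tile size must be non-zero");
    let mut c = vec![0.0f32; n * n];
    for i0 in (0..n).step_by(tile) {
        for j0 in (0..n).step_by(tile) {
            for k0 in (0..n).step_by(tile) {
                // Multiply one tile of A by one tile of B and add the
                // partial products into the matching tile of C.
                for i in i0..(i0 + tile).min(n) {
                    for j in j0..(j0 + tile).min(n) {
                        let mut acc = c[i * n + j];
                        for k in k0..(k0 + tile).min(n) {
                            acc += a[i * n + k] * b[k * n + j];
                        }
                        c[i * n + j] = acc;
                    }
                }
            }
        }
    }
    c
}

/// Approximates an f32 -> f16 -> f32 round trip for finite values by keeping
/// 10 explicit mantissa bits. Assumption: the f32 exponent range is kept, so
/// f16 overflow and underflow are not modelled.
fn round_to_f16_mantissa(x: f32) -> f32 {
    // f32 has 23 mantissa bits: add half of the dropped range, then clear
    // the low 13 bits.
    f32::from_bits(x.to_bits().wrapping_add(0x1000) & !0x1FFF)
}

/// Mixed-precision sketch: inputs are reduced in precision, while the
/// accumulation inside `matmul_tiled` stays in f32.
fn matmul_mixed(a: &[f32], b: &[f32], n: usize, tile: usize) -> Vec<f32> {
    let a_lo: Vec<f32> = a.iter().map(|&x| round_to_f16_mantissa(x)).collect();
    let b_lo: Vec<f32> = b.iter().map(|&x| round_to_f16_mantissa(x)).collect();
    matmul_tiled(&a_lo, &b_lo, n, tile)
}
```

In this sketch `matmul_tiled` adds the products for each output element in the same order as `matmul_naive`, so both return identical results. `matmul_mixed` differs only by the rounding error introduced on the inputs.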

Benchmarks

You can run the benchmarks from the workspace with the following:

cargo bench --bench matmul --features wgpu # for wgpu
cargo bench --bench matmul --features cuda # for cuda

On an RTX 3070 we get the following results:

matmul-wgpu-f32-tiling2d

―――――――― Result ―――――――――
  Samples     100
  Mean        13.289ms
  Variance    28.000ns
  Median      13.271ms
  Min         12.582ms
  Max         13.768ms
―――――――――――――――――――――――――
matmul-cuda-f32-tiling2d

―――――――― Result ―――――――――
  Samples     100
  Mean        12.754ms
  Variance    93.000ns
  Median      12.647ms
  Min         12.393ms
  Max         14.501ms
―――――――――――――――――――――――――
matmul-cuda-f32-cmma

―――――――― Result ―――――――――
  Samples     100
  Mean        4.996ms
  Variance    35.000ns
  Median      5.084ms
  Min         4.304ms
  Max         5.155ms
―――――――――――――――――――――――――
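
To compare the runs, the reported means can be turned into speedup ratios. The helper below is a hypothetical convenience, not part of the crate or its benchmark harness. It only divides the mean times listed above.

```rust
/// Ratio of two mean run times; a value above 1.0 means `candidate_ms`
/// is the faster of the two.
fn speedup(baseline_ms: f64, candidate_ms: f64) -> f64 {
    baseline_ms / candidate_ms
}

// Mean times in milliseconds, copied from the results above.
const WGPU_TILING2D_MS: f64 = 13.289;
const CUDA_TILING2D_MS: f64 = 12.754;
const CUDA_CMMA_MS: f64 = 4.996;
```

With these means, `speedup(CUDA_TILING2D_MS, CUDA_CMMA_MS)` is roughly 2.55, and `speedup(WGPU_TILING2D_MS, CUDA_TILING2D_MS)` is roughly 1.04.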

Dependencies

~5–19MB
~200K SLoC
