Commit 97a51fc

Commit message: add comments
Author: lshAlgorithm
Signed-off-by: lshAlgorithm <lishuhuai_brain@163.com>
1 parent 1c50bb0 commit 97a51fc

1 file changed: +3 -2 lines changed

rwkv_operators_wkv_v7.inc

Lines changed: 3 additions & 2 deletions
@@ -38,7 +38,7 @@
 #define MULTADD(x, y, z) x * y + z
 #endif
 
-
+// TODO: This is ONLY for avx256, thus should be put in the macro in a decent way.
 inline float horizontal_sum(__m256 vec) {
     __m256 sum1 = _mm256_hadd_ps(vec, vec);
     __m256 sum2 = _mm256_hadd_ps(sum1, sum1);
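
The hunk stops after the second `_mm256_hadd_ps`, so the rest of `horizontal_sum` is not visible in this diff. The following is a minimal sketch of how such a reduction is commonly finished, and of one way the TODO's concern could be handled: the AVX path is guarded by the compiler-defined `__AVX__` macro, with a scalar loop as fallback. The names `horizontal_sum_sketch` and `sum8` are hypothetical, and the lane-combining step is an assumption rather than the file's actual code.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

#if defined(__AVX__)
// Hypothetical completion: only the two hadd steps appear in the diff,
// the lane-combining step below is an assumption.
static inline float horizontal_sum_sketch(__m256 vec) {
    __m256 sum1 = _mm256_hadd_ps(vec, vec);    // pairwise sums within each 128-bit lane
    __m256 sum2 = _mm256_hadd_ps(sum1, sum1);  // each lane now holds its own 4-float total
    __m128 lo = _mm256_castps256_ps128(sum2);    // lower 128-bit lane
    __m128 hi = _mm256_extractf128_ps(sum2, 1);  // upper 128-bit lane
    return _mm_cvtss_f32(_mm_add_ss(lo, hi));    // add the two lane totals
}
#endif

// Sums exactly eight floats, using AVX when the build enables it
// and a plain scalar loop otherwise.
static inline float sum8(const float * data) {
    assert(data != nullptr);
#if defined(__AVX__)
    return horizontal_sum_sketch(_mm256_loadu_ps(data));
#else
    float total = 0.0f;
    for (std::size_t i = 0; i < 8; ++i) {
        total += data[i];
    }
    return total;
#endif
}
```

Calling `sum8` on an array holding 1 through 8 returns 36 on either path. Keeping the `__m256` type behind the same guard as the other SIMD macros would let the file build on targets without AVX, which appears to be what the TODO is asking for.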
@@ -102,7 +102,7 @@ static void rwkv_wkv_v7_impl(struct ggml_tensor * result, const struct ggml_tens
 auto v_vec = SET1(v[t_h_i_offset]);
 sa_vec = SET1(sa);
 
-auto sum = ZEROS();
+auto sum = ZEROS(); // Initialize the sum vector
 for (size_t j = 0; j < C / H; j += SIMD_WIDTH) {
     size_t t_h_j_offset = t_h_offset + j;
     size_t h_2d_i_j_offset = h_2d_i_offset + j;
@@ -127,6 +127,7 @@ static void rwkv_wkv_v7_impl(struct ggml_tensor * result, const struct ggml_tens
         // auto sum = LOAD(&result_data[t_h_i_offset]);
         sum = ADD(sum, result);
     }
+    // Reduce all elements in the vector in one.
     result_data[t_h_i_offset] = horizontal_sum(sum);
 }
 
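
The two comments added in the last hunks describe a common SIMD reduction pattern: start from a zeroed accumulator, add partial results lane by lane inside the loop, and collapse the lanes into one scalar only once, after the loop. The sketch below illustrates that pattern with a generic dot product in portable C++. It emulates the lanes with `std::array` instead of using the file's macros, it is not the WKV computation itself, and `kWidth`, `Lanes`, `reduce_lanes`, and `dot_lanes` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kWidth = 8;  // hypothetical stand-in for SIMD_WIDTH
using Lanes = std::array<float, kWidth>;

// Collapse all lanes into one scalar (the role horizontal_sum plays in the diff).
static float reduce_lanes(const Lanes & v) {
    float total = 0.0f;
    for (float x : v) {
        total += x;
    }
    return total;
}

// Dot product of a and b; n must be a multiple of kWidth.
static float dot_lanes(const float * a, const float * b, std::size_t n) {
    assert(n % kWidth == 0);
    Lanes sum{};  // zero-initialized accumulator, analogous to ZEROS()
    for (std::size_t j = 0; j < n; j += kWidth) {
        for (std::size_t l = 0; l < kWidth; ++l) {
            sum[l] += a[j + l] * b[j + l];  // per-lane multiply-add
        }
    }
    return reduce_lanes(sum);  // single reduction after the loop
}
```

Reducing once after the loop keeps the comparatively expensive cross-lane step out of the hot path, so the loop body stays a plain element-wise multiply-add.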