First of all, thank you so much for this interesting repo. Python-GraphBLAS has helped me a lot throughout my project. I have a small question about how matrix multiplication in Python-GraphBLAS compares with MKL's.
For the sparse * dense case, I benchmarked both operations by running each one 10 times on the same input:
Number of threads: 8
X shape: (500091, 2381304)
C shape: (100, 2381304)
#nnz X: 1255206075
#nnz C: 35204619
Density of X: 0.0010540255835157798
Density of C: 0.14783756714808358
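For reference, the densities above are just nnz divided by rows * cols. A minimal sketch of that arithmetic follows; `density` is a helper name made up for illustration and is not part of Python-GraphBLAS or MKL:

```python
def density(nnz: int, shape: tuple) -> float:
    """Fraction of stored entries: nnz / (rows * cols)."""
    rows, cols = shape
    return nnz / (rows * cols)


# Values copied from the input description above.
x_density = density(1_255_206_075, (500_091, 2_381_304))
c_density = density(35_204_619, (100, 2_381_304))

print("Density of X:", x_density)
print("Density of C:", c_density)
```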
Python-GraphBLAS:
X: csr
C: fullc
C = C.T
XC << 0
XC << XC(accum = gb.binary.plus, nthreads=nthreads) << X.mxm(C)
Mean Runtime: 30.749253249168397
Std Runtime: 0.1747113620436487
and
MKL:
X: csr
C: numpy.array(order="F")
C = C.T
XC = sdm.dot_product_mkl(X, C)
Mean Runtime: 17.70107755661011
Std Runtime: 0.04377898424894914
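A timing loop of roughly the following shape would produce mean and standard deviation figures like those above. This is only a sketch under stated assumptions: `benchmark` and `placeholder_workload` are hypothetical names, the placeholder stands in for the actual multiplication call, runtimes are taken as wall-clock seconds, and the standard deviation is the population one (the same convention `numpy.std` uses by default).

```python
import statistics
import time


def benchmark(fn, repeats=10):
    """Call fn() `repeats` times and return (mean, std) of the wall-clock runtimes."""
    runtimes = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        runtimes.append(time.perf_counter() - start)
    # Population standard deviation (assumption, see the note above).
    return statistics.mean(runtimes), statistics.pstdev(runtimes)


def placeholder_workload():
    # Hypothetical stand-in for the real sparse * dense multiplication.
    sum(i * i for i in range(10_000))


mean_rt, std_rt = benchmark(placeholder_workload, repeats=10)
print("Mean Runtime:", mean_rt)
print("Std Runtime:", std_rt)
```

Timing only the multiplication itself, with format conversion and the transpose done beforehand, keeps the two libraries comparable.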
MKL appears to be faster here, with a mean runtime of about 17.7 versus 30.7 for Python-GraphBLAS. Have I used the optimal operation for Python-GraphBLAS? Is there a more efficient way to perform this multiplication with Python-GraphBLAS?
Thank you so much in advance! I'm looking forward to hearing from you soon.