add MATCH_CUDA_MINOR_VERSION, resolve #26965 #26966

braindevices · 2025-02-23T14:29:12Z

Fix OpenCVConfig.cmake when -DMATCH_CUDA_MINOR_VERSION=ON.
Add MATCH_CUDA_MINOR_VERSION: when -DMATCH_CUDA_MINOR_VERSION=OFF, OpenCVConfig.cmake only checks that the major version of the found CUDA toolkit matches the expected one.

The CUDA SDK actually has a stable API within a major version, so we do not even need to depend on the minor version. To keep the old behaviour, though, MATCH_CUDA_MINOR_VERSION defaults to ON.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

asmorkalov · 2025-02-24T07:14:17Z

cc @cudawarped

asmorkalov · 2025-02-24T10:51:33Z

cmake/OpenCVFindLibsPerf.cmake

+    else()
+        # Do not match minor: range is [major, <major + 1>)
+        math(EXPR new_major "${major} + 1")
+        set(${lower_bound} "${major}" PARENT_SCOPE)
+        set(${upper_bound} "${new_major}" PARENT_SCOPE)
+    endif()


For the case when match_minor is off, I propose using the range major.minor...major+1.0. NVIDIA may introduce backward-compatible changes, but not forward-compatible ones.
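As a rough illustration of the range proposed here, the script below derives the half-open interval [major.minor, major+1.0) from a version string. The function name `sketch_cuda_version_range` and its arguments are hypothetical and not taken from the PR; the script can be run with `cmake -P`.

```cmake
cmake_minimum_required(VERSION 3.10)

# Hypothetical helper (not part of the PR): compute the range
# [major.minor, major+1.0) from a CUDA version string such as "12.6.77".
function(sketch_cuda_version_range version lower_bound upper_bound)
  if(NOT version MATCHES "^([0-9]+)\\.([0-9]+)")
    message(FATAL_ERROR "Unexpected CUDA version: ${version}")
  endif()
  set(major "${CMAKE_MATCH_1}")
  set(minor "${CMAKE_MATCH_2}")
  math(EXPR next_major "${major} + 1")
  # Lower bound keeps the minor version; upper bound is the next major release.
  set(${lower_bound} "${major}.${minor}" PARENT_SCOPE)
  set(${upper_bound} "${next_major}.0" PARENT_SCOPE)
endfunction()

# For "12.6.77" this yields lower = 12.6 and upper = 13.0.
sketch_cuda_version_range("12.6.77" lower upper)
message(STATUS "accept CUDA versions in [${lower}, ${upper})")
```

With such a range, a toolkit newer than the one used at build time is accepted, while an older minor version of the same major release is rejected.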

braindevices · 2025-02-24

That is why match_minor is ON by default, but the user can choose to ignore the minor version.

Personally, I have never encountered an API incompatibility during a minor version update. The case where an application breaks after a CUDA toolkit update is PTX, which is really a system maintenance problem: you simply should not update the CUDA toolkit without updating the driver.

The other kind of breakage comes from bugs introduced in a particular minor version, but that is not API breakage: the library can still link against whatever minor version is installed. If someone knows that a certain version produces wrong results, they should simply choose a different version when compiling their code; as a library provider we should not care.

asmorkalov · 2025-02-24T10:54:22Z

cmake/templates/OpenCVConfig-CUDA.cmake.in

@@ -10,10 +12,10 @@ set(OpenCV_CUDNN_VERSION    "@CUDNN_VERSION@")
 set(OpenCV_USE_CUDNN        "@HAVE_CUDNN@")

 if(NOT CUDA_FOUND)
-  find_host_package(CUDA ${OpenCV_CUDA_VERSION} EXACT REQUIRED)
+  find_host_package(CUDA ${OpenCV_CUDA_VERSION_MIN} EXACT REQUIRED)


From the CMake documentation:

> The EXACT option requests that the version be matched exactly. This option is incompatible with the specification of a version range.

The option is redundant for the softer matching case, if the minor version is included.

braindevices · 2025-02-24

As I tested, find_host_package(CUDA 12 EXACT REQUIRED) finds any 12.x; it does not care about the minor version.
find_host_package(CUDA 12.6 EXACT REQUIRED) finds any 12.6; it does not care about the patch version.
The newly introduced get_version_range does not give you the minor version if match_minor==OFF, and gives you the exact minor version reported by CUDA_VERSION_STRING if match_minor==ON.
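The matching behaviour reported above (only the requested components are compared) can be mimicked in a few lines of plain CMake. This is a sketch of the observed semantics, not the code CMake itself uses; `sketch_exact_match` is a hypothetical helper name. Run it with `cmake -P`.

```cmake
cmake_minimum_required(VERSION 3.12)  # list(SUBLIST) and list(JOIN) need CMake >= 3.12

# Hypothetical helper: compare only as many version components as were requested.
function(sketch_exact_match requested found result)
  string(REPLACE "." ";" req_parts "${requested}")
  string(REPLACE "." ";" found_parts "${found}")
  list(LENGTH req_parts n)
  list(LENGTH found_parts m)
  if(m LESS n)
    # The found version has fewer components than requested: treat as no match.
    set(${result} FALSE PARENT_SCOPE)
    return()
  endif()
  # Truncate the found version to the number of requested components.
  list(SUBLIST found_parts 0 ${n} head)
  list(JOIN head "." truncated)
  if(truncated VERSION_EQUAL requested)
    set(${result} TRUE PARENT_SCOPE)
  else()
    set(${result} FALSE PARENT_SCOPE)
  endif()
endfunction()

sketch_exact_match("12"   "12.6.77" any_minor)    # TRUE: minor is ignored
sketch_exact_match("12.6" "12.6.77" any_patch)    # TRUE: patch is ignored
sketch_exact_match("12.6" "12.8.0"  other_minor)  # FALSE: minor differs
message(STATUS "${any_minor} ${any_patch} ${other_minor}")
```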

cudawarped · 2025-02-24T12:07:19Z

> The CUDA SDK actually has a stable API within a major version, so we do not even need to depend on the minor version. To keep the old behaviour, though, MATCH_CUDA_MINOR_VERSION defaults to ON.

Looking at the compatibility guide, doesn't it depend on whether we are linking statically or dynamically?

For shared libraries, doesn't

> If the application relies on dynamic linking for libraries, then the system should have the right version of such libraries as well.

imply there is no guarantee unless your application has access to the exact same version of the SDK libraries it was built against (i.e. CUDA_VERSION_STRING VERSION_EQUAL OpenCV_CUDA_VERSION)?

For static libraries, doesn't everything depend on the driver, which has nothing to do with the CUDA toolkit? Is the CUDA toolkit even required on Linux when linking statically?

CUDA Compatibility guarantees allow for upgrading only certain components:

Backwards compatibility ensures that a newer NVIDIA driver can be used with an older CUDA Toolkit. This is implicit and most simple way of doing upgrades.

Minor version and forward compatibility ensure that an older NVIDIA driver can be used with a newer CUDA Toolkit.

braindevices · 2025-02-24T12:29:10Z

> For shared libraries, doesn't
>
> > If the application relies on dynamic linking for libraries, then the system should have the right version of such libraries as well.

Did you check the section on minor version compatibility? For CUDA runtime >= 11 it says:

> CUDA 11 and Later Defaults to Minor Version Compatibility
>
> Minor version compatibility has another benefit that offers flexibility in the use and deployment of libraries. Applications that use libraries that support minor version compatibility can be deployed on systems with a different version of the toolkit and libraries without recompiling the application for the difference in the library version. This holds true for both older and newer versions of the libraries provided they are all from the same major release family. Note that libraries themselves have interdependencies that should be considered. For example, each cuDNN version requires a certain version of cuBLAS.

So I disagree. In my experience too, we build both statically and dynamically linked libs, and they are in fact minor-version compatible.

> For static libraries, doesn't everything depend on the driver, which has nothing to do with the CUDA toolkit? Is the CUDA toolkit even required on Linux when linking statically?

Even though we compile the OpenCV library as static, it still contains undefined symbols from the static CUDA library, so as a library provider we do need to expose the related static CUDA library via an interface CMake target.

To emphasize again: by default it matches the minor version. Currently, when CUDA is used as a first-class language, it matches the patch version, which is unacceptable: it requires rebuilding OpenCV too often on an up-to-date system.

I also encourage experienced users to try MATCH_CUDA_MINOR_VERSION=OFF on their own, to see whether there really is a problem.

cudawarped · 2025-02-24T13:01:07Z

> So I disagree. In my experience too, we build both statically and dynamically linked libs, and they are in fact minor-version compatible.

Is a flag really necessary to keep the old behaviour? A flag would make sense if

> Applications that use libraries that support minor version compatibility can be deployed on systems with a different version of the toolkit and libraries without recompiling the application for the difference in the library version.

implies that libraries such as NPP, cuBLAS and cuFFT may or may not support minor version compatibility, and that whether they do depends on the CUDA Toolkit version (NVRTC supports minor version compatibility from CUDA 11.3 onwards). If you are 100% sure this is always the case, wouldn't it be better to force the behaviour?

> To emphasize again: by default it matches the minor version. Currently, when CUDA is used as a first-class language, it matches the patch version, which is unacceptable: it requires rebuilding OpenCV too often on an up-to-date system.

I agree.

cudawarped · 2025-02-24T13:31:45Z

> Even though we compile the OpenCV library as static, it still contains undefined symbols from the static CUDA library, so as a library provider we do need to expose the related static CUDA library via an interface CMake target.

Which static CUDA library are you referring to?

My understanding is that if we depend only on cudart_static, then we don't need the CUDA toolkit, only a compatible driver, because libcuda.so from the driver implements all the undefined symbols? I was assuming that this is also the case for the static versions of NPP, cuBLAS and cuFFT if they are built with vanilla CUDA?

braindevices · 2025-02-24T14:33:29Z

> My understanding is that if we depend only on cudart_static, then we don't need the CUDA toolkit, only a compatible driver, because libcuda.so from the driver implements all the undefined symbols? I was assuming that this is also the case for the static versions of NPP, cuBLAS and cuFFT if they are built with vanilla CUDA?

Let's try an example from this issue:
#26963

If I build a CUDA-enabled OpenCV that only needs cudart, then libopencv_core.a actually contains undefined symbols that can only be resolved by linking to libcudart_static.a and libopencv_cudev.a. A static library is not self-contained; it only packs the symbols implemented by its own code.

So when we use opencv_core in our product executable, we still need libcudart_static.a at link time. The final executable then does not need any cudart library to run; you can distribute it without one.
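A consumer project in that situation might look roughly like the sketch below. It assumes a static, CUDA-enabled OpenCV build is installed and that CMake >= 3.17 is available for the `CUDAToolkit` find module; the project name, target name and `main.cpp` are hypothetical. Whether the explicit `CUDA::cudart_static` entry is needed depends on what the installed OpenCV config already exports.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit ships with CMake >= 3.17
project(opencv_static_consumer LANGUAGES CXX)

# Static OpenCV built with CUDA: libopencv_core.a carries undefined cudart symbols.
find_package(OpenCV REQUIRED COMPONENTS core)

# Provides the imported target CUDA::cudart_static (libcudart_static.a).
find_package(CUDAToolkit REQUIRED)

add_executable(app main.cpp)  # main.cpp is a placeholder source file

# The static CUDA runtime is needed at link time only; the resulting
# executable does not load a cudart shared library at run time.
target_link_libraries(app PRIVATE ${OpenCV_LIBS} CUDA::cudart_static)
```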

I don't know if I have made this clear.

Here you may notice another inconvenience of the current library structure: as long as we build OpenCV with CUDA, libcudart_static.a is still required to build the executable, even if the executable does not really use anything CUDA-related. Ideally this should be separated, but that is another big topic.

CMakeLists.txt

braindevices · 2025-02-24T17:28:02Z

> Is a flag really necessary to keep the old behaviour? A flag would make sense if

I am just trying not to break anything.

> implies that libraries such as NPP, cuBLAS and cuFFT may or may not support minor version compatibility, and that whether they do depends on the CUDA Toolkit version (NVRTC supports minor version compatibility from CUDA 11.3 onwards). If you are 100% sure this is always the case, wouldn't it be better to force the behaviour?

I think it means that system package maintainers need to be careful not to mix cuBLAS and cuDNN libs. It is an internal matter of the CUDA toolkit. I am not sure about NPP and cuFFT so far, because I have not used them intensively. But at least they link fine, and according to the documentation they should be minor-version compatible.

The only question I have is: can CUDA toolkits of the same major version ship different major versions of cuFFT or NPP?
For example, so far I have always seen the same major version of libcufft:

find /usr/local/cuda* -name 'libcufft.so.*.*'
/usr/local/cuda-12.3/targets/x86_64-linux/lib/libcufft.so.11.0.12.1
/usr/local/cuda-12.8/targets/x86_64-linux/lib/libcufft.so.11.3.3.41

But if it is possible to have libcufft.so.10 for 12.1 or libcufft.so.12 for 12.9, then I guess it won't be compatible. So far I have never seen this. In fact, if you check the PyPI distribution of the CUDA libraries, they do not have a minor version. I also noticed that cuDNN does not specify a CUDA minor version at all: you can only choose cuda11 or cuda12, which may indicate that the minor version indeed does not matter.

I am actually expecting your team to tell me whether it could be different, because it is your code that makes it depend on the minor version, so I suppose you have a special reason for this. Did you observe breakage at some point?

Also, static linking to CUDA prevents bug fixes via updating the CUDA libs, so I don't really like it.

Dynamic linking, on the other hand, should not have any problem.
So in the next PR I will try to fix https://github.com/opencv/opencv_contrib/blob/ce3c6681c9bf0e5cf46704a6ce0883078bdba074/modules/cudaarithm/CMakeLists.txt#L15

CUDA::cudart_static should be CUDA::cudart${CUDA_LIB_EXT}

Further, we could maybe allow building libopencv_core.a linked against libcudart.so instead of libcudart.a via some flag like SHARED_CUDA_LIBS.
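One way the suffix selection could look is sketched below. `SHARED_CUDA_LIBS` is only the flag name floated in this comment and does not exist as an OpenCV option; the interface target `cuda_runtime_choice` is likewise hypothetical. `CUDA::cudart` and `CUDA::cudart_static` are the imported targets provided by CMake's `CUDAToolkit` find module.

```cmake
cmake_minimum_required(VERSION 3.17)  # FindCUDAToolkit ships with CMake >= 3.17
project(cudart_choice_sketch LANGUAGES CXX)

# Hypothetical flag, named after the suggestion in this thread.
option(SHARED_CUDA_LIBS "Link against the shared CUDA runtime" OFF)

find_package(CUDAToolkit REQUIRED)

# Pick the target-name suffix once, then reuse it for every CUDA library.
if(SHARED_CUDA_LIBS)
  set(CUDA_LIB_EXT "")         # CUDA::cudart  -> libcudart.so
else()
  set(CUDA_LIB_EXT "_static")  # CUDA::cudart_static -> libcudart_static.a
endif()

# Interface target that forwards the chosen runtime to whoever links it.
add_library(cuda_runtime_choice INTERFACE)
target_link_libraries(cuda_runtime_choice INTERFACE CUDA::cudart${CUDA_LIB_EXT})
```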

asmorkalov · 2025-03-10T14:14:45Z

The PR was discussed at the OpenCV core team meeting. The team proposes not to touch the OpenCV build scripts at all, but to modify the OpenCV CMake config file template instead. The config may change the CUDA/cuDNN/cuBLAS/etc. search behaviour based on a user-provided variable.

For example, an OpenCV user finds OpenCV like this:

set(OPENCV_STRONG_CUDA_VERSION_CHECK TRUE) # or vice-versa
find_package(OpenCV REQUIRED)

The OpenCV config then handles OPENCV_STRONG_CUDA_VERSION_CHECK during the library search.
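A config-side check along these lines might look like the fragment below. It is only a sketch of the idea, not the team's implementation: it assumes OpenCV_CUDA_VERSION (set by the existing template) holds the toolkit version OpenCV was built with, assumes the strict check is the default, and uses CMake's `CUDAToolkit` find module (CMake >= 3.17) rather than find_host_package.

```cmake
# Hypothetical fragment for a package config file; not the actual OpenCV template.
if(NOT DEFINED OPENCV_STRONG_CUDA_VERSION_CHECK)
  set(OPENCV_STRONG_CUDA_VERSION_CHECK TRUE)  # assumed default: strict check
endif()

find_package(CUDAToolkit REQUIRED)

# Strict: compare "major.minor". Relaxed: compare "major" only.
if(OPENCV_STRONG_CUDA_VERSION_CHECK)
  set(_ocv_cuda_regex "^[0-9]+\\.[0-9]+")
else()
  set(_ocv_cuda_regex "^[0-9]+")
endif()
string(REGEX MATCH "${_ocv_cuda_regex}" _ocv_cuda_built "${OpenCV_CUDA_VERSION}")
string(REGEX MATCH "${_ocv_cuda_regex}" _ocv_cuda_found "${CUDAToolkit_VERSION}")

if(NOT _ocv_cuda_built STREQUAL _ocv_cuda_found)
  message(FATAL_ERROR
    "OpenCV was built with CUDA ${OpenCV_CUDA_VERSION}, "
    "but CUDA ${CUDAToolkit_VERSION} was found")
endif()
```

In neither mode does the patch version take part in the comparison.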

braindevices added 3 commits February 23, 2025 14:57

add MATCH_CUDA_MINOR_VERSION, to maintain the old behavior default to…

097512e

… YES; when turn it off it only depends on the major cuda version; this also fix the wrong behavior: need to match cuda patch version when ENABLE_CUDA_FIRST_CLASS_LANGUAGE=ON

Merge branch '4.x' of https://github.com/opencv/opencv.git into hotfi…

a50e182

…x/fix-cuda-version-dependency

remove trailing spaces

1fe33b7

opencv-alalek added category: build/install category: gpu/cuda (contrib) OpenCV 4.0+: moved to opencv_contrib labels Feb 23, 2025

asmorkalov self-requested a review February 24, 2025 07:14

asmorkalov self-assigned this Feb 24, 2025

asmorkalov reviewed Feb 24, 2025

opencv-alalek reviewed Feb 24, 2025

braindevices added 2 commits February 24, 2025 18:37

fix indentation

455fb87

Merge remote-tracking branch 'upstream/4.x' into hotfix/fix-cuda-vers…

815fc45

…ion-dependency

cudawarped mentioned this pull request Mar 18, 2025

Fix CUDA version match to ignore patch version #27093

Closed
