[WIP] [GSOC] GGUF Importer #27177

Draft

nklskyoy wants to merge 16 commits into opencv:5.x from nklskyoy:llm-prototype

nklskyoy commented Mar 31, 2025 (edited)

Draft PR for #27176.

Implemented a proof-of-concept GGUF parser that can create attention blocks with the current AttentionLayer.
Tested creation and the forward pass of the existing attention layer against ONNX.

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

nklskyoy added 6 commits

March 30, 2025 13:50


          gguf importer: enable vanill attention parsing

132aec6


          expose readNetFromGGUF fn

f45c5f6


          encapsulate buffer logic into separate struct

07f6068


          fix attn_qkv weight matrix name

493a367


          add test fil

d96ae2c


          fix attention parsing

e887303

asmorkalov added the "feature" and "category: dnn" labels

nklskyoy added 8 commits

April 2, 2025 21:10


          gguf importer call netimpl->prepareForInference();

3cb3c69


          enable parsing of 1D tensors

d55d585


          fix read2DMat

2c3560f


          ggufImporter: proper attention init

a8a10bc


          getTensor fix exception on unsup. Mat shape

87aecd0


          test against single-block pytorch attention

0fcb34c


          Test_GGUFImporter cleanup

184a3da


          major code refactor

538c99b

nklskyoy changed the title from "Llm prototype" to "[WIP] [GSOC] GGUF Importer"

nklskyoy added 2 commits

April 7, 2025 15:37


          code refactor 2

e71705e


          test input naming

f4ce156

asmorkalov reviewed

modules/dnn/src/llm/gguf_buffer.cpp

+              GGUFBuffer::GGUFBuffer(const std::string & fileName){
+                  std::ifstream file(fileName, std::ios::binary | std::ios::ate);
+                  if (!file.is_open()) {
+                      throw std::runtime_error("Could not open file: ");

Contributor

asmorkalov Apr 22, 2025

Please use CV_Error, CV_Assert, etc. See https://docs.opencv.org/5.x/db/de0/group__core__utils.html#ga5b48c333c777666e076bd7052799f891

modules/dnn/src/llm/gguf_buffer.cpp

Comment on lines +20 to +22

+                  if (!file.read(reinterpret_cast<char*>(buf.data()), size)) {
+                      throw std::runtime_error("Error reading file: " );
+                  }

Contributor

asmorkalov Apr 22, 2025

The same for error handling.

modules/dnn/test/test_gguf.cpp

Comment on lines +55 to +59

+              GTEST_API_ int main(int argc, char **argv)
+              {
+                  testing::InitGoogleTest(&argc, argv);
+                  return RUN_ALL_TESTS();
+              }

Contributor

asmorkalov Apr 22, 2025

It should not be here. We have a generic test main to handle extra options and the environment.

modules/dnn/test/test_gguf.cpp

+              {
+                  // Locate the GGUF file; this should be in the directory dnn/gguf/ (adjust the filename as needed)
+                  std::string ggufModelPath = _tf("mha.gguf", true);
+                  std::string onnxModelPath = _tf("mha.onnx", true);

Contributor

asmorkalov Apr 22, 2025

It makes sense to save the model output as a .pb file, produced with ONNX Runtime for example, rather than rely on the OpenCV implementation of ONNX. That is more reliable and makes regressions easier to catch.

modules/dnn/src/llm/gguf_parser.cpp

Comment on lines +144 to +147

+                  std::vector<int> dims;
+                  for (uint32_t i = 0; i < dim_count; ++i) {
+                      dims.push_back(reader.readSingleValueInt<int64_t>());
+                  }

Contributor

asmorkalov Apr 22, 2025

It's more efficient to use std::vector<int> dims(dim_count); and assign values, rather than call push_back and trigger reallocations.
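
The suggestion above can be sketched as follows. `FakeDimReader` is a hypothetical stand-in for the PR's reader (whose `readSingleValueInt<int64_t>()` is not reproduced here), and the range check before narrowing to `int` is an addition of this sketch, not something the review asked for.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical reader: hands out successive 64-bit dimension values.
class FakeDimReader {
public:
    explicit FakeDimReader(std::vector<int64_t> values)
        : values_(std::move(values)), pos_(0) {}
    int64_t next() { return values_.at(pos_++); }
private:
    std::vector<int64_t> values_;
    std::size_t pos_;
};

// Size the vector once, then assign by index: one allocation,
// no reallocations from repeated push_back.
std::vector<int> readDims(FakeDimReader& reader, uint32_t dim_count)
{
    std::vector<int> dims(dim_count);
    for (uint32_t i = 0; i < dim_count; ++i) {
        const int64_t d = reader.next();
        // Extra safety (assumption of this sketch): reject values
        // that would not survive the int64_t -> int narrowing.
        if (d < 0 || d > std::numeric_limits<int>::max())
            throw std::out_of_range("dimension does not fit in int");
        dims[i] = static_cast<int>(d);
    }
    return dims;
}
```

Calling `dims.reserve(dim_count)` before the original `push_back` loop would avoid the reallocations as well; sizing the vector up front is simply the variant the reviewer named.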

modules/dnn/src/llm/gguf_parser.cpp

Comment on lines +154 to +156

+                  if (tensor.type != GGML_TYPE_F32) {
+                      throw std::runtime_error("Unsupported tensor type: " + std::to_string(tensor.type));
+                  }

Contributor

asmorkalov Apr 22, 2025

Use CV_Assert and CV_Check

modules/dnn/src/llm/gguf_parser.cpp

+                      tensor.role = get_tensor_role(layerName);
+                      tensor.n_block = blockn;
+                  } else {
+                      throw std::runtime_error("Invalid tensor name format: " + tensor_name);

Contributor

asmorkalov Apr 22, 2025

CV_Error

modules/dnn/src/llm/gguf_parser.cpp

Comment on lines +67 to +68

		if (t.size() == 0) {
		throw std::runtime_error("No input tensors found");

Contributor

asmorkalov Apr 22, 2025

CV_Assert

modules/dnn/src/llm/gguf_parser.cpp

Comment on lines +54 to +56

+                  if (t.size() == 0) {
+                      throw std::runtime_error("No input tensors found");
+                  }

Contributor

asmorkalov Apr 22, 2025

CV_Assert

modules/dnn/src/llm/gguf_parser.cpp

Comment on lines +43 to +44

		throw std::runtime_error(
		"Unsupported tensor dimension: " + std::to_string(dims.size()));

Contributor

asmorkalov Apr 22, 2025

CV_Error

fengyuentau self-requested a review

April 23, 2025 07:39

Labels

category: dnn feature
