[WIP] [GSOC] GGUF Importer #27177

Draft
wants to merge 16 commits into
base: 5.x

@nklskyoy nklskyoy commented Mar 31, 2025

draft PR for #27176

  • Implemented a proof-of-concept GGUF parser that can create attention blocks with the current AttentionLayer
  • Tested creation and the forward pass of the existing attention layer against ONNX

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@nklskyoy nklskyoy changed the title from "Llm prototype" to "[WIP] [GSOC] GGUF Importer" Apr 6, 2025
GGUFBuffer::GGUFBuffer(const std::string & fileName){
    std::ifstream file(fileName, std::ios::binary | std::ios::ate);
    if (!file.is_open()) {
        throw std::runtime_error("Could not open file: ");

Comment on lines +20 to +22
if (!file.read(reinterpret_cast<char*>(buf.data()), size)) {
    throw std::runtime_error("Error reading file: " );
}

The same for error handling.
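
Both quoted messages end in a colon with nothing appended, so the failing path never reaches the user. Below is a minimal sketch of the same read-whole-file pattern with the file name included in each message. The helper name readFileBytes is hypothetical, and std::runtime_error is used only to keep the sketch free of OpenCV dependencies; the later comments in this review ask for CV_Error and CV_Assert instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not from the PR): reads an entire file into memory.
std::vector<uint8_t> readFileBytes(const std::string& fileName)
{
    // Open positioned at the end so tellg() reports the file size, as the hunk does.
    std::ifstream file(fileName, std::ios::binary | std::ios::ate);
    if (!file.is_open())
        throw std::runtime_error("Could not open file: " + fileName);

    const std::streamoff size = file.tellg();
    if (size < 0)
        throw std::runtime_error("Could not determine size of file: " + fileName);

    std::vector<uint8_t> buf(static_cast<std::size_t>(size));
    file.seekg(0, std::ios::beg);
    if (!file.read(reinterpret_cast<char*>(buf.data()), size))
        throw std::runtime_error("Error reading file: " + fileName);
    return buf;
}
```

With the name in the message, a failed load reports which path was tried without needing a debugger.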

Comment on lines +55 to +59
GTEST_API_ int main(int argc, char **argv)
{
    testing::InitGoogleTest(&argc, argv);
    return RUN_ALL_TESTS();
}

This should not be here. We have a generic test main that handles extra options and the environment.

{
    // Locate the GGUF file; this should be in the directory dnn/gguf/ (adjust the filename as needed)
    std::string ggufModelPath = _tf("mha.gguf", true);
    std::string onnxModelPath = _tf("mha.onnx", true);

It makes sense to save the model output as a .pb file produced with, for example, ONNX Runtime, rather than rely on the OpenCV implementation of ONNX. That is more reliable and makes regressions easier to catch.

Comment on lines +144 to +147
std::vector<int> dims;
for (uint32_t i = 0; i < dim_count; ++i) {
    dims.push_back(reader.readSingleValueInt<int64_t>());
}

It's more efficient to use std::vector<int> dims(dim_count); and assign values, rather than call push_back and trigger reallocations.
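
A minimal sketch of that suggestion, assuming a reader that exposes readSingleValueInt<T>() as in the hunk. BufferReader and readDims are hypothetical stand-ins written for this example, not the PR's classes. The range check before the cast is an addition of the sketch: the hunk stores an int64_t into a vector of int without one.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the PR's reader; only the method name comes from the hunk.
class BufferReader {
public:
    explicit BufferReader(const std::vector<uint8_t>& bytes) : bytes_(bytes) {}

    template <typename T>
    T readSingleValueInt()
    {
        if (bytes_.size() - pos_ < sizeof(T))
            throw std::out_of_range("read past end of buffer");
        T value;
        // Assumes the host byte order matches the byte order of the buffer.
        std::memcpy(&value, bytes_.data() + pos_, sizeof(T));
        pos_ += sizeof(T);
        return value;
    }

private:
    const std::vector<uint8_t>& bytes_;
    std::size_t pos_ = 0;
};

std::vector<int> readDims(BufferReader& reader, uint32_t dim_count)
{
    std::vector<int> dims(dim_count);  // single allocation, sized up front
    for (uint32_t i = 0; i < dim_count; ++i) {
        const int64_t d = reader.readSingleValueInt<int64_t>();
        // Addition of this sketch: reject values that do not fit in int.
        if (d < 0 || d > std::numeric_limits<int>::max())
            throw std::range_error("dimension does not fit in int");
        dims[i] = static_cast<int>(d);  // assign instead of push_back
    }
    return dims;
}
```

Calling dims.reserve(dim_count) and keeping push_back would avoid the reallocations as well; sizing the vector up front matches the wording of the comment.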

Comment on lines +154 to +156
if (tensor.type != GGML_TYPE_F32) {
    throw std::runtime_error("Unsupported tensor type: " + std::to_string(tensor.type));
}

Use CV_Assert and CV_Check

    tensor.role = get_tensor_role(layerName);
    tensor.n_block = blockn;
} else {
    throw std::runtime_error("Invalid tensor name format: " + tensor_name);

CV_Error

Comment on lines +67 to +68
if (t.size() == 0) {
    throw std::runtime_error("No input tensors found");

CV_Assert

Comment on lines +54 to +56
if (t.size() == 0) {
    throw std::runtime_error("No input tensors found");
}

CV_Assert

Comment on lines +43 to +44
throw std::runtime_error(
    "Unsupported tensor dimension: " + std::to_string(dims.size()));

CV_Error

@fengyuentau fengyuentau self-requested a review April 23, 2025 07:39