ONNX ViT model fails with DNN_BACKEND_INFERENCE_ENGINE in OpenCV, but works in all other cases #27451

@cesarpgouveia

Description

System Information

OS: Windows 10

OpenCV: 4.11.0

OpenVINO: 2024.4.0

Model: ViT (opset 17) exported to ONNX

Input shape: [1, 3, 112, 112]

Detailed description

I'm testing a ViT model exported to ONNX. It runs fine using OpenVINO Runtime (in both Python and C++), and also works in OpenCV when using the DNN_BACKEND_OPENCV backend.
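
For reference, this is roughly how I verified the model through the OpenVINO Runtime C++ API directly (a minimal sketch with a dummy zero input; the model path is a placeholder):

#include <algorithm>
#include <iostream>
#include <memory>

#include <openvino/openvino.hpp>

int main()
{
    ov::Core core;

    // OpenVINO Runtime reads the .onnx file directly, no IR conversion needed.
    std::shared_ptr<ov::Model> model = core.read_model("vit_model.onnx");
    ov::CompiledModel compiled = core.compile_model(model, "CPU");
    ov::InferRequest request = compiled.create_infer_request();

    // Dummy NCHW input matching the model's static shape [1, 3, 112, 112].
    ov::Tensor input(ov::element::f32, ov::Shape{1, 3, 112, 112});
    std::fill_n(input.data<float>(), input.get_size(), 0.0f);

    request.set_input_tensor(input);
    request.infer(); // runs without errors

    std::cout << "Output elements: " << request.get_output_tensor().get_size() << std::endl;
}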

It also works with DNN_BACKEND_INFERENCE_ENGINE if I first convert the ONNX model to OpenVINO IR format (.xml and .bin) using the Model Optimizer.
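
In that case I load the converted pair with cv::dnn::readNetFromModelOptimizer() and keep everything else identical, e.g. (sketch; modelXml, modelBin, and blob are the same names used in the repro below):

cv::dnn::Net irNet = cv::dnn::readNetFromModelOptimizer(modelXml, modelBin);
irNet.setPreferableBackend(cv::dnn::DNN_BACKEND_INFERENCE_ENGINE);
irNet.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
irNet.setInput(blob);
cv::Mat out = irNet.forward(); // works fine with the IR model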

The issue happens when I use the ONNX file directly with cv::dnn::readNetFromONNX() and set the backend to DNN_BACKEND_INFERENCE_ENGINE. In that case, I get this error:

[ INFO:0@11.880] global op_inf_engine.cpp:133 cv::dnn::detectArmPlugin_ CPU plugin: 13th Gen Intel(R) Core(TM) i7-1370P
OpenCV(4.12.0-dev) Error: Assertion failed (sz == src.get_size()) in cv::dnn::InfEngineNgraphNet::init, file C:\Users\cesar.gouveia\Projects\OpenCV-Package\opencv_mirror\modules\dnn\src\ie_ngraph.cpp, line 256
OpenCV: terminate handler is called! The last OpenCV error is:
OpenCV(4.12.0-dev) Error: Assertion failed (sz == src.get_size()) in cv::dnn::InfEngineNgraphNet::init, file C:\Users\cesar.gouveia\Projects\OpenCV-Package\opencv_mirror\modules\dnn\src\ie_ngraph.cpp, line 256

OpenCV(4.12.0-dev) Error: Assertion failed (sz == src.get_size()) in InfEngineNgraphNet::init

Tested with OpenCV 4.11.0 and OpenVINO 2024.4.0 (the log above is from a 4.12.0-dev build; both fail the same way). The model uses opset 17 and a static input shape of [1, 3, 112, 112]. I made sure the first dimension is fixed to 1 rather than a symbolic batch_size, because I know OpenVINO has limitations with dynamic batch dimensions.

I need the model to remain in ONNX format; converting to IR is not a long-term solution for us. The failure is specific to the combination of OpenCV's Inference Engine backend with an ONNX file. Please fix this or confirm whether it is a known limitation.

You can find the model here: https://we.tl/t-u7vryBKkOw

A reproducible example is included below.

Thank you!

Steps to reproduce

#include "pch.h"

#include <iostream>
#include <chrono>
#include <fstream>

#include <opencv2/highgui.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/core.hpp>

std::string imageFilename = "C:/Users/cesar.gouveia/Downloads/1708014478513.jpeg";
std::string modelFilename = "C:/Users/cesar.gouveia/Downloads/vit_model.onnx";
std::string modelXml = "C:/Users/cesar.gouveia/Downloads/vit_ir/vit_model.xml";
std::string modelBin = "C:/Users/cesar.gouveia/Downloads/vit_ir/vit_model.bin";
cv::dnn::Backend targetBackend = cv::dnn::DNN_BACKEND_INFERENCE_ENGINE;
cv::dnn::Target targetDevice = cv::dnn::DNN_TARGET_CPU;

unsigned int numInferences = 100;
cv::Size modelInputSize = cv::Size(112, 112);
cv::ImreadModes imReadMode = cv::IMREAD_COLOR;
std::vector<std::string> inputLayerNames = {"input"};
bool swapRBChannels = false;
unsigned int numChannels = 3;

int main()
{
    // Load the ONNX model directly and select the backend/target under test.
    cv::dnn::Net net = cv::dnn::readNetFromONNX(modelFilename);

    net.setPreferableBackend(targetBackend);
    net.setPreferableTarget(targetDevice);

    cv::Mat img = cv::imread(imageFilename, imReadMode);
    CV_Assert(!img.empty());

    cv::Mat imgResized;
    cv::resize(img, imgResized, modelInputSize);

    // Build a [1, 3, 112, 112] CV_32F blob (no scaling, no mean subtraction).
    std::vector<cv::Mat> imgBatch = { imgResized };
    cv::Mat blob = cv::dnn::blobFromImages(imgBatch, 1.0, cv::Size(), cv::Scalar(), swapRBChannels, false, CV_32F);

    std::cout << "Blob size: " << blob.size[0] << "x" << blob.size[1] << "x" << blob.size[2] << "x" << blob.size[3] << std::endl;

    net.setInput(blob);

    // Alternative for models with named inputs:
    // for (const auto& inputLayerName : inputLayerNames)
    //     net.setInput(blob, inputLayerName);

    std::vector<cv::String> unconnectedOutLayerNames = net.getUnconnectedOutLayersNames();

    std::vector<cv::Mat> outputs;

    // The first forward() also triggers backend initialization; with
    // DNN_BACKEND_INFERENCE_ENGINE this is where the assertion fires.
    std::chrono::high_resolution_clock::time_point timeLoadModelPlusInference1 = std::chrono::high_resolution_clock::now();

    net.forward(outputs, unconnectedOutLayerNames);

    std::chrono::high_resolution_clock::time_point timeLoadModelPlusInference2 = std::chrono::high_resolution_clock::now();

    std::chrono::duration<double, std::milli> ms_doubleTimeLoadModelPlusInference = timeLoadModelPlusInference2 - timeLoadModelPlusInference1;

    std::cout << "Execution time (load model + inference): " << ms_doubleTimeLoadModelPlusInference.count() << " ms" << std::endl;

    std::chrono::high_resolution_clock::time_point time1 = std::chrono::high_resolution_clock::now();

    try
    {
        for (size_t i = 0; i < numInferences; i++)
            net.forward(outputs, unconnectedOutLayerNames);
    }
    catch (const std::exception& ex)
    {
        std::cout << ex.what() << std::endl;
    }
    
    std::chrono::high_resolution_clock::time_point time2 = std::chrono::high_resolution_clock::now();

    std::chrono::duration<double, std::milli> ms_double = time2 - time1;

    std::cout << "Execution time inference only: " << ms_double.count() / numInferences << std::endl; // in ms
}
