
Back | Next | Contents

Action Recognition

Action recognition classifies the activity, behavior, or gesture occurring over a sequence of video frames. The DNNs typically use image classification backbones with an added temporal dimension. For example, the ResNet18-based pre-trained models use a window of 16 frames. You can also skip frames to lengthen the window of time over which the model classifies actions.

The actionNet object accepts video frames one at a time, buffers them as input to the model, and outputs the class with the highest confidence. actionNet can be used from Python and C++.

As examples of using the actionNet class, there are sample programs for C++ (actionnet.cpp) and Python (actionnet.py).
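Both samples follow the same basic loop. Below is a condensed Python sketch of that loop (argument parsing and overlay positioning from the full sample are omitted; the API calls assume the jetson-inference Python bindings):

# condensed sketch of the actionnet.py sample loop
from jetson_inference import actionNet
from jetson_utils import videoSource, videoOutput, cudaFont

net = actionNet("resnet-18")         # load the pre-trained model
input = videoSource("/dev/video0")   # camera, video file, or stream
output = videoOutput("display://0")  # window, file, or stream
font = cudaFont()

while True:
    img = input.Capture()            # capture the next frame
    if img is None:                  # capture timeout
        continue

    # buffer the frame and classify the current window of frames
    class_id, confidence = net.Classify(img)
    class_desc = net.GetClassDesc(class_id)

    # overlay the result on the frame and render it
    font.OverlayText(img, text=f"{confidence * 100:.1f}% {class_desc}",
                     x=5, y=5, color=font.White, background=font.Gray40)
    output.Render(img)

    if not input.IsStreaming() or not output.IsStreaming():
        break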

Running the Example

To run action recognition on a live camera stream or video, pass in a device or file path from the Camera Streaming and Multimedia page.

# C++
$ ./actionnet /dev/video0           # V4L2 camera input, display output (default) 
$ ./actionnet input.mp4 output.mp4  # video file input/output (mp4, mkv, avi, flv)

# Python
$ ./actionnet.py /dev/video0           # V4L2 camera input, display output (default) 
$ ./actionnet.py input.mp4 output.mp4  # video file input/output (mp4, mkv, avi, flv)

Command-Line Arguments

These optional command-line arguments can be used with actionnet/actionnet.py:

  --network=NETWORK    pre-trained model to load, one of the following:
                           * resnet-18 (default)
                           * resnet-34
  --model=MODEL        path to custom model to load (.onnx)
  --labels=LABELS      path to text file containing the labels for each class
  --input-blob=INPUT   name of the input layer (default is 'input')
  --output-blob=OUTPUT name of the output layer (default is 'output')
  --threshold=CONF     minimum confidence threshold for classification (default is 0.01)
  --skip-frames=SKIP   how many frames to skip between classifications (default is 1)
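For example, to load the larger ResNet-34 model and skip two frames between classifications (the input path and skip value here are just illustrative):

$ ./actionnet.py --network=resnet-34 --skip-frames=2 /dev/video0

A custom ONNX model can be loaded the same way (my_model.onnx and my_labels.txt are placeholder names):

$ ./actionnet.py --model=my_model.onnx --labels=my_labels.txt input.mp4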

By default, the model will process every other frame to lengthen the window of time over which actions are classified. You can change this with the --skip-frames parameter (using --skip-frames=0 will process every frame).
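To put numbers on that: assuming a 30 FPS stream and the 16-frame model window, the default --skip-frames=1 spans roughly 32 frames of video (about one second per classification), while --skip-frames=0 would shorten the window to about half a second.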

Pre-trained Action Recognition Models

Below are the pre-trained action recognition models available, along with the associated --network argument to actionnet used for loading them:

Model                      CLI argument   Classes
Action-ResNet18-Kinetics   resnet-18      1040
Action-ResNet34-Kinetics   resnet-34      1040

The default is resnet-18. These models were trained on the Kinetics 700 and Moments in Time datasets (see here for the list of class labels).
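The same names can be passed to the constructor when loading a model from code (a minimal sketch, assuming the Python bindings):

from jetson_inference import actionNet

net = actionNet("resnet-34")  # loads Action-ResNet34-Kinetics instead of the default resnet-18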

Next | Background Removal
Back | Pose Estimation with PoseNet

© 2016-2021 NVIDIA | Table of Contents
