Edge AI in Action: Technologies and Applications
In this tutorial, we present how to develop and deploy models for edge AI, using hand gesture recognition and object detection applications as running examples.


Summary
Edge AI refers to the deployment of artificial intelligence models directly on edge devices—such as smartphones, cameras, sensors, drones, and wearables—enabling them to perform inference locally without continuous reliance on the cloud. This paradigm offers significant benefits, including reduced latency and greater responsiveness, enhanced privacy, and improved energy efficiency.
However, building performant AI systems for edge environments introduces unique challenges. Meeting them requires model compression techniques such as quantization, distillation, and pruning, along with runtime optimization and effective orchestration across heterogeneous hardware in hybrid edge–cloud architectures.
In this CVPR 2025 tutorial, we offer a hands-on, practice-oriented guide to designing, optimizing, and deploying deep learning models for edge AI. Focusing on computer vision tasks, we will explore real-world use cases, including hand gesture recognition, object detection, and on-device large language models. The tutorial emphasizes multimodal AI, integrating inputs such as video, images, and text to enable intelligent, interactive systems.
We will demonstrate the use of leading tools and frameworks, including ONNX, Qualcomm SNPE, TensorFlow Lite for Android, vLLM, and Ollama, across a range of hardware platforms, such as Jabra intelligent cameras, Qualcomm dev boards, Android mobile phones, and NVIDIA Jetson AGX Orin. Attendees will gain actionable insights into the end-to-end pipeline of edge AI, from model design to real-time deployment.
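As a preview of the conversion pipelines covered in the modules below, here is a minimal sketch of exporting a PyTorch model to ONNX and sanity-checking it with ONNX Runtime. The backbone, file name, and input shape are illustrative assumptions rather than the tutorial's actual models:

```python
# Minimal sketch: export a PyTorch model to ONNX and sanity-check it with
# ONNX Runtime. The backbone and input shape are illustrative assumptions.
import numpy as np
import torch
import torchvision
import onnxruntime as ort

# Any torch.nn.Module works here; we use a small torchvision backbone.
model = torchvision.models.mobilenet_v3_small(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# Export with a fixed opset; dynamic axes allow a variable batch size.
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Verify that ONNX Runtime reproduces the PyTorch outputs.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_out = session.run(None, {"input": dummy_input.numpy()})[0]
torch_out = model(dummy_input).detach().numpy()
print("max abs diff:", np.abs(onnx_out - torch_out).max())
```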

Topics
We designed this CVPR 2025 tutorial for researchers, engineers, and practitioners seeking to bring AI capabilities to the edge. While prior experience with edge deployment is not required, attendees should have a foundational understanding of computer vision and deep learning. The session will provide both conceptual insights and practical demonstrations, with a strong emphasis on real-world scenarios and hands-on examples. We structured the tutorial into four key modules:
- Introduction to Edge AI: Explore the motivation, core principles, and challenges of running AI at the edge. This module compares edge AI to cloud AI, examining the trade-offs in latency, privacy, power consumption, and performance. Real-world applications—ranging from mobile devices to collaborative cameras—will illustrate the growing relevance and impact of edge AI systems.
- Model Deployment for Edge AI: Learn the technical foundations for deploying AI models across a wide range of edge hardware. This module covers model conversion pipelines, inference optimization, and benchmarking across platforms. We will demonstrate the use of TensorFlow Lite, ONNX Runtime, and Qualcomm SNPE to enable fast and efficient deployment of deep learning models on resource-constrained devices; a minimal TensorFlow Lite conversion sketch follows this list.
- Accelerating Edge AI with Qualcomm AI Hub: Discover how to leverage the Qualcomm AI Hub for rapid prototyping and deployment. This module presents the ecosystem of pre-optimized AI models and SDKs for Snapdragon platforms, with a focus on real-time computer vision tasks. We will walk through model quantization, runtime selection (CPU, GPU, DSP, NPU), and on-device benchmarking with the SNPE SDK and Qualcomm AI Hub; an AI Hub client sketch follows this list.
- Resource-Constrained VLM Deployment on Edge AI: This module focuses on techniques for adapting large language and vision-language models (LLMs/VLMs) for real-time inference on NVIDIA Jetson Orin devices. Topics include model compression, INT8/INT4 quantization, vLLM integration, and memory-aware scheduling. Real-world use cases, such as low-latency multimodal reasoning and prompt engineering for vision tasks, will be presented, along with performance metrics and deployment workflows; a vLLM inference sketch follows this list.
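As referenced in the deployment module above, the following is a minimal sketch of one conversion pipeline: converting a Keras model to TensorFlow Lite with post-training dynamic-range quantization. The MobileNetV2 backbone is an illustrative placeholder, not one of the tutorial's YOLO models:

```python
# Minimal sketch: convert a Keras model to TensorFlow Lite with
# post-training dynamic-range quantization. The MobileNetV2 backbone is an
# illustrative placeholder, not one of the tutorial's YOLO models.
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Dynamic-range quantization: weights stored as INT8, activations in float.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)

# Quick on-host smoke test with the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()[0]
print("input shape:", input_details["shape"], "dtype:", input_details["dtype"])
```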
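For the Qualcomm AI Hub module, the sketch below follows the AI Hub Python client's compile-then-profile workflow. It assumes the qai-hub package is installed and an API token has been configured; the backbone and device name are illustrative assumptions:

```python
# Minimal sketch of the Qualcomm AI Hub workflow: compile a model for a
# target Snapdragon device, then profile it on real hardware in the cloud.
# Assumes the qai-hub package is installed and an API token is configured;
# the backbone and device name are illustrative assumptions.
import torch
import torchvision
import qai_hub as hub

# Trace a PyTorch model so AI Hub can compile it for the target device.
torch_model = torchvision.models.mobilenet_v2(weights=None).eval()
example_input = torch.rand(1, 3, 224, 224)
traced_model = torch.jit.trace(torch_model, example_input)

compile_job = hub.submit_compile_job(
    model=traced_model,
    device=hub.Device("Samsung Galaxy S23"),
    input_specs=dict(image=(1, 3, 224, 224)),
)
target_model = compile_job.get_target_model()

# Profile the compiled model on-device to get latency and memory numbers.
profile_job = hub.submit_profile_job(
    model=target_model,
    device=hub.Device("Samsung Galaxy S23"),
)
print(profile_job.download_profile())
```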
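For the VLM/LLM module, the following is a minimal sketch of offline inference through vLLM with a pre-quantized checkpoint. The model ID, quantization mode, and memory settings are assumptions; an NVIDIA Jetson AGX Orin deployment would additionally require a vLLM build for that platform and a model sized to its unified memory:

```python
# Minimal sketch: offline inference through vLLM with a pre-quantized INT4
# (AWQ) checkpoint. The model ID, quantization mode, and memory settings are
# illustrative assumptions; a Jetson deployment would use a vLLM build for
# that platform and a model that fits the device's unified memory.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # example AWQ checkpoint
    quantization="awq",
    gpu_memory_utilization=0.8,  # leave headroom on memory-constrained devices
    max_model_len=2048,          # cap the context length to bound the KV cache
)

sampling = SamplingParams(temperature=0.2, max_tokens=128)
outputs = llm.generate(
    ["Describe what an edge AI camera should report when it detects a hand gesture."],
    sampling,
)
print(outputs[0].outputs[0].text)
```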
Schedule
Below, you can find the slides presented during our tutorial. A link to the video recording will be published later.
9:00 am (CDT) - 9:10 am (CDT) - Opening by Fabricio Batista Narcizo [Slides]
9:10 am (CDT) - 9:25 am (CDT) - Introduction to Edge AI by Elizabete Munzlinger [Slides]
9:25 am (CDT) - 10:50 am (CDT) - Model Deployment for Edge AI by Fabricio Batista Narcizo, Elizabete Munzlinger, and Sai Narsi Reddy Donthi Reddy [Slides]
10:50 am (CDT) - 11:15 am (CDT) - Accelerating Edge AI with Qualcomm AI Hub by Shan Ahmed Shaffi [Slides]
11:15 am (CDT) - 11:50 am (CDT) - Resource-Constrained VLM Deployment on Edge AI by Sai Narsi Reddy Donthi Reddy [Slides]
11:50 am (CDT) - 12:00 pm (CDT) - Closing Remarks and Joint Q&A
Supporting Materials
This section provides additional resources and materials related to our tutorial. You can find code examples, notebooks, trained models, and other relevant information to help you better understand and implement the concepts discussed during the tutorial.
- SNPE Optimizer: This repository contains tools and scripts for optimizing deep learning models, with a focus on YOLO-NAS S and YOLO-hagRID models, for deployment on edge devices. [Link]
- InferSNPE App: An Android application for real-time camera-based inference using Qualcomm’s SNPE (Snapdragon Neural Processing Engine). [Link]
- InferLite App: An Android application for real-time camera-based inference using TensorFlow Lite for Google Pixel devices. [Link]
- YOLO-NAS S Models (SNPE): Models trained on the COCO dataset, optimized for deployment on edge devices using SNPE Optimizer. [Link]
- YOLO-NAS S Models (TFLite): Models trained on the COCO dataset, optimized for deployment on edge devices using TensorFlow Lite. [Link]
- YOLO-hagRID Models (SNPE): Models trained on the hagRID dataset, optimized for deployment on edge devices using SNPE Optimizer. [Link]
- YOLO-hagRID Models (TFLite): Models trained on the hagRID dataset, optimized for deployment on edge devices using TensorFlow Lite. [Link]
Tutorial Video
Watch the recording of our previous tutorial, Edge AI in Action: Practical Approaches to Developing and Deploying Optimized Models, presented at CVPR 2024.
Organizers
The development of multimodal AI systems requires expertise across diverse fields, including computer vision, natural language processing, human-computer interaction, signal processing, and machine learning. In this tutorial, we aim to provide both breadth and depth in multimodal interaction research, offering a comprehensive and interdisciplinary perspective, and to inspire the CVPR community to engage with these dynamic and rapidly evolving areas.
Fabricio Batista Narcizo
Senior AI Research Scientist
Jabra / IT University of Copenhagen (ITU)
Elizabete Munzlinger
Industrial Ph.D. Candidate
Jabra / IT University of Copenhagen (ITU)
Sai Narsi Reddy Donthi Reddy
Senior AI/ML Researcher
Jabra
Shan Ahmed Shaffi
AI and ML Researcher
Jabra