Python | LearnOpenCV

Object Detection and Spatial Understanding with VLMs ft. Qwen2.5-VL

August 5, 2025 1 Comment

Object detection is predominantly a vision task in which we train a vision model, such as YOLO, to predict each object's location along with its class. But it still depends on the pre-trained classes, ...

Bhomik Sharma

July 29, 2025 3 Comments

Agentic AI, AI Art Generation, Computer Vision, Generative AI, Generative Models, Hugging Face, Transformers, Multimodal Models, Vision Language Models

Welcome back to our LangGraph series! In our previous post, we explored the fundamental concepts of LangGraph by building a Visual Web Browser Agent that could navigate, see, scroll, and ...

Bhomik Sharma

July 8, 2025 27 Comments

Agentic AI, Computer Vision, Generative AI, Generative Models, LLMs, VLMs

Developing intelligent agents that use LLMs such as GPT-4o and Gemini to perform multi-step tasks, adapt to changing information, and make decisions is a core challenge in AI ...

Bhomik Sharma

June 20, 2025 Leave a Comment

Generative AI, video classification, Vision Language Models

The domain of video understanding is rapidly evolving, with models capable of interpreting complex actions and interactions within video streams. Meta AI's VJEPA-2 (Video Joint Embedding Predictive ...

Ankan Ghosh

March 4, 2025 2 Comments

Computer Vision, Deep Learning, Object Detection

According to World Wildlife Fund assessments, the global biodiversity crisis has reached critical levels, with terrestrial mammal populations declining by 69% since 1970. From Africa’s savannahs to ...

Labhesh Valechha

November 7, 2023 11 Comments

Pose Estimation, YOLO

YOLO-NAS Pose models are the latest contribution to the field of pose estimation. Earlier this year, Deci garnered widespread recognition for its groundbreaking object detection foundation model, ...

Object Detection and Spatial Understanding with VLMs ft. Qwen2.5-VL

LangGraph: Building Self-Correcting RAG Agent for Code Generation

Building an Agentic Browser with LangGraph: A Visual Automation and Summarization Pipeline

Optimizing VJEPA-2: Tackling Latency & Context in Real-Time Video Classification Scripts

FineTuning RetinaNet for Wildlife Detection with PyTorch: A Step-by-Step Tutorial

Introducing YOLO-NAS Pose: A Leap in Pose Estimation Technology
