VLM for Video Understanding with Spatial and Temporal Context: NVIDIA Cosmos Reason1

June 17, 2025 1 Comment

NVIDIA's Cosmos Reason1 is a family of Vision Language Models trained to understand the physical world and make decisions for embodied reasoning. What makes Cosmos Reason1, as a promising contender ...

Ankan Ghosh

Jaykumaran

June 5, 2025 1 Comment

Robotics Vision Language Models Vision Transformer

Imagine you're a robotics enthusiast, a student, or even a seasoned developer, and you've been captivated by the idea of robots that can see, understand our language, and then act on that ...

Jaykumaran

May 20, 2025 Leave a Comment

GPUs PyTorch Training Neural Networks

Training modern deep learning models often demands huge compute resources and time. As datasets grow larger and model architectures scale up, training on a single GPU is inefficient and time-consuming. ...

Jaykumaran

April 30, 2025 Leave a Comment

3D Computer Vision Classical Computer Vision Feature Matching Homography

Iterative Closest Point (ICP) is a widely used classical computer vision algorithm for 2D or 3D point cloud registration. As the name suggests, it iteratively improves and minimizes the spatial ...

Jaykumaran

April 22, 2025 Leave a Comment

3D Computer Vision 3D Reconstruction Robotics SLAM

MASt3R-SLAM is a plug-and-play monocular dense SLAM pipeline that operates in the wild. It is the first real-time SLAM system of its kind that leverages MASt3R's 3D reconstruction priors to achieve ...

Jaykumaran

April 11, 2025 Leave a Comment

Generative AI Robotics Vision Language Models

The advent of Generative AI has fundamentally transformed robotic intelligence, enabling significant strides in how advanced humanoid robots "perceive, reason and act" in the physical world. This ...

VLM for Video Understanding with Spatial and Temporal Context: NVIDIA Cosmos Reason1

SmolVLA: Affordable & Efficient VLA Robotics on Consumer GPUs

Distributed Parallel Training: PyTorch Multi-GPU Setup in Kaggle T4x2

Understanding Iterative Closest Point (ICP) Algorithm with Code

MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Vision Language Action Models (VLA) Overview: LeRobot Policies Demo
