Deep Learning

CVPR 2024: An Overview and Key Papers

CVPR 2024 showcased groundbreaking AI and computer vision research, highlighting generative image dynamics, advanced 3D modeling, and innovative video editing techniques. OpenCV featured prominently, presenting OpenCV5 and collaborating with leading

3D Computer Vision, Computer Vision, Deep Learning, Diffusion Models, Generative AI

Fine-Tuning YOLOv10 Models on Custom Dataset for Kidney Stone Detection

This research article explains a data-centric fine-tuning approach using YOLOv10 models for kidney stone detection.

Computer Vision, Deep Learning, Medical Imaging, Object Detection, YOLO

YOLOv10: The Dual-Head OG of YOLO Series

YOLOv10 introduces a dual-head architecture for NMS-free training and efficiency-accuracy driven model design. It combines one-to-one and one-to-many label assignments to improve performance without extra computation. YOLOv10 uses lightweight classification

Computer Vision, Deep Learning, Object Detection, YOLO

Fine-tuning Faster R-CNN on Sea Rescue Dataset – Small Object Detection: PyTorch

This research article discusses how data preparation matters when fine-tuning Faster R-CNN for aerial small object detection.

Computer Vision, Deep Learning, Object Detection

Mastering Recommendation System: A Complete Guide

Recommendation systems (recommender systems) suggest content based on user preferences and behaviors. This guide explores their types, traditional ML techniques like matrix factorization, and advanced deep learning methods like neural

Beginners, Deep Learning, Tutorial

WhisperX Automatic Speech Recognition (ASR) with Nemo Speaker Diarization: Speech-to-Text

This article presents ASR with speaker diarization using OpenAI Whisper and the NVIDIA NeMo toolkit.

Artificial Intelligence, Deep Learning, Speech Recognition, Transformer Neural Networks

Building MobileViT Image Classification Model from Scratch In Keras 3

In the rapidly evolving field of deep learning, the challenge often lies not just in designing powerful models but also in making them accessible and efficient for practical use, especially

AI Research Papers, CNN, Computer Vision, Convolution, Deep Learning, Keras, Transformer Neural Networks, Vision Transformer

Introduction to Robotics: A Comprehensive Guide to Robotics for Beginners

In this article, we discuss the basics of robotics. It is a comprehensive guide for anyone starting out in robotics, perception, motion planning, and control.

Computer Vision, Deep Learning, Robotics

YOLO Loss Function Part 2: GFL and VFL Loss

In the preceding article, YOLO Loss Functions Part 1, we focused exclusively on SIoU and Focal Loss as the primary loss functions used in the YOLO series of models. In

Computer Vision, Deep Learning, Focal Loss, GFL, Loss Function, Object Detection, SIoU Loss Functions, VFL, YOLO

GradCAM – Enhancing Neural Network Interpretability in the Realm of Explainable AI

With millions of trainable parameters, neural networks have long been considered black boxes. They can produce stunning results, and we often accept the output with very little understanding as to