In the groundbreaking paper “Attention Is All You Need”, the Transformer architecture was introduced for sequence-to-sequence tasks in NLP. Models like BERT and GPT were built on top of Transformers ...
Sapiens: Foundation for Human Vision Models by Meta
Sapiens, a family of foundation models for human vision by Khirodkar et al. at Meta, achieves state-of-the-art results on human-centric tasks such as 2D pose estimation, body-part segmentation, depth ...
Fine-tuning Faster R-CNN on Sea Rescue Dataset – Small Object Detection: PyTorch
Detecting small objects in aerial imagery, particularly for critical applications like sea rescue, presents unique challenges. Timely detection of people in the water can mean the difference between ...
YOLO Loss Function Part 2: GFL and VFL Loss
In the preceding article, YOLO Loss Function Part 1, we focused exclusively on SIoU and Focal Loss as the primary loss functions used in the YOLO series of models. In this article, we will dive ...
YOLO Loss Function Part 1: SIoU and Focal Loss
The YOLO (You Only Look Once) series of models, renowned for its real-time object detection capabilities, owes much of its effectiveness to its specialized loss functions. In this article, we delve ...
GradCAM – Enhancing Neural Network Interpretability in the Realm of Explainable AI
With millions of trainable parameters, neural networks have long been considered black boxes. They can produce stunning results, and we often accept the output with very little understanding as to why ...