Automatic Speech Recognition (ASR) is a complex domain within AI, serving as a primary medium that echoes the seamless human-machine interactions depicted in films like Iron Man (Jarvis) and Her ...
Building MobileViT Image Classification Model from Scratch In Keras 3
In the rapidly evolving field of deep learning, the challenge often lies not just in designing powerful models but also in making them accessible and efficient for practical use, especially on devices ...
Text Summarization using T5: Fine-Tuning and Building Gradio App
The need for efficient text summarization has never been more pressing. Whether you're a student grappling with lengthy research papers or a professional navigating news articles, the ability to ...
Fine Tuning TrOCR – Training TrOCR to Recognize Curved Text
TrOCR (Transformer-based Optical Character Recognition) models are some of the best-performing OCR models. In our previous article, we analyzed how well they perform on single-line printed and ...
TrOCR – Getting Started with Transformer Based OCR
Optical Character Recognition (OCR) has seen several innovations over the years. Its impact on retail, healthcare, banking, and many other industries has been immense. Despite a long history and ...
The Future of Image Recognition is Here: PyTorch Vision Transformers
Welcome to the second part of our series on vision transformers. In the previous post, we introduced the self-attention mechanism in detail from intuitive and mathematical points of view. We also ...