Molmo is an open-source family of vision-language models (VLMs) with notable strengths in tasks like pointing, counting, VQA, and clock face recognition. What sets Molmo apart ...
LightRAG: Simple and Fast Alternative to GraphRAG for Legal Doc Analysis
LightRAG is an approach based on GraphRAG that combines knowledge graphs with embedding-based retrieval, making it both fast and performant, achieving SOTA ...
Introduction to Speech to Speech: Most Efficient Form of NLP
We often take out our phones and say, “Hey Siri, play Perfect by Ed Sheeran” or “Ok Google, set an alarm at 7.30 in the morning.” And our phones get it done on the fly! But have you ever ...
ColPali: Enhancing Financial Report Analysis with Multimodal RAG and Gemini
ColPali multimodal RAG offers a novel approach to efficiently retrieving elements such as images, tables, charts, and text by treating each page as an image. This method takes advantage of Vision ...
Retrieval Augmented Generation – RAG with LLMs
In today's information age, we're constantly bombarded with questions. Whether it's researching a historical event, troubleshooting a tech issue, or simply satisfying our curiosity, finding the right ...
Fine-Tuning LLMs using PEFT
Large Language Models (LLMs) have taken the world by storm, demonstrating an uncanny ability to understand and generate human language. However, while they excel at grasping general language patterns, ...