
User stories : How readers of this blog are applying their knowledge to build applications

Many years back, as a student who was getting his hands dirty in the emerging field of computer vision and machine learning, I lived in a constant state of amazement. The fact that a piece of code I wrote could find circles in an image in under 10 seconds (yes, seconds, not milliseconds) made my day! The virtuous cycle of learning and doing, then learning more and doing more, kicked my mind into high gear. The funny thing is that even after so many years I am still constantly learning and constantly amazed! The breakneck pace of innovation in CVML ensures we are all going to be lifelong learners.

We sometimes think that we need to accumulate a lot of knowledge before we can apply that knowledge. Not true! I have enormous respect for people who maximize the return on their current level of knowledge. Today I am presenting stories of readers of this blog, who have taken the initiative to start building. I hope it inspires you to build!

Typing Software for Paralysis Patients

Last December, I received this wonderful email from a reader, Marc. He wrote:

First, thank you so much for your information and knowledge on the topic of computer vision. It has inspired me to successfully write my first computer vision program which allows patients who are experiencing any form of paralysis especially pertaining to vocal and motor skills used for normal communication, to type and communicate with blinking their eyes. It is a fully functional program and I plan to implement a form of predictive text to make it very easy for the patients to string together conversion as if they were actually vocally communicating.

Needless to say, I was very impressed. This young man was not interested in solving a toy problem. He dove right in and took on a difficult problem whose solution can make a real difference in many people’s lives.

Next, Marc wrote these flattering lines that give me undue credit for his great initiative.

Because of your efforts I was inspired and wrote this software which has successfully earned me a paid internship during the school year and also another during summer with a medical company.

So my tiny little blog that I have so much fun writing is actually helping some people! Words like these are extremely motivating and push me to do better work.

I was pumped after reading the email. I was thinking that Marc must be an expert programmer to pick these things up so quickly, when I noticed a P.S. at the end of his email. It said:

PS. This is my first year of programming… This time last year I couldn’t even tell you what a “for loop” was…. With that said, my lack of experience doesn’t hinder my imagination!

My first reaction was, “Wow!” But on second thought, I was a bit skeptical. I wanted to see if Marc had a video demo I could look at. A few hours later, he emailed me this video.

After watching the video, I remembered the last line of his email. Take a moment and let it sink in!

my lack of experience doesn’t hinder my imagination

Virtual Keyboard

One fine day, I was on reddit/r/computervision. I noticed a video of a virtual keyboard application someone had built. I started watching the video. It was an impressive demo. Using a virtual keyboard and hand movement, the person in the video had typed the word “learn” up to that point. I paused the video to think about how they might have implemented the virtual keyboard, and after a few minutes of thinking I had a good idea of what was going on. I started playing the video again. Imagine my surprise and joy when I noticed the person type an “o”, followed by a “p”, and so on, ending with “.com” so that the text read “learnopencv.com”.

Later, I found out that the person in the video, Stephen Meschke, is a computer vision enthusiast. He had just stumbled upon my blog and found it useful. He is a maker who enjoys working with both bits and atoms. He has generously shared the code for the Virtual Keyboard.

Facial Landmark Detection on iOS

Last year, at least three different people contacted me asking for help porting Dlib’s Facial Landmark Detector to iOS. I helped them with some ideas.

One of them was Roi Mulia — a young iOS developer full of enthusiasm and energy. He told me that he wanted to port the facial landmark detector to iOS in his free time. He had absolutely no experience in Computer Vision and was a self-taught programmer.

Enthusiasm is easy to achieve because we are all attracted toward the shiny new toy. But the problem is that enthusiasm lasts for about 2 days, and then we are enthusiastic about the next shiny new toy.

Enthusiasm backed by focus, commitment, and effort, on the other hand, is a powerful force. Roi kept emailing me with questions. I could not answer a lot of them, but he kept making progress. Ten months after our initial email exchange, Roi contacted me again. This time he was not asking questions; he was promising “huge improvements.” We got on a Skype call and he showed me a live demo of the landmark detector working in real time on an iPhone. I asked several probing questions to make sure his code was efficient in both processing and memory. Here is the demo.

A few days later, I asked him to sell me the code and he agreed. I can’t go into the exact price I paid for the piece of code, but let’s just say it cost me an arm and two legs. However, I am getting a lot more than just the code in return. There is great satisfaction in knowing that Roi learned from my blog and built something that I am willing to pay for. There is also comfort in knowing that he will not just give me his code, but walk me through his programming choices. He will teach me in a few hours what took him weeks to learn by trial and error.

With this transaction, we reversed the roles of teacher and student.

I hope you find these stories inspiring and they motivate you to build your own projects. If you have already built something, I would love to hear from you.


