Hello AppleVis community,
I am Mahmood, and I am excited to share Vision Assistant Pro, a new open-source add-on for NVDA. This tool is designed to bring the intelligence of Google Gemini directly into your screen reader to solve digital challenges that usually require sighted assistance.
It is completely free to use (with your own API key) and focuses on interactivity rather than just static descriptions.
🌟 Key Features:
👁️ Interactive Vision (Object & Full Screen): Unlike standard OCR that just reads text, this feature lets you "see" and "ask."
- Object Vision: Take a snapshot of the specific control (icon, button, image) under your navigator cursor.
- Full Screen Vision: Scan the entire screen layout.
- The Best Part: After the initial description, you can chat with the AI. You are not limited to one description; you can ask follow-up questions like "Is there a save icon?", "Describe the chart in detail," or "What color is the button?"
🧠 Smart Translator (Auto-Swap): Instantly translates selected text. It creates a seamless bilingual experience by automatically detecting languages. If the source matches your target language, it intelligently swaps them.
🎙️ Smart Dictation: A powerful voice typing tool. It doesn't just transcribe; it listens, fixes your grammar, removes stutters (ums and ahs), adds punctuation, and types the polished text directly into your active window.
🔓 CAPTCHA Solver: Struggling with visual codes? Press a shortcut, and the AI will solve the math or read the characters and automatically type the result for you.
📄 Document QA: Have a PDF, TIFF, or text file? You can "chat" with your documents. Ask the AI to summarize them, extract specific data, or explain complex sections.
🛠️ Requirement: You need a free Google Gemini API Key to run this add-on.
📥 Download & Installation: You can download the add-on directly from GitHub:
Download Vision Assistant Pro v1.0 (Direct Link)
Just open the downloaded file and confirm the installation in NVDA.
🔗 Project Source Code: https://github.com/mahmoodhozhabri/VisionAssistantPro
I developed this to help our community become more independent. I would love to hear your feedback and suggestions.
Best regards, Mahmood