Description of App
Transform your iPhone into a visual assistant with real-time object description and visual understanding powered by on-device AI.
HuggingSnap brings smart vision AI to your iPhone. Using our SmolVLM2 vision model, the app understands what your camera sees in real time without sending any data to the cloud.
Just point your camera, ask a question or request a description, and HuggingSnap will identify objects, explain scenes, read text, and make sense of what you're looking at. It's helpful when shopping, traveling, studying, or just exploring your surroundings.
Key Features:
- Processes everything on your phone - your data stays private
- Works in real time with no noticeable delay
- No internet needed - works offline anywhere
- Easy on your battery
- Reads and translates text in images
- Describes scenes for better accessibility
- Search using your camera
- Choose what types of objects to recognize
HuggingSnap turns your iPhone into a helpful visual companion that sees the world with you!
Terms of use & privacy policy:
Terms of use: https://huggingface.co/terms-of-service
Privacy policy: https://huggingface.co/privacy
Comments
I don't need this but...
For those who do try it, or want another tool, I'd recommend contacting the dev and seeing where things go.
Their email is: [email protected]
Has potential…
If the developer can get the accessibility sorted, this could be a great addition to the tools that blind folks use on a daily basis. :-)
My App Store review, where I listed some suggestions
We should have the ability to:
• Select HuggingSnap from the Share sheet or access previously captured photos/videos within the app to have them described
• Know whether the front or back camera is currently selected
• View the size of the downloaded model, delete it and download other models to strike a balance between performance and quality, taking into account memory and hardware requirements
• Ask follow-up questions
• Take advantage of dictation/voice mode using the system voice, with customizable parameters, and even Siri integration, if applicable
• Have OCR functionality in multiple languages and scan PDFs
• Have HuggingSnap describe the content on the screen without having to take a screenshot and export it to HuggingSnap manually, to supplement VoiceOver’s Screen Recognition
• Teach HuggingSnap faces and objects and have them labeled by name wherever they’re encountered
• Explore images/scenes by touch (i.e., by moving the finger around the screen), and get audio cues in 3-D to get a better sense of the position/distance/depth of each object
• Enter a “system prompt” to be used for every description
A couple of notes:
1. I just copied the whole thing from the text field, and I don't feel like adding all the HTML tags to convert it into a list unless requested.
2. I figured out soon enough that we could already ask follow-up questions. Also, it was only after submitting the review that I thought of suggesting we be able to capture photos or videos in portrait or landscape mode.
Um, it is fully accessible
So I just downloaded the app and the buttons are labeled and the app seems to be accessible from my end.
Right, the accessibility rating and info should be updated
So this is now a robust alternative to other online services, given that I could never manage to get an on-device description from Speakaboo.
Doesn’t work properly for me
It disconnects my AirPods and routes the sound to my phone. Not to the phone's loudspeaker, but to the other one: the small speaker, the one I use when talking on the phone. You know which one I'm talking about, the one I have to hold to my ear in order to hear properly. Only I can't hold my phone to my ear while using the app, because then there's no room between my cheek and my phone for me to press the buttons. You get my meaning, I'm sure.
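For what it's worth, this sounds like an audio-session routing problem rather than a hardware fault: on iOS, an app that activates an AVAudioSession with the .playAndRecord category and no routing options sends output to the earpiece receiver by default and drops Bluetooth routes such as AirPods. The Swift sketch below shows one way an app could configure the session so that speech stays on connected AirPods and otherwise falls back to the loudspeaker; the category, mode, and options chosen here are assumptions for illustration, not what HuggingSnap actually does.

import AVFoundation

// Hypothetical helper for illustration: configures the shared audio session
// so spoken output stays on Bluetooth headphones (e.g. AirPods) when they are
// connected, and falls back to the loudspeaker rather than the earpiece
// receiver when they are not.
func configureAudioSessionForSpokenDescriptions() throws {
    let session = AVAudioSession.sharedInstance()

    // .playAndRecord is assumed because the app also listens for voice
    // questions; without the options below this category routes playback to
    // the receiver and ignores Bluetooth audio routes.
    try session.setCategory(
        .playAndRecord,
        mode: .spokenAudio,
        options: [.defaultToSpeaker, .allowBluetoothA2DP]
    )
    try session.setActive(true)
}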