You know, two or so years ago I would never have imagined I could just pull up a picture on Amazon (if it works correctly) and get it described with Be My Eyes, but I've realised something: sometimes too much description is a bad thing.
If I'm reading a meme it might be useful, but if it's a post from a place like r/sh*tamericanssay, not so much.
I don't want to know that the post has this colour, has a flag, has x amount of upvotes and downvotes and all that; I just want to laugh at the silly comment and move on with my day. That's where screen text (I think that's what it's called) comes in.
It doesn't give you all the fancy-schmancy details; it just lets you know what text is on the screen and stops.
So, what about you? Do you like as much description as possible, or is it situational?
Comments
Depends
Hi Brad,
It depends what I'm doing. I always remember when I got an Amazon email saying my parcel had been delivered, but we couldn't actually find it anywhere outside. So I opened the email on my phone and found the photograph they'd taken as proof of delivery. VoiceOver recognised some text in the image, and the text displayed showed the address of one of the houses near my parents', not my parents', where I was staying at the time. So I showed my parents and was like 'look, they haven't delivered it to ours', and they confirmed it. So yeah, sometimes it just works. I also had to analyse some photos for a job I was applying for, and I actually had the application process open both on my computer and my phone. VoiceOver was describing the images as I went along, and I just wrote my answers on my computer. So yes, VoiceOver's image and text recognition is just quicker.
Somewhere in the middle
Sure, a quick and simple text description can be all you need. Sometimes I like a little more detail. Currently, I am really digging the Be My Eyes quick description shortcut, and the Speakaboo app. That one comes with a whole laundry list of really interesting shortcuts that are quick and precise, yet still give us that little bit of extra 'oomph' to make life just a little bit easier. 🙂
@brian
Ooh, I've not checked out Speakaboo in a long time; off to do exactly that.
Seeing AI
I used to use Seeing AI for the situation you described, but they made it the same as Be My Eyes, plus worse translation into Turkish. I found out about detect text late, but am using it now instead of Seeing AI.
@burak
Ah, that's it, detect text.
Yeah, Seeing AI and Be My Eyes are very similar, LLM-wise.
@Brian, I retried Speakaboo; it's interesting but not for me. It's cool that we have all these tools, though, and that people can use the ones that work for them.
So here's what I prefer
This is on Android, where I can do a three-finger single tap to get detailed descriptions from Gemini.
Mostly, I rely on the default descriptions the app or TalkBack generates, but whenever I want details, I just focus on that image and do a three-finger tap. Gemini pops up with a small window containing a detailed description of that image.
That's what I want iOS/VO to do. By default, stick to the current text/quick descriptions, but give me a quick gesture to get detailed descriptions whenever I need them, without going through the share sheet.
I am sure someone somewhere has created some shortcut to achieve this.