Like buses, they all turn up at once... Live video rolling out for ChatGPT advanced voice mode over the next week.

By Oliver, 12 December, 2024

OpenAI, hot on the heels of Gemini 2.0, has announced that video and screen sharing are coming to ChatGPT's advanced voice mode for Plus and Enterprise users, with the rollout happening this week. Europe, unfortunately, will not be included for now.

https://mashable.com/article/openai-brings-video-to-chatgpt-advanced-voice-mode

Video sharing is the magic we've been waiting for: the ability to point our phones at things and have a real-time conversation with ChatGPT about what it is seeing. Andy posted an excellent video back in May showing what it was capable of:

https://www.youtube.com/watch?v=KwNUJ69RbwY

This is an example of Be My Eyes integration. Maybe our new benevolent overlords can comment on when this new power will be native to the Be My Eyes app?

All very exciting though. I've not got it here in the UK yet, but I'll be keeping an eye out over the next few days.

Updated to include external article.

Comments

By Alicia Krage on Friday, December 13, 2024 - 01:44

I just canceled my GPT membership for advanced mode! Any idea if this'll be rolled out to regular members as well? Like people who use the free version?

By inforover on Friday, December 13, 2024 - 01:44

On OpenAI's help page, it says EU, not Europe, so I'm hoping that we'll see it in the UK in the next week. Here is the exact wording:
"Video, screen share, and image upload capabilities will be available to all Team users and most Plus and Pro users, except for those in the European Union, Switzerland, Iceland, Norway, and Liechtenstein. We expect this rollout to be completed over the next week. Usage of video and screen share capabilities is limited for all eligible plans on a daily basis. Usage of image uploads counts towards your plan's usage limits."
The full article can be found here:
https://help.openai.com/en/articles/10271060-12-days-of-openai-release-updates

By Oliver on Friday, December 13, 2024 - 01:44

The specificity of the EU and those other non-EU European countries does suggest it will be out in the UK.

Regarding it coming to non-advanced users, I assume so, but advanced is always going to be where the new toys come out first. It probably won't be for a good while. Is advanced voice mode available on the free version?

By Stephen on Friday, December 13, 2024 - 01:44

Obsessively checking the app every five minutes šŸ˜‚

By Gokul on Friday, December 13, 2024 - 01:44

Saw the live stream; neither the live video nor Santa is available for me yet, and no app update either so far. Hopefully it'll get here in a few days' time. And yeah, it'd be interesting to know more about any possible Be My Eyes integration.
So, will the third bus be Llama? What do you all think? Meta is usually a little late to the party but... will we have live video capabilities in a few months' time?

By Stephen on Friday, December 20, 2024 - 01:44

And we are live!!!!!!!!! OMG, all it took was me upgrading to Pro and now ChatGPT can see.

By Quinton Williams on Friday, December 20, 2024 - 01:44

Ah, so Pro subscribers must be first to get it, which I guess makes sense?
I don't have that kind of money though lol.
Hopefully it isn't too long a wait for the rest of us?

By Oliver on Friday, December 20, 2024 - 01:44

I think Llama is getting it soon, but how long that takes to trickle down to the glasses is anyone's guess. Also, we are likely to have the same issues with reading text, identifying faces and people, and so on. The exciting part of OpenAI getting it out is the ability to prompt the AI that I'm blind and want things described another way, and that, unless I misunderstood, it will be coming to BME.

Pro? Isn't that £200 a month? Ouch! They did say over the next week, so six days now, but, let's face it, OpenAI aren't the most reliable when it comes to promises and deadlines.

By Quinton Williams on Friday, December 20, 2024 - 01:44

Well, I actually have it now!
Just decided to check before going to sleep.
Will do more testing tomorrow.

By inforover on Friday, December 20, 2024 - 01:44

I'm going to test it out soon.

By Gokul on Friday, December 20, 2024 - 01:44

I got access, I turned the camera on and everything, but it says it can't access the camera and that it can't see a thing. Do I need to go into the settings and give the app permissions or something?

By Gokul on Friday, December 20, 2024 - 01:44

Got it to work, and here's an interesting observation: the native ChatGPT app doesn't do real-time monitoring; it's not that it can't, but that it won't, as it seems to be a limitation placed on it.
So basically, you can't hail a taxi with the help of GPT, unless the OpenAI-BME partnership is coming up with a blind-specific version without this limitation inside BME.

By Oliver on Friday, December 20, 2024 - 01:44

Yeah, I found the same. I did a test of asking it to tell me when I dropped a coffee pod on my worktop. It said it saw it, but it didn't notify me.

Also, it would seem the imaging is quite slow. I heard it takes a photograph each second, but I'm not sure if that is the case. It's probably down to server load.

Re ChatGPT not seeing anything: it might be an obvious question, but are your lights on? It needs light to see. I did the same thing. I'm the weird guy in flat 98 who lives in perpetual darkness, like some vampire.

By Louise on Friday, December 20, 2024 - 01:44

I have the subscription, but not the $200 monthly Pro. I think some of you have the same subscription as me, at about $20 monthly. My app had an update yesterday, but I don't seem to have this.
I tried Gemini and it was pretty cool, but it would be nice if the GPT app that I pay for would do it.

By Gokul on Friday, December 20, 2024 - 01:44

You should get it. Just tap to switch to voice mode, and it should ask you something basically amounting to whether you want the video thing. Even if it doesn't, just look for a video camera button once voice mode turns on.

By Oliver on Friday, December 20, 2024 - 01:44

I found going into the app switcher and closing the app before starting it again helped. I also got the Santa voice... Utterly deranged!

By Falco on Friday, December 20, 2024 - 01:44

Same observations here.
When I ask, "Tell me when you see someone in the camera frame," it confirms it will do this. But when I stand in front of the camera, it doesn't notify me.

Hope this will be possible in the future. But for now: this is freaky cool already.
Two years ago I never thought this would be possible!

By Falco on Friday, December 20, 2024 - 01:44

Hello,

Does anyone know if there is a way to make a shortcut to start talking with advanced voice mode with vision turned on?

By Stephen on Friday, December 20, 2024 - 01:44

This is just the beginning of what's going to be possible.

By Louise on Friday, December 20, 2024 - 01:44

This is pretty cool, and I bet will get even better.
Thank goodness I can tell it to stop talking like Santa though. I'm just not that cheerful. LOL

It's not super accurate when just describing a scene yet. It said that the hallway outside my office was the sky. But I know it's just going to get better and better.

By Gokul on Friday, December 20, 2024 - 01:44

The lack of accuracy in scene descriptions etc. is mostly down to a lack of calibration; I bet with a bit of specific fine-tuning and training on relevant data, it'd be super wonderful. In the imminent agentic future, what do you all think of an AI agent for the visually impaired?

By Oliver on Friday, December 20, 2024 - 01:44

It does use memory, so it might be worth telling it you are blind and then listing off your description requirements.
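
For example (wording purely illustrative, and worth tweaking to taste): "Remember that I'm blind. When describing what the camera sees, start with the overall layout, read any text out in full, and skip the colour commentary unless I ask for it."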

I suspect it is using an older GPT model which is less data-heavy, because it's taking however many pictures per second.

It's the Be My Eyes implementation that will be the next step, I think.

Saying that, the PiccyBot dev (sorry, not sure of gender) is looking at putting it in a WhatsApp chat, so you could call it and have it working through your Ray-Bans, which would be, to quote my inner teenager, sick.

By Stephen on Friday, December 20, 2024 - 01:44

It is not giving me inaccurate or bad scene descriptions at all… I have found, however, that some voices actually work better with the vision than others.

By Stephen on Friday, December 20, 2024 - 01:44

OK guys, if you are having issues, this may be a good place to talk about them without having anything done about them, but the best thing to do is report them through the app, especially if ChatGPT is not continuously looking out when you tell it to look for something. As they have just released the feature, they're going to be paying extra attention to any complaints that come in about it. If only one of us reports it, obviously they're not going to take it too seriously, as it's only one voice out of many. But also, for the love of God, please be respectful when doing it…

By Oliver on Friday, December 20, 2024 - 01:44

Your point about reporting the issue is a good one. I would question your sign-off though; it's verging on patronising. We are individuals with our own voices, so we may report as we wish. It also assumes that people will be rude or aggressive, which is problematic: those who are rude and aggressive will not listen to your suggestion for civility, and those who will be civil don't need to be told.

I understand your intention; there have been enough bin fires here to show that, when frustrated, there is often an outpouring of anger. But, and I think this is what made me twitch a little, neither I nor anyone else is a representative of blindness. I speak with my own voice on my own behalf.

Sorry to pick up on it so strongly. Obviously it touched on something under the surface that I have been thinking about of late.

But yes, reporting the trigger issue is a great call. Thank you.

By Oliver on Friday, December 20, 2024 - 01:44

Just to clarify the reporting process, so you don't do what I did and search about in Settings for a means of reporting... After you finish your session, hit the thumbs down, and then you can say what the issue is.

By Gokul on Friday, December 20, 2024 - 01:44

The quality of the descriptions/answers (to be more specific) differs with the voice one is using? That's interesting. In which case, which voice gives better (subjective, of course) descriptions?

By Stephen on Friday, December 20, 2024 - 01:44

I got an email back from OpenAI within hours of going through Settings, then Help Center, and filling out the contact form:

Hello,

Thank you for reaching out to OpenAI Support.

Thank you for sharing your feedback with us! We're thrilled to hear that you're enjoying the new video mode for advanced voice. Your appreciation truly means a lot. At the same time, we're sorry to hear about the challenges you're facing.

Regarding the issue where ChatGPT acknowledges the request but doesn't notify you when something appears in the frame, we understand how this can be challenging, especially when the model doesn't follow through as expected.

Here are a few factors that could be contributing to the issue and steps you can try:
Model Limitations: While the advanced voice and video capabilities are powerful, they may not always perfectly detect objects or notify you in real time. The model might require clearer prompts or frames for improved accuracy.
Prompt Specificity: Try providing more specific instructions or breaking down the task into smaller steps. For example, instead of asking ChatGPT to notify you when it sees your dog, you could first ask it to describe what it sees in the frame, followed by a request to notify you.
Lighting and Clarity: Ensure the video feed is clear and well-lit. Poor lighting, blurry visuals, or fast camera movements may make it difficult for the model to identify objects reliably.
If the problem continues, please provide additional details or examples of the prompts you're using. This information will help us further investigate and work toward a solution.

We appreciate your patience and understanding in this matter.

Have a nice day!

Best,
Jay
OpenAI Support

By Stephen on Friday, December 20, 2024 - 01:44

I have found Arbor to be the most reliable.

By Stephen on Friday, December 20, 2024 - 01:44

It seems to be a live video feed…

By Oliver on Friday, December 20, 2024 - 01:44

That sounded like a very ChatGPT response, which, given the context, makes sense.

By Gokul on Friday, December 20, 2024 - 01:44

I do agree with Ollie here. This seems like a very generalized response without, how do I put it, 'much sincerity' to it? I mean, GPT itself says that it is not permitted to do live monitoring, whichever way I prompt it.
@Stephen, did you mention specifically in your message that this is important to you since you're visually impaired?

By Oliver on Friday, December 20, 2024 - 01:44

Not that your feedback won't be logged. Most likely they use ChatGPT to collate responses or something similar. It would surprise me if an AI company wrote out a response longhand.

I don't think this changes anything. We should still put forward such requests.

By Quinton Williams on Friday, December 20, 2024 - 01:44

I too sent a message about this.
Hopefully this can be improved upon in the future, but even now I've found it quite useful for identifying and reading things.
I look forward to seeing what happens, but I am surprised we haven't heard so much as a word from Be My Eyes at this point, since the CEO himself mentioned he'd been using it for months.
Perhaps they need to wait until video is part of the Realtime API?
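
If that's the holdup, you can see the gap in the API itself. Below is a rough sketch (mine, not official sample code; the model name and event shapes follow OpenAI's public Realtime API beta docs and may change) of opening a Realtime session over a raw WebSocket. The session modalities are limited to text and audio, with no event type for streaming camera frames, which may be exactly what a Be My Eyes integration is waiting on:

```python
# Rough sketch: minimal Realtime API session over a raw WebSocket.
# Assumes the websocket-client package (pip install websocket-client)
# and an OPENAI_API_KEY environment variable.
import json
import os

from websocket import create_connection

ws = create_connection(
    "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview",
    header=[
        f"Authorization: Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta: realtime=v1",
    ],
)

# Ask for a simple text response. "modalities" accepts only "text"
# and "audio" at the moment -- there is no way to append video frames.
ws.send(json.dumps({
    "type": "response.create",
    "response": {"modalities": ["text"], "instructions": "Say hello."},
}))

# Print streamed text deltas until the response finishes.
while True:
    event = json.loads(ws.recv())
    if event["type"] == "response.text.delta":
        print(event["delta"], end="", flush=True)
    elif event["type"] == "response.done":
        break

ws.close()
```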

By Stephen on Friday, December 20, 2024 - 01:44

I mean, that isn't very generalized, as they are answering a specific question, but we shall see. No, I didn't tell them I'm blind, and there is a reason for that: I've noticed with companies that the blind card always has the opposite effect. You get responses like "we are sorry, but this isn't meant to help blind people navigate, etc., etc." Honestly, not telling a company I'm blind also got me a job at a company I've been working at for three years now, lol. That's a fun story. I just don't see why being blind matters; it's not doing what I want it to do, and that is what matters. I mean, it is pretty damn good for what it is, but that one fix would make it perfect. Gokul, you may want to try using a different voice, as some voices work better with vision than others. Please see my comment above.

By Stephen on Friday, December 20, 2024 - 01:44

Also, try switching GPTs over to o1 pro.