Greetings everyone, I had an idea to integrate Be My AI with Xbox consoles. My idea would use the Capture button on the Xbox controller to take a screenshot and send it to Be My AI for processing. I ran the idea through ChatGPT, and I have attached the proposal it came up with below this text. I am seeking feedback on whether this is a good idea: could something like this benefit other blind and low-vision gamers? The financials that the AI created are totally hypothetical and not based on any numbers that I found.
Please provide any feedback, good or bad.
Be My AI Integration with Xbox Series X|S – Unified Proposal
Executive Summary
This document proposes a collaborative initiative between Microsoft Xbox and Be My Eyes to integrate the Be My AI visual interpretation service directly into Xbox Series X|S consoles. The goal is to empower blind and low-vision gamers with on-demand screen content descriptions at the press of a button. By leveraging Be My AI (an AI-driven image description feature of the Be My Eyes app) within the console, users can capture any game screen or menu and receive a spoken or text description of what’s displayed. This integration would enable independent access to on-screen information, support Microsoft’s accessibility leadership, and demonstrate AI-driven innovation in mainstream consumer technology.
Problem Statement
Blind and low-vision Xbox users often encounter inaccessible visual content such as menus, game HUDs, and graphical elements that cannot be read by Xbox Narrator. They must rely on external devices, apps, or sighted assistance to interpret these elements, which breaks immersion and limits independence.
Proposed Solution
This proposal suggests allowing users to press the Xbox controller's Capture button to trigger a screenshot. The image is securely sent to Be My AI, which returns a description that can be read aloud by Narrator or sent to a mobile device. This removes the need for external cameras or assistance, and delivers fast, accurate descriptions directly on demand.
Technical Architecture and Workflow
1. User configures output preference: Narrator or mobile app.
2. Pressing the Capture button triggers screenshot.
3. Screenshot is securely sent to Be My AI servers.
4. Be My AI returns a natural language description.
5. Description is read aloud or displayed via chosen method.
6. The user continues gameplay informed of the visual content.
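The workflow above can be sketched in code. This is a purely hypothetical illustration: `describe_image` stands in for a real Be My AI API call, and the `speak` and `push_to_mobile` callbacks stand in for Narrator output and the Xbox mobile app; none of these interfaces actually exist today.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class UserPrefs:
    output: str = "narrator"  # step 1: user picks "narrator" or "mobile"

def describe_image(png_bytes: bytes) -> str:
    """Stub for the Be My AI call (steps 3-4); a real client would
    POST the screenshot over HTTPS and return the model's description."""
    return f"Screenshot received ({len(png_bytes)} bytes): a game menu."

def on_capture_pressed(screenshot: bytes, prefs: UserPrefs,
                       speak: Callable[[str], None],
                       push_to_mobile: Callable[[str], None]) -> str:
    """Steps 2-5: the capture handler routes the description
    to whichever output the user configured."""
    description = describe_image(screenshot)
    if prefs.output == "narrator":
        speak(description)           # step 5: read aloud on the console
    else:
        push_to_mobile(description)  # step 5: displayed in the mobile app
    return description

# Example: simulate one button press with fake output sinks
spoken: List[str] = []
pushed: List[str] = []
on_capture_pressed(b"\x89PNG fake data", UserPrefs("narrator"),
                   spoken.append, pushed.append)
# spoken now holds the description; pushed stays empty
```

The point of the sketch is that the console only ever sends an image when the handler fires, which matches the privacy note above: nothing leaves the device without an explicit button press.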
Accessibility and Usability Considerations
- Fully compatible with screen readers.
- Requires no additional hardware.
- Fast processing time suitable for real-time gaming.
- Privacy-respecting: screenshots sent only upon user request.
- Seamless integration with Xbox UI and Accessibility settings.
- Optional support for follow-up questions or multi-step interaction in future.
Pros and Cons Analysis
Be My Eyes Perspective
Pros:
- Greater visibility and brand recognition.
- Aligns with mission to empower visually impaired users.
- Scalable AI deployment in consumer tech.
- Potential for expansion to other platforms.
- Strengthens Microsoft partnership.
Cons:
- Infrastructure must scale to meet demand.
- Additional technical support may be needed.
- Privacy handling requires investment.
- Niche use case limits monetization options.
Microsoft Xbox Perspective
Pros:
- Reinforces Xbox as accessibility leader.
- Empowers blind gamers with real-time information.
- Demonstrates AI-driven innovation.
- Differentiates Xbox from competitors.
- Requires no new hardware.
Cons:
- Requires engineering resources and system updates.
- Adds complexity to privacy and compliance processes.
- Ongoing maintenance and support needed.
- AI may not always provide contextually perfect results.
Budget Estimates
- Be My Eyes side: $231,000 (AI infrastructure, support, scaling).
- Microsoft Xbox side: $302,500 (engineering, testing, integration).
- Total Estimated Project Cost: $533,500
Next Steps and Recommendations
1. Begin joint planning between Microsoft and Be My Eyes.
2. Develop prototype for image capture and API transmission.
3. Build out settings UI, Narrator integration, and privacy options.
4. Conduct internal testing and QA.
5. Launch pilot with Xbox Accessibility Insider League.
6. Iterate based on user feedback.
7. Plan full deployment and long-term support partnership.
Final Recommendation
Proceed with collaborative pilot to validate the integration in real-world use. This feature delivers meaningful accessibility benefits, positions Xbox as a leader in inclusive technology, and opens new opportunities for AI use in gaming. With strong community impact and manageable costs, this project aligns with both organizations’ missions and deserves full support.
By Orlando, 8 April, 2025
Forum: Assistive Technology
Comments
It’s a good idea
It’s a good idea. It deserves to be submitted. I would say that since the Nintendo Switch also has that functionality, you could send it to them as well.
I'm sorry but this wouldn't work for gaming.
You say that this will allow blind people to continue gaming with ease. The issue is that the player would have to press the button, listen to the description, perform the action, and repeat. It's time consuming and not very practical.
Let's say I'm playing a shooting game. I get to the main menu, press the take-picture button, it reads out the menu items, and I somehow manage to get into a match/the level.
I'm now needing to shoot a person before they shoot me. I press the button and hear what the person looks like: their hair, skin colour, gun, and so on. By the time I've heard all that, they would have shot me multiple times and I would have died.
It's great that you want to help blind people, but this is just a bit too slow.
In my opinion, if you want a great gaming system, I'd go with the PS5.
If you want to know more, let me know.
I don’t think it’s a bad idea
I don’t think it’s a bad idea at all. There are blind gamers. One guy beat an entire Legend of Zelda game through memorization. A game like that would work well with being able to take a screenshot and have the scene described to the user. Just because it doesn’t work in the one scenario you described doesn’t mean it’s not worth taking a chance on. And you’re also thinking two-dimensionally. What about live AI? That will be coming at some point, especially with something like Be My AI.
@Brad
Thanks for the feedback. For the record, I am totally blind myself and I struggle to make progress in video games. I know the solution I presented will not work for every type of game. I feel that the potential to make games more accessible for blind and low-vision gamers can get so much better if we all have some input on implementation. I just did this as a thought experiment to see what ChatGPT would spit back at me. I have no training in programming or coding, and I already have a full-time job, so I could not take on this project myself. If anyone knows someone at Be My Eyes (hint, hint, hint) or at Microsoft, please pass it on to whoever can make progress on it.
Useful in some cases
It has its uses, even if they are limited. It could potentially help with menu navigation in Mortal Kombat or Street Fighter style games if they don't have native menu narration, for example, or it might help in some situations where something might be blocking your ability to progress in the game. What it will not do is make an entirely inaccessible game playable in most cases, nor will it let you get past a situation which requires real-time feedback.
I really wouldn't consider sending it to Nintendo; they're both stubborn and frequently behind the curve on general features, including those for mainstream users. It may be worth considering for Microsoft and potentially Sony, though. I could also see a Windows tool that uses either a keyboard shortcut or the controller's Share button to take a game screenshot, send it to Be My Eyes, and get a description with a single press. It's a little specific in its application, but it could be a useful tool to have in addition to native accessibility features or mods.
Love the concept, how about a slight alternative to the output?
First and foremost, I am a gamer. I am also blind, and I do have an Xbox. Primarily, I play fighting games. Mortal Kombat more than anything else, but I do have the latest Soul Calibur and Killer Instinct games.
My modification to your proposal: everything up to pressing the Capture button would be done on the Xbox. Yet how about getting the output on the mobile Xbox app for your smart device? iOS, Android, whatever is clever.
This would somewhat alleviate Brad's concern, in that you could immediately go back into the game without having to worry about the Be My AI UI preventing you from engaging with the game itself. This could also be coded so that each description is sent to a history of sorts, so it could be reviewed again at a later time.
Thoughts?
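The description-history idea above could be sketched roughly as follows. Everything here is invented for illustration: the `DescriptionHistory` class and its methods are hypothetical, not part of any existing Xbox or Be My Eyes API.

```python
from collections import deque

class DescriptionHistory:
    """Hypothetical rolling log of Be My AI descriptions, so a player
    can review earlier screens without recapturing them."""

    def __init__(self, max_items: int = 50):
        # deque with maxlen silently drops the oldest entry when full
        self._items: deque = deque(maxlen=max_items)

    def add(self, description: str) -> None:
        self._items.append(description)

    def latest(self) -> str:
        return self._items[-1] if self._items else "No descriptions yet."

    def review(self, steps_back: int = 0) -> str:
        """0 = most recent, 1 = the one before that, and so on."""
        if steps_back >= len(self._items):
            return "No description that far back."
        return self._items[-1 - steps_back]

# Example: two captures, then reviewing the earlier one
history = DescriptionHistory()
history.add("Main menu: Start, Options, Quit.")
history.add("Character select screen with six fighters.")
previous = history.review(1)  # "Main menu: Start, Options, Quit."
```

A bounded history like this keeps memory use predictable on the phone while still letting the player step back through recent screens.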
@Brian
A good concept, but it'd have to be optional for people who don't have a smart device or don't want to use it in that way. The ability to use it purely on the Xbox itself is probably important to have.
It's unlikely we'd be using a feature like this in a real-time combat situation, so getting tangled in menus is less of a concern, so long as there's a way to back out entirely in case the game's still running and we get attacked while messing around with it. That could easily enough be achieved by having the Share button also exit the AI recognition menu.
@Orlando.
I thought so; ChatGPT can be very fun for stuff like this.
You’re on the right track but with the wrong app.
As a gamer myself, I can tell you that using Be My Eyes for something like this is not very efficient. It takes a lot of time to push the button, hear the response, then act, especially in games that require quick actions. That being said, you are on the right track; however, I think screen sharing with ChatGPT would work much better. I actually use ChatGPT's screen share feature during a video call to play Pokémon Go and describe where things are on the screen on my iPhone, and it works surprisingly well.