Introducing Envision AI, a new iOS app to help the blind identify text, objects, and what's around them

By Karthik Mahadevan, 20 October, 2017

Hi,

My name is Karthik and I have been working with visually impaired users in the Netherlands for the past year to understand ways of enabling independence. In that process, I did a deep dive into artificial intelligence and how it can be a helpful tool for processing and conveying visual information. I found that most of what was in the market were simple object recognition apps that were not very practical to use.

Hence, we built an app called Envision AI that takes a context-based approach to this problem. Our app can currently be used to:

- Recognise and read texts in their native dialect.
- Explain scenes that the camera captures in detail.
- Train and recognise faces of your friends and family.
- Train and recognise your personal objects like wallet, keys or glasses.
- Do context-based recognition, that is, taking a picture of a clock will tell you the time, taking a picture of a window will tell you the weather outside, etc.
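
The context-based item above can be pictured as a lookup from a recognised label to a handler that returns the information the user probably wants. The following is only a minimal sketch under stated assumptions, not Envision AI's actual code: the label names, the `Context` dictionary, and the `describe` function are all hypothetical, and the time and weather values are assumed to come from elsewhere (the device clock and a weather service).

```python
from typing import Callable, Dict

# Hypothetical context: values the app would obtain elsewhere
# (for example the device clock and a weather service).
Context = Dict[str, str]


def clock_handler(ctx: Context) -> str:
    # A clock or watch in the picture: the user most likely wants the time.
    return f"The time is {ctx['time']}."


def window_handler(ctx: Context) -> str:
    # A window in the picture: the user most likely wants the weather.
    return f"The weather outside is {ctx['weather']}."


# Hypothetical mapping from recognised labels to context handlers.
HANDLERS: Dict[str, Callable[[Context], str]] = {
    "clock": clock_handler,
    "window": window_handler,
}


def describe(label: str, ctx: Context) -> str:
    """Return context-based information if a handler exists, else a plain description."""
    handler = HANDLERS.get(label)
    if handler is None:
        return f"Looks like a {label}."
    return handler(ctx)
```

For example, `describe("clock", {"time": "10:15", "weather": "sunny"})` would answer with the time, while a label with no handler falls back to a plain "Looks like a ..." description.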

The app is still a beta, but available worldwide. We are constantly trying to understand what features are the most important and valuable and how we can continually improve them. So we would really love it if the community here can give the app a try and provide us with active feedback on what they think of it and how it can be improved.

P.S. We understand that we will immediately be compared to Seeing AI and other apps out there. What we can assure you is that we are really committed to listening to the community here and working with them to build an app that really helps them. We are working on this full-time and will be here every day to talk to you.

APP LINK: http://goo.gl/fptaYQ

Thanks and regards,
Karthik

Comments

By jane suh on Thursday, November 9, 2017 - 04:19

I’m looking for an app that recognizes handwriting. It would be great if this feature were implemented.

Hi Jane,

Recognising handwritten text is a very difficult technical problem that several people are working on but have yet to solve. The text recognition feature in our app does a pretty good job of recognising handwritten block text, so you can give it a try. We are, however, researching how we can improve this.

By Feliciano Godoy on Thursday, November 9, 2017 - 04:19

Hi,
I have used the app to take a couple of pictures. So far, everything is good. Is there any way to trigger the take-picture action by pressing the volume up button? This would be a great addition. More to come.
Regards,
Feliciano
For tech tips and updates, LIKE www.facebook.com/theblindman12v Follow www.twitter.com/theblindman12v

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Feliciano Godoy

Hi Feliciano,

Thanks for that feedback. We did use the volume button for pictures in one of our earlier versions but scaled it back as we ran into some bugs. We can revisit that feature and try to implement it again if that makes it easier to use the app.

Cheers,
Karthik

By Chris Smart on Thursday, November 9, 2017 - 04:19

Thanks for bringing this app to market. Personally, I'm most interested in recognizing bar codes on products, linking to any info on said product online (like cooking directions) etc.

By Chris Smart on Thursday, November 9, 2017 - 04:19

Voice guidance to help aim the camera would be great as well. If the app could detect the edges of whatever object has focus, and give directions for positioning before a picture is taken, that would be most helpful.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

Hi Chris,

That happens to be exactly the feature we are working on! We will definitely push out an update with a beta version of that soon. We believe such a feature can also be greatly useful for sighted people.

Thanks,
Karthik

By Geetha on Thursday, November 9, 2017 - 04:19

As someone living in the UK and still having no access to the Seeing AI app, your initiative is very welcome. I just installed the app and tried reading text on my computer screen, and it did a decent job! Are the results dependent on the amount of light in the room? I took a picture standing by a window, but it did not seem to recognise the window.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

Hi Geetha,

Thank you for your feedback. Can you tell me what exactly it said? We have programmed it such that when it recognises a window it tells you the current weather. It is possible, though, that when it is dark our AI does not recognise the window properly.

By Krister Ekstrom on Thursday, November 9, 2017 - 04:19

Hi,
Can this app be used to read displays in real time? I mean displays on things like coffee machines, washing machines, digital recorders, keyboards for music etc? If it can do that and do it better than Talking Goggles, then I'm all in.
/Krister

By MJ on Thursday, November 9, 2017 - 04:19

What is the gesture to stop speaking in large blocks of text? It seems to keep going until all the text is spoken.

By JeffB on Thursday, November 9, 2017 - 04:19

It would be nice if it used the VoiceOver voice instead of Samantha. I have the default voice on my phone set to Daniel and it is still using her. I really can't stand her voice. Also, it would be nice if it was in real time. This is one advantage that I think Seeing AI has. Another nice idea would be a way to recognize currency.

By steven carey on Thursday, November 9, 2017 - 04:19

Hi Karthik,
I've just downloaded this app and I'm very impressed. As a UK resident I cannot access the Seeing AI app and felt left out of the discussions that came out of the app being launched in July. However, I can now see what all the fuss was about and I'm beginning to think that the Seeing AI app developer missed a trick by not launching his app worldwide as some of us might not bother now there is an alternative.
I've now had a go at using it and have a couple of suggestions as follows:

1. My iPhone 6 was on 100 percent when I started using the Envision AI app. I've taken six photographs and am now on 88 percent. Therefore, I guess that the app requires a lot of processing power. Can you do anything about reducing the battery drain?

2. I'm not sure what others think but the random siting of buttons is a little confusing and I find them a little difficult to find. Could you not site the buttons at the bottom of the screen to match the standard Apple layout?

3. However, I do like the press anywhere in the middle of the screen, that could stay where it is.

4. Like another user, I took a photograph of an A4 sheet of text and I could not stop the voice talking. The app would benefit from a button to stop the voice speaking.

5. Like another user, I would like to see the addition of a barcode reader that can identify where the barcode is, read it and give an accurate description of what I am holding, including things like cooking instructions etc.

6. I would also like to see a currency button on the app. As far as I know, there is no app that can recognise the new UK plastic £5 and £10 notes.

7. I'm not sure if you use the flash when taking pictures in low light areas. If you do, is this automatic? If not, could you design a button to turn the flash on and off?

8. What is the significance of the survey at the start? I cannot understand why you need statistics about vision loss and age of user. Does the app set personal settings for users like me who are totally blind and 53 years old?

8. I am aware that Envision AI is in beta at the moment but how do you envisage the charging structure in the future? I would consider a sensible one-off charge for the app to be appropriate. Personally, I will not use an app that is expensive as I cannot afford such apps, and those apps that charge on a monthly basis or by camera shot are impractical.

I'll carry on testing but I think this app is great and can only get better.

Steve.

By steven carey on Thursday, November 9, 2017 - 04:19

Sorry Karthik,
I forgot one point. Like another user I would either like to be able to change the voice to a high quality UK male or female voice or use the standard VO Daniel voice. Is this possible? I guess most people would appreciate the ability to change voices to their natural language/dialect as it's easier on the ears to use a familiar voice.

Steve.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Krister Ekstrom

Hi Krister,

It can read the displays, but not in real time yet. We are working on a build right now that will have much better text recognition functionalities including that real-time text recognition. Give us about two weeks and we will be out with that update.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by MJ

Hi MJ,

Sorry, we haven't built in a stop and start feature in the text recognition yet. But it is coming! Give us a couple of weeks with it and we will have a much better text recognition experience.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by JeffB

Hi Jeff,

Thanks for all your feedback. The reason we chose Samantha was because in the Netherlands most people have a Dutch voiceover and that was doing a horrible job of reading English phrases. But as we see that many of you are demanding the high-quality voices, we will definitely fix this on priority. Expect an update with it soon.

We are also working on improving our text recognition experience by including real-time reading and document scanning features. We estimate that feature to be ready in a couple of weeks.

Currency recognition is a great suggestion too and we will add that to our pipeline.

Thanks for your feedback and you're welcome to provide more!

Cheers,
Karthik

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by steven carey

Hi Steve,

I really appreciate the time you took to provide us with such detailed feedback. We love it. I'll respond to all your points as follows:

1. I am really surprised by the battery drain as we have gone to great lengths to test and ensure that less processing happens on the phone. Can you maybe tell us how old your phone is?

2. Thank you for the feedback on the UI and yes, I believe we should move all the buttons to the bottom as that is more standard. I was wondering, are you finding it difficult to find them with VoiceOver?

3. Yes, we will keep the tapping the middle of the screen and maybe also add using the volume button as one of the users suggested.

4. Yes, we are building a much better text recognition feature that will do a better job of document reading, with a stop and start feature included.

5. and 6. Yes, barcode and currency recognition are in our pipeline. We will work on them once we improve the text recognition feature.

7. We do have the automatic flash that would activate in low light.

8. To be honest, we do not know what the pricing structure will be yet. Once we have built something that truly delivers value, we will experiment with different models. Thank you for your feedback on that. We will definitely keep it in mind when we reach that stage.

9. With the voiceover, we forced the text to be read in the standard US voice as it was getting confused with the Dutch VoiceOver voice that people here in the Netherlands use. But now that we realise how important this is, we are going to ensure that our next build fixes this issue and allows for a high-quality voice chosen by the user.

Thank you again, Steve. We hope you continue to use the app and send feedback our way as and when you have it, especially when we start pushing updates almost on a weekly basis.

Cheers,
Karthik

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by steven carey

I realised I missed answering the question about the survey. For now, its purpose is only for us to know what kind of users are using the app the most. We might use it as one of the criteria while setting the priority order for features to build.

By steven carey on Thursday, November 9, 2017 - 04:19

Hi again Karthik,
I have an iPhone 6 which is coming-up to its second birthday. I know the battery is coming to the end of its useful life but I do use other battery hungry apps on a regular basis and I don't have much of a problem with them. For example, I use Audible for about 5-hours a day and I only get battery losses of about 25 percent or so and Goggles is very similar. I use my phone all day on a regular basis and still have over 20 percent left by the time I go to bed. I'll keep an eye on the battery situation and get back to you with some further comments later.

I might have got my numbering system wrong in my previous message but you did not answer my question on collecting personal information about sight loss and age. Could you come back to me with an answer to that one?

Also, so that we have an app that does everything so as not to keep switching out, could you add a colour reader? I do use them but most are inaccurate at best and confusing at worst.

A great measure of how good an AI is, is to take a picture of an animal if you can get it to sit for long enough. Fortunately, my poodle loves a photo session and I managed to take a picture of him sitting in his bed. Like other AI apps I've tried, yours said 'cat sitting in a basket' which was brilliant apart from the poodle being a cat. He is silver with a long tail and long floppy ears but definitely not a cat. Do you think that the app will get good enough to distinguish dogs from cats in the future?

Steve.

By Dave Matters on Thursday, November 9, 2017 - 04:19

I used the in-app option to make some suggestions. After reading all the other comments, I want to re-iterate the suggestion for being able to read barcodes and have access to product information. Digit-Eyes type product info.

Also, how are everyone's experiences with taking shots of scenes? Mine aren’t the best. It thinks my stand fan is a microwave on a table. I tried the coffee pot and it thought it was luggage sitting by my refrigerator. Lol.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

Thanks, Steve. I'll double check the battery situation on my end as well and test it on more devices. I realise that it is important.

Yes, I missed it the first time but then answered it afterwards. I'll just paste the answer here again: "For now, its purpose is only for us to know what kind of users are using the app the most. We might use it as one of the criteria while setting the priority order for features to build."

Yes, adding a colour reader should definitely be possible. However, can you tell me what scenarios or situations you use it in? This would help me understand how best I can incorporate it with the features we have.

Your poodle sounds adorable and I am sorry our AI thought it was a cat. The AI works by training itself on millions of labelled images, including those of cats and dogs. So the AI can only be as good as the training images it is fed. Hence, as larger datasets of images become available, we have no doubt the AI will get better at differentiating between cats and dogs. That is what is most interesting about this technology right now.

By steven carey on Thursday, November 9, 2017 - 04:19

Hi again Karthik,
The situation with my poodle was quite interesting considering the app thought he was a cat but got basket right. The amazing thing was that his bed is blue material but shaped like a dog bed. So, I would have thought that was more difficult for an AI to recognise than a poodle. That's why I thought it was brilliant.

In terms of a colour reader, I use such an app to match colours for clothing. Us blind people used to utilize small buttons of different shapes to distinguish between coloured clothing, but colour recognition apps have meant these are mostly not used anymore. However, most colour reading apps are not accurate as different light conditions usually change colours slightly, mostly if clothing includes man-made fibres. So, I would have thought that an AI could be used to compensate for such colour changes and light conditions to give a pretty accurate reading. In addition, if you could build graphics into the colour reader that would be even better. For example, if I wanted to wear my blue and white striped tee with blue chinos I would need a colour reader that accurately says I have blue and white stripes!
Similarly, in the UK we use plastic bins on wheels and we have three: one for recycling which is yellow, green waste which is green and general rubbish which is a light grey. I'm always putting the wrong rubbish in the wrong bin and I get told off by my wife. A colour reader would do a brilliant job if it could accurately distinguish between the three colours. I don't use my current colour reader on the rubbish bins anymore because all three colours are so similar that the colour reader finds it difficult to distinguish between them, especially if the sun is directly shining on them. Another situation where a colour reader could be useful and I'm sure you can think of many more.

Steve.

Hi Dave,

Thanks, we just went through your in-app feedback. We will definitely work on the barcode reading and try to provide a good product information experience along with it.

The scene captioning is very experimental so do take it with a pinch of salt. At times it astonishes you with the accuracy of detail but can also say the most absurd thing at times. However, the feature has gotten considerably more accurate over the past year and we are really hoping for more improvements on that front in the near future.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by steven carey

Thanks again, Steve. I understand the colour requirement better now. Will add that to our pipeline and also put more thought over how best to implement it.

That said, our current scene captioning often does a good job of recognising colour of shirts and trousers. I have had it say "Looks like a man wearing blue and white striped shirt" and "Looks like a man wearing black pants and blue shoes". So I would encourage you to try that a bit and let me know how it goes.

By AnonyMouse on Thursday, November 9, 2017 - 04:19

Member of the AppleVis Editorial Team

As others have already mentioned, it is great to see that this is available for everyone worldwide! So, thank you!

Many of the things that could be improved have mostly been addressed by others on this post. Those would be barcode scanning of products and a currency reader, both of which would be useful for us.

There are some things that I am trying to grasp in how I could use this in an everyday situation. Much like with SeeingAI, taking a picture to have it recognized is nice, but I'm not certain how that really helps me. For example, it mentioned it can be trained to tell you that is a wallet. I'm not sure how that helps me when I know it's a wallet. I can pick that up and tell it's a wallet. Yes, I know that is a cup or that is a laptop. I suppose if I was stumped on an item that somebody gives me I think this could help me. Some of the pictures that I've tried were right on the money and others well... funny. One example is that I took a picture of my dog only to have it identified as a dog in a costume. Another was when I took a picture out of my window and it said it's a dog on a bench. Although, I admit I was surprised when at one time it noticed that it was a sunny day and gave me my forecast. That would explain why the Location permission was asked for. It would be nice to let me identify various food items without having to open them or mark each one of them with braille or some other method. In other words, make this a tool that truly would help us and not just say that is a wallet, if that makes sense? If it were to tell me that is a brown leather wallet, or that is a blue striped shirt rather than simply it's a shirt, I may see some potential in this.

The section that has potential is the text reading. As of right now we must take a simple picture of something and it will indeed read the text to me. It does a decent job but, like others have mentioned, it would be great to silence it when I want to. Otherwise, if I get a bad shot I am left listening to it ramble on with garbled text for some time, depending on how much I was trying to identify. The mention of real-time text reading will be interesting, as SeeingAI is alright but you must be really patient with it and make sure you have it in the right spot. It at least gives me an idea of what it is, and that will indeed be helpful.

The other potential is being able to train the app to recognize different faces. I did go through the process of training the app with 3 to 4 pictures of myself from various ranges and then labelled them as myself. However, I'm not sure how or where I can use this to recognize me. This has some potential if I want to identify people within pictures. That would be very helpful, but not for somebody sitting across from me whom I want to identify. That is just silly and not realistic as something we would do, as I know that is my wife or my son over there. But to recognize that it is my wife and one of my sons within a picture would be useful, as many descriptions these days only tell me that there are two people smiling in the picture on a sunny day. That is not helpful.

Lastly, you have mentioned at some point this will no longer be a free thing to use. Were you thinking this may be a monthly subscription or a set of shots would be charged in some amount of money? As I understand you must monetize somehow to recoup your cost for developing the software.

My only concern, to be very realistic: if this becomes a paid service, it will have to do a lot more than it does now to be worth every penny. There are a lot of alternatives out there, so it is getting complicated and hard to compete for the money we fork out. We have BeSpecular and BeMyEyes, where real people can help us in situations or identify things in ways that an AI could never do. As you have also mentioned, SeeingAI is indeed an alternative that is very similar to your app and is a fantastic app for being free, and will always be free. Lastly, SeeingAI and Prizmo Go offer not only fantastic OCR recognition but also pretty nice real-time text reading. The one thing you have going for you is that this is worldwide and the others are not.

This is not to criticize you in any way but merely to help you understand where we are coming from, so that you can be successful and this can become a powerful tool that we could use in everyday situations. Also, I have seen in the past where developers have used this as bait to help them fully develop an app and later charge for it. To some it is like they helped you so much to get where you are, only for you to suddenly turn around and say thanks, and now here comes the part where we will charge you for using this. This is a double-edged sword, as I know you are here to help us in having something that can be so fantastic to use and that we are willing to pay for, but for some... Well, there are some that will balk. This is more to let you know and to forewarn you of our community. ;)

Thanks again for making this wonderful app and I hope that this will only get better and more useful for us when it becomes more mature and where you will start to charge us to use it. I can't wait to see where this goes and I wish you all the luck and success behind this app! Thanks for reaching out and letting us know about the app and most of all to come here and reach out to the users and all their questions and suggestions to make this a dynamite of an app.

Take care!

By Ka Yat Li on Thursday, November 9, 2017 - 04:19

Hi:
I just grabbed the app today and so far, I find the text recognition useful. I had it scan a receipt and it did better than other OCR and AI apps. I noticed though that even though it got more text, it's not as forgiving as KNFB reader or Seeing AI. With those apps especially Seeing AI, I can point to objects and documents at an angle and still get some useful text. With envision, it either gives a little bit of gibberish or nothing at all. Not only would it be helpful to have instructions on where to point the camera, I would like to see it do better at adjusting for angles and tilts.

One thing that would be innovative if you can implement it is the ability to detect highlighted text or some sort of colour change when using text recognition. I would like to be able to use Envision to read the BIOS and know which option is highlighted. Highlighted text could be spoken in a higher pitch. Currently, no other apps do this, so I think it would be innovative if it can be implemented.
Thanks and glad you are open to feedback.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by AnonyMouse

Hi,

Thanks a lot for that very detailed feedback. We truly appreciate it and all of it is very well taken. In response to some of the points you had mentioned:

- You are right, simple object recognition like "this is a wallet" or "this is a chair" in itself is not useful. We thought deeply about it and realised that one way it can be valuable is if we attach a layer of context to it and instead of describing the object we describe the information the user might be looking for. This is the reason why taking a picture of a watch tells you the time and taking a picture of a window tells you the weather outside. We want to extend this to other objects as well; for example, taking a picture of a bus should read out the number and destination of the bus.

- What we have also tried to do is do descriptive scene captioning instead of object recognition. So instead of saying "it's a wallet" or "it's a table" it can, in fact, say "looks like a brown wallet sitting on a wooden table". This, however, is not very accurate at the moment and has the tendency to say really absurd things at times, but it should improve in the coming months.

Also, with custom object training in our app, you can train these objects with your unique label. So it can, in fact, say "Looks like John Doe's wallet" or "Looks like your wife's coffee mug".

- Another very experimental feature that we are testing that might make custom object recognition valuable is a "Lost and Found" feature. With it, if you have a pre-trained object like your wallet, you can ask the app to look for it. You can then continue to hover the camera on your phone to scan a surface, and when it detects your object in the frame it will beep with a frequency depending on your proximity to the object. We built an early version of it and it seems to work fairly well, but we need to test how it can be deployed.

- Like you mentioned, our current focus is on really improving the text reading experience on the app. We really want to combine the best features across all OCR services and provide a really seamless experience for that. You can expect an update from us on that in a couple of weeks.

- I like your feedback on the face recognition. We have in fact implemented it in a way that it combines face recognition with scene captioning, just like you mentioned. So after you have successfully trained a person's face (we recommend about 5 to 10 pictures from different angles and different background) try taking a picture of the person. You will see that instead of just saying the person's name it can say things like "Looks like Jennifer is working on a laptop computer". We believe this can also be extrapolated to pictures you already have. We do not yet have the option to upload your own picture for captioning, but it has now been added to our pipeline of tasks to do.

- We honestly do not know what our revenue model will be for this in the future. Like you said, we do need to make money to cover our expenses at some point, but I believe that might not necessarily have to be paid by you. We will explore different pricing models and sources in the future, but for now, the focus is on building something truly valuable. As you pointed out, this app has a long way to go before it reaches that; hence, for now, the complete focus is to engage with the community and be very agile in delivering the features they are looking for.

- We truly appreciate all your feedback and really do hope we can use them to build a truly powerful and valuable tool. We are loving the way this community is responding to us. We will be mindful of remembering and honoring that as we grow.
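
To make the "Lost and Found" beeping described above a little more concrete, here is a rough sketch of one way proximity could be turned into a beep rate. It is not our implementation and rests on stated assumptions: proximity is approximated by how much of the camera frame the detected object's bounding box covers, and the function name `beep_interval` and the `slowest`/`fastest` timings are hypothetical.

```python
def beep_interval(box_area: float, frame_area: float,
                  slowest: float = 1.0, fastest: float = 0.1) -> float:
    """Seconds to wait between beeps; shorter when the object fills more of the frame.

    Assumption: a larger bounding box relative to the frame means the phone
    is closer to the object.
    """
    if frame_area <= 0:
        raise ValueError("frame_area must be positive")
    # Fraction of the frame covered by the object, clamped to [0, 1].
    fraction = min(max(box_area / frame_area, 0.0), 1.0)
    # Linear interpolation: fraction 0 gives `slowest`, fraction 1 gives `fastest`.
    return slowest - fraction * (slowest - fastest)
```

With these defaults the beeps would speed up from one per second when the object is barely in view to roughly ten per second when it fills the frame. A real app might prefer a non-linear mapping or depth data instead of box area.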

Cheers,
Karthik

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Ka Yat Li

Thank you so much for that feedback. Yes, we are working on really improving the text reading experience on our app. We know our OCR is good, and if we build a nice experience around it for reading both short texts and long documents it can be a great implementation. Expect an update from us on that in a couple of weeks for sure.

About the highlighted text, we will really have to look into how to train our OCR to detect that. I cannot promise at this moment that it will be possible, but we will definitely investigate.

By steven carey on Thursday, November 9, 2017 - 04:19

Hi Karthik,
I've been reading some of the other comments and I agree that a live camera feed to read digital information would be really useful. One example is that I use an exercise cycle that has a digital panel giving speed, calories burnt, time programme has been running, pulse rate etc. As far as I know, most exercise cycles do this but none that I've come across give audio output for this key information. If you could somehow produce something using such a live camera stream that could read this information and give feedback periodically (perhaps adjustable in a setting mode) that would be useful. If you think about it, there are many other items that now have digital panels giving information that this app could support. For example, weighing scales, washing machines, digital radios, thermometers, the new Echo show and many, many more. Just think about the future where one could connect an iPhone directly to such items to read panels or indeed, wear a pair of smart glasses connected directly to the Envision AI app.... sorry, getting into science fiction now.

Steve.

By Lee on Thursday, November 9, 2017 - 04:19

Hi Karthik, downloaded this and have been playing around with it. I like it and look forward to the advances coming. In terms of the power drain mentioned in these comments: on an iPhone 6s with iOS 11.0.3 at 100% I took 12 photos. After that the power was still at 100%, so no drain on the phone.

By cool cat on Thursday, November 9, 2017 - 04:19

Hi. Good post AnonyMouse. For this app to be appreciated it has to separate itself and/or do something new or better. The lost and found idea has some potential. It's really cool that you come here and ask for feedback, Karthik. As for the battery drain, I have an SE on 10.3.3. I took 10 pictures and the battery went down 5%.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by steven carey

Hi Steve,

Yes, understanding the text and readings on different home appliances and being able to operate them is something that we are looking at. We tested our live-text OCR and it was reading text on certain appliances pretty well. We will need to focus on this problem more whole-heartedly before we can come up with a viable solution.

Haha, you will actually be surprised how far from science fiction and closer to reality smart glasses are becoming!

Hi Lee, thanks for the feedback. Yes, please look forward to advances we make in the coming weeks.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by cool cat

Thanks for that feedback! Yes, we have roadmaps for developing on both different and better quadrants. We believe the more we engage with the community the more features like lost and found we will be able to discover and develop. I have been loving all the constructive feedback here.

Thanks for the update on the battery. We will continue to do our best to optimise processing drain at our end.

Cheers,
Karthik

By Louise on Thursday, November 9, 2017 - 04:19

Great concept, but it takes a long time to read text. Real time text recognition is a must for me with these sorts of apps.
I am comparing to Seeing AI to help you understand a use case. With that app, I can point my phone into the pantry, and scan the canned food as I move the phone around.
In a meeting at work, I can quickly access a meeting agenda without taking its picture.

Another thing I found was that the app didn't mention that the weather was sunny when I pointed it at a window, it just said that there was a window.

I think you have a good idea, and I'll check the app again after the next set of updates, but it's not going to be my "go to" as it is.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Louise

Thanks, Louise! Yes, we are currently working on a version that will do live text recognition much faster as you mentioned. We should be able to push out that update, so please do check back then.

The window thing usually works well. I will look into its confidence level again and see if it needs to be tuned so that the weather is reported more often.

By DMNagel on Thursday, November 9, 2017 - 04:19

I am not sure if this has been mentioned, but I would like to be able to send photos from my own library to be identified.

By Janet Tuggey on Thursday, November 9, 2017 - 04:19

Hi

I was so excited to hear about this app. I downloaded and set it up, but when I try to take a photo it takes the picture, beeps a few times and then crashes. I'm using an iPhone 6. Feeling rather disappointed as I would dearly love to try this app out.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by DMNagel

Hi,

Yes, we are working on including an option to upload your own pictures.

Hi Janet,

This is quite surprising and disappointing! I am using an iPhone 6 myself. Can you please give us a way to contact you by sending an email to [email protected]? We will immediately reach out to try and understand why you are facing this crash issue. I am sure this would be something we would be able to fix remotely.

Thanks,
Karthik

By Karok on Thursday, November 9, 2017 - 04:19

Hi, I am glad to have gotten this app and to see that real-time OCR is of course possible. As I cannot officially get Seeing AI as of yet and am unsure if it would work with UK-based barcodes for groceries, could a UK-based database be included to ensure groceries are recognised along with cooking and other instructions? I would be happy to help beta test such a release.

By JeffB on Thursday, November 9, 2017 - 04:19

Hi, so you mentioned something about possibly pointing the camera at a bus and getting the bus number? I think this would be really great for bus terminals! Something that I think would be neat is if the app could recognize vehicles. When waiting for a bus I often hear a loud engine approaching and think it could be my bus, only to find out that it was a school bus or a truck. Then I feel silly standing there with my arm out. It would be neat if it could tell you what vehicle is approaching so you can be prepared, even if it was just as simple as telling the difference between a bus and a truck, for example, and maybe the color. This could be helpful too when getting an Uber or Lyft. Oftentimes we are told it’s a blue sedan or something but that really doesn’t help us. In terms of navigating, it would be cool if it could pick up things like a trashcan blocking the path or warn you of a puddle up ahead. Bumps and ice would probably be even harder to pick up, but letting you know if your path is clear would be neat too. Idling cars in a driveway? Knowing if there is a driver in the car or not would also be useful.

By steven carey on Thursday, November 9, 2017 - 04:19

Hi again Karthik,
The comment I made the other day regarding the use of smart glasses to operate the Envision AI app might not be such a bad idea.
If you look at the great comments from others there is a common theme expressed. Most people are interested in doing much more complicated things than just reading the time from a clock or checking what the weather is through a window.
We now see the need to read digital panels, check whether the right bus is coming along, check people's faces for recognition, make sure you have the right taxi from the colour and of course read text, whether that is a menu, foodstuffs at home, road signs or even newspapers. The one problem here, as others have suggested, is a social one. As blind people we don't want to draw attention to ourselves any more than we already do when we are using a long cane or when we trip up the kerb. So, holding an iPhone up ahead of you so that the camera can read the bus number or check who is coming your way with the app's face recognition, or holding your iPhone above your head to find the rice crispies on the top shelf in the supermarket, looks really weird and of course draws lots of attention.
I know this is not the right time to be talking smart glasses as you still have a long way to go with the development of the Envision AI app, but in the near future we will have to be thinking in terms of more futuristic methods of getting data into AI apps.

Steve.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Karok

Hi Will,

Thanks for that feedback. Yes, we will actively start working on the barcode feature as soon as we have fixed/improved our current OCR. We will make sure that the system works globally, including in the UK.

By Charlie on Thursday, November 9, 2017 - 04:19

Hi,

I live in Scotland, United Kingdom, and a number of banks here issue their own Pounds Sterling notes. Other apps which I have tried for currency recognition can identify Bank of England currency but do not recognise Scottish bank notes. Could you implement functionality within your app to permit us to "train" the app that it is looking at a £5, £10 or £20 note? This would help too when banks change the design, or images, on their notes.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by JeffB

Yes, the bus context is indeed something we are trying to build. We believe it's important to give people useful information and not just information. Our current scene captioning can already describe a car and its colour to an extent, but it is not reliable. We hope at some point we can open our platform to companies like Uber and Lyft so they can make their products more accessible through it.

Obstacle detection is rather tricky as it requires real-time detection and information processing. We had some ideas on how to deal with it earlier but will probably not focus on it in the short term. However, keep throwing such ideas our way as we are constantly trying to gauge what would be cool and useful to build.

Cheers,
Karthik

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by steven carey

Hi Steve,

Yes, we focused a lot on building this entire experience through a wearable camera initially and that is still what our end game is. We realise how unobtrusive accessing all this information through a wearable camera would be, and the one we initially designed was very cool and non-stigmatising in its aesthetics. However, we soon realised the high cost that comes with prototyping and producing hardware at scale, hence we are now looking to focus on and nail down our software experience and use the insights gained from that to build a hardware product at a much later stage. We hope we are able to reach that stage sooner than expected as our primary focus is enabling great user experiences.

By Karthik Mahadevan on Thursday, November 9, 2017 - 04:19

In reply to by Charlie

Hi Charlie,

Thank you for that great feedback. After reading your comment I tried to train some banknotes here using the object recognition on our app to see if that can be used as a self-trained banknote recogniser. Unfortunately, it is not very reliable. It was able to identify a note as a banknote every time but frequently confused the 10 and the 20 euro notes. Hence, I feel the right way to approach it would be to include a dedicated banknote trainer which allows you to specifically train banknotes. We will definitely work on this as it seems a much better way of approaching the currency recognition feature than building a general classifier for all currencies in the world. We will do some experiments on it later this week and update you.

On a side note, we had a user here who was using the OCR feature on our app to detect their banknote. Do try if that works for you as well for now.

By steven carey on Thursday, November 9, 2017 - 04:19

In reply to by Karthik Mahadevan

Hi again Karthik,
I was just wondering about the clock and window examples you gave for AI processing. I tested one of these, the clock idea, by taking a picture of my digital watch. I use a talking watch from the UK RNIB which has a digital watchface. Unfortunately, although the voice part of the watch is very accurate, which is fine for me, something has gone wrong with the digital watchface. If sighted people see it they always say 'your watch says the wrong time' and my response is 'I know but it doesn't matter because I cannot see it, I use the voice'! However, when I took the picture, the app gave me the correct time. Does this mean the Envision app doesn't actually use AI to work out what the time is but works out that the picture is a watch and then goes to the iPhone time and reads that? In the same way, when you take a picture of a window, do you get the weather from a weather app?
If this is the case, there might be a problem with the gap between the really inaccurate weather forecasts we get in the UK and what the weather really is.
For example, this morning I was going out so listened to the weather forecast on the BBC and checked the BBC weather app, which said it would be cloudy. However, when I went out it was raining! Alas, I didn't use Envision, but what would have happened: would I have got rain or cloudy?

Steve.