I don't know where to put this so I hope this is the right place.
I just typed "audio description" into YouTube and came across this.
I honestly don't like it. I feel like AI isn't there yet when it comes to describing things, and you can hear that the voice is AI if you listen carefully.
It's OK, I guess, for describing a nature doc or something, but I feel it doesn't have that human touch, so it wouldn't work for a horror movie or a comedy, for example.
The idea is interesting, but I hope that game studios and the like don't pick this up just because it's easy.
Anyway, here's their YouTube link: https://www.youtube.com/@Visonic-AI-m2c and here's their website: https://visonicai.com/
What do you guys think?
Comments
They should use the Microsoft Neural voices
Those Microsoft voices are so good. But again, someone needs to supervise how the TTS pronounces names etc. It shouldn't sound weird.
Also, I doubt this can be used for actual movies and shows. Adding AD to such content requires careful attention to place the AD at exactly the right spot, in between lines of dialogue. An AI can certainly analyse the dialogue and the gaps in it, but I doubt this one is doing that yet.
The page says it can analyse the movie for silences.
I like AI for what it can do for us, but then things like this come along.
I wonder how much practice they have at actually audio describing things. Do they have actual audio describers backing this product? I doubt it.
Their two demos show the exact same movie clip, one just slightly longer than the other, and I'm just not impressed at all.
Maybe I'm just old, but I just can't see a world where something like this would be preferred over a human narrator.
Games
Honestly, if it's a way to get more accessibility in games I'd be for it, even if we have to wean them off it later. A fair number of games are adding menu narration with no other accessibility features for us, and if this gives them another simple drop-in tool, at least that's more than we would get otherwise.
re: games.
The issue there is that a person's voice can help connect the story together. For example, in a fast-paced scene, the narrator might speed up slightly or sound a bit more excited, or whatever you need for that scene. This just sounds flat.
Also, giving devs tools like this will just make them lazy.
Re:Games
Long term I agree, but menu narration has already made them lazy: they add it in and think that's a box checked off the list. It's the people who manage the money we need to convince as much as anyone, though. They look at cost versus benefit, so if we can reduce the cost it'll be easier to make it start being a thing. The more games that add it, the more likely other devs are to take it seriously, and the more chance we have to prove that it's worth the investment.
Ultimately I see the quality of game audio description as secondary to making it exist at all. Sure, we're making a battle for ourselves further down the road, but it's still probably an easier process than having one holy hell of a battle to get human audio description as a widespread feature without an intermediate step.
I think it is awesome!
If audio describing can be made this easy and cost-effective to implement, we could go back and have the olden golden movies audio described. It surely will take away voice-over jobs; it is going to hit those artists hard. Honestly, I did not mind the voice of the audio description. It did not feel like it was done by AI, but I ain't a trained listener.
Potential
So it's interesting as a long-term step. However, the demos are short and not comprehensive: they put together description sentences from some unknown scenes, and there is no dialogue and no action. But I would like to know if this approach can be adapted for other description apps. Seeing AI already has a video description function, but as far as I know it pauses the video and then describes it. I wonder if they can apply something like this to that function.