Microsoft Reveals New AI Which Can Replicate Speech With Just a 3-Second Sample

Mike Sanders / 2 years ago

The speed at which new AI (artificial intelligence) is being developed is undoubtedly a matter of concern to many. Ultimately, it does feel that we’re going to have to start making some big decisions surrounding it. Specifically, we will need to steer its application more towards saving the human race rather than allowing it to eventually decide that things, in general, might run a little smoother without us pesky humans getting in the way.

So, take this as impressive or terrifying, but following a report via Ars Technica, Microsoft has just revealed a brand new AI system that is reportedly capable of accurately replicating a human voice based on just 3 seconds of sample audio.

Microsoft Reveals New Voice Replication AI

The system is known as VALL-E, and Microsoft has claimed that with just a 3-second audio sample it is able to generate fresh text-to-speech output that closely replicates the voice of the original source. In other words, it can sound like you and, what’s more, say things that you never actually said!

Microsoft, however, hopes that rather than being put to malevolent use, this new AI technology will provide better-sounding output for automated voice lines/commands. Specifically, it hopes for audio output that can be automatically tailored to specific regions (using some actual human reference samples) and, by extension, be much easier to understand in terms of localisation.

Admittedly, though, it could also seemingly be used to get you to profess your love for Microsoft and even its Windows 11 operating system!

What do you think though? Does AI technology like this impress or scare you? Let us know in the comments!