Microsoft’s VASA-1 can generate realistic talking faces from just one image.


Apr 19, 2024 - 08:00

This article originally appeared on BitcoinEthereumNews.com.

In a recent white paper, Microsoft introduced a new AI model that produces a realistic-looking, realistic-sounding talking head from nothing more than a still photograph and a voice sample. The model, named VASA-1, takes a single portrait-style picture and an audio recording of a voice and fuses them into a short video of a talking head with facial expressions, lip syncing, and head movements. The generated head can even sing, in the voice supplied at creation time.

Microsoft VASA-1 is a breakthrough for animation

According to Microsoft, the model is still in the research phase, there are no plans to release it to the general public, and only Microsoft researchers have access to it. The company did, however, share quite a few demonstration samples, which show stunning realism and remarkably lifelike lip movements.

Source: Microsoft.

The demos show people who look real, as if they were sitting in front of a camera being filmed. The head movements are natural, the lip movements match the audio closely, and very little reads as unnatural. The overall mouth synchronization is phenomenal.

Microsoft said the model was developed to animate virtual characters, and that all the people shown in the demo are synthetic, generated with DALL-E, OpenAI's image generator. If VASA-1 can animate an AI-generated face, it clearly has even greater potential to animate photos of real people, which should be easier for it to handle and look even more realistic.…
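
VASA-1 is not publicly available, so there is no real API to call. Purely to illustrate the input/output contract described above (one portrait image plus one audio clip in, a lip-synced talking-head video out), here is a minimal, hypothetical Python sketch; every name in it, including TalkingHeadModel and animate, is invented for this example and does not correspond to any released Microsoft code.

from dataclasses import dataclass
from pathlib import Path

@dataclass
class PortraitImage:
    path: Path            # a single portrait-style photo of the subject

@dataclass
class VoiceClip:
    path: Path            # a recording of the voice (speech or singing)
    sample_rate: int = 16_000

@dataclass
class TalkingHeadVideo:
    frames: list          # encoded frames with lip sync, expressions, head motion
    fps: int

class TalkingHeadModel:
    """Hypothetical interface mirroring the behaviour the article describes:
    fuse one still image with one audio track into a short talking-head video."""

    def animate(self, face: PortraitImage, voice: VoiceClip, fps: int = 25) -> TalkingHeadVideo:
        # A real system would drive lip movements, facial expressions and head
        # pose from the audio; this stub only shows the call shape.
        raise NotImplementedError("Illustrative only; VASA-1 has not been released.")

# Example call shape (raises NotImplementedError if actually run):
# model = TalkingHeadModel()
# video = model.animate(PortraitImage(Path("portrait.png")), VoiceClip(Path("speech.wav")))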
