Google has demonstrated how it can generate video from simple sentences using artificial intelligence (AI). The company detailed Imagen Video, a system trained on 14 million video-text pairs and 60 million image-text pairs from its own internal dataset.

The company touts the results and highlights what sets the system apart. According to a paper published by the company, Imagen Video offers a degree of controllability and knowledge of the world, and it can produce high-definition video as well as a range of videos and text animations in a variety of artistic styles.

(Source: Google/Disclosure)

According to the company, the technology has shown an understanding of depth and three-dimensionality. This makes it possible to create videos that simulate a drone flight, rotating around objects and capturing them from different angles without distorting them.

How does Imagen Video work?


Imagen Video first produces a low-resolution clip and then increases its definition algorithmically. Starting from a text description, the system generates a 16-frame clip at three frames per second and a very low resolution of 24 x 48 pixels.

It then fills in and upscales the frames, yielding a 128-frame clip at 24 frames per second and 1280 x 768 pixels that lasts a little over five seconds. That length is too short for large-scale commercial or educational use.
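
As a quick check of these figures, the sketch below works out the clip length before and after upsampling. The function and variable names are illustrative and are not part of Google's system; only the frame counts and frame rates come from the article.

```python
from fractions import Fraction


def clip_duration(frames: int, fps: int) -> Fraction:
    """Return the clip length in seconds as an exact fraction."""
    return Fraction(frames, fps)


# Figures quoted in the article.
base = clip_duration(16, 3)      # initial low-resolution clip
final = clip_duration(128, 24)   # clip after upsampling

# How much the frame count and the frame rate each grow.
frame_factor = 128 // 16
fps_factor = 24 // 3

print(f"base clip:  {float(base):.2f} s")
print(f"final clip: {float(final):.2f} s")
print(f"frames x{frame_factor}, frame rate x{fps_factor}")
```

Both stages cover the same 5.33 seconds: the frame count and the frame rate each grow eightfold, so the upsampling adds frames in between the original ones rather than extending the clip.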

Like Meta's Make-A-Video, Imagen Video distorts some frames and sometimes blends objects together in physically implausible ways. The research team plans to combine its work with Phenaki, a text-to-video system recently presented by Google that can render clips longer than two minutes at lower quality.

Source: Tec Mundo
