Google presents Imagen, its new AI for the creation of highly realistic images

Google has just introduced Imagen Video, an artificial intelligence model capable of creating highly realistic HD videos from words or short descriptions. Built with the aim of surpassing Meta's Make-A-Video, Google's AI generator can output 1280×768 HD video at 24 fps.

Although Imagen Video is currently in a research phase, expectations at the company are highly positive. Not only is it slightly superior to Meta's option, but it arrived just five months after the original Imagen image generator, which shows how quickly Google is moving and how important Imagen Video is to the company at the moment.

Main features of Google Imagen

To achieve this, Imagen uses the Text-to-Text Transfer Transformer (T5). Presented in 2020, T5 was originally designed to unify language tasks by treating both inputs and outputs as text strings. In Imagen, it serves as the text encoder whose output guides image synthesis.

Another distinguishing feature of Imagen is that it delivers a more precise level of detail than other text-to-image systems, such as DALL-E 2 and VQ-GAN+CLIP.

The model first produces a 64 x 64 pixel image and then upscales it to 256 x 256 pixels and finally to 1024 x 1024 pixels, forming what is known as a cascaded diffusion model.
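
The resolution steps mentioned above (64, 256 and 1024 pixels) can be pictured as a chain of upscaling stages. The following is only a minimal sketch of that chaining idea: the function names `upscale_nearest` and `run_cascade` are hypothetical, and simple pixel repetition stands in for the learned super-resolution diffusion models, which this code does not implement.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values

# Resolutions named in the article, from the base image to the final output.
STAGES = [64, 256, 1024]


def upscale_nearest(img: Image, factor: int) -> Image:
    """Stand-in for a super-resolution stage: repeat each pixel `factor` times.

    Assumption: a real cascade would run a learned diffusion model here.
    """
    out: Image = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def run_cascade(base: Image, stages: List[int] = STAGES) -> List[Image]:
    """Return the image produced at every stage of the cascade."""
    if len(base) != stages[0] or any(len(r) != stages[0] for r in base):
        raise ValueError("base image must be square and match the first stage")
    outputs = [base]
    for prev, nxt in zip(stages, stages[1:]):
        # Each stage takes the previous stage's output as its input.
        outputs.append(upscale_nearest(outputs[-1], nxt // prev))
    return outputs


# Synthetic 64 x 64 gradient used as the base image for illustration.
base = [[(x + y) % 256 for x in range(64)] for y in range(64)]
for img in run_cascade(base):
    print(f"{len(img)} x {len(img[0])}")
```

Each stage only has to multiply the side length by four, which is the practical appeal of a cascade: no single model has to produce a 1024 x 1024 image from scratch.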

The project team also highlights other key results of the research behind this AI, such as the development of its own new Efficient U-Net architecture, which is more efficient in both computation and memory.

The training data for Google Imagen Video comes from the publicly available LAION-400M image-text dataset and “14 million video-text pairs and 60 million image-text pairs,” according to Google.

Imagen Video has also been “trained” to filter out sexually explicit and violent content, as well as social stereotypes and cultural biases. For this reason, Google has emphasized that, for the moment, the model will not be open source or publicly accessible, in order to prevent misuse. The company also acknowledges that many aspects still need improvement.

Just a few weeks ago, the nonprofit AI research company OpenAI introduced DALL-E, a new AI system that can also convert words into realistic images. It can also edit photographs based on a written request and remove elements such as shadows, reflections and textures.
