Generative AI: State of Play, Part 2
Try the “Machine Learning 101” learning lab to get a quick understanding of
machine learning terminology, applications, and linear and logistic regression.
In part 1 of this series, “Generative AI: State of Play, Part 1,” we examined some key topics regarding Generative AI. First, we looked at the landscape of applications available, including text and image creators. We followed that with a discussion of the need for humans to validate the output of these types of applications, which finally led us to an evaluation of the limitations of the technology. Now, let’s continue our journey in Generative AI and talk about content rights, the fight for attention, and a little bit more about image generation.
Content rights
First, let’s take a look at the licensing and user rights allotted for different types of applications.
“Midjourney” is a popular generative art application. New users can generate up to 25 images for free. Content generated by non-paying users is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License, which means you cannot use those images for commercial purposes. For paying customers, all rights to the images are transferred to the customer, so the images can be used commercially. There is one interesting restriction: if you use the service to benefit a company with annual revenue of more than $1 million, you must use the corporate package.
In contrast, the “Your Content” section of OpenAI’s Terms of Use states: “…hereby assigns to you all its right, title and interest in and to output.” The rights to the content belong to the user even if the content was generated with the free credit for new users.
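The Midjourney rules described above amount to a small decision rule. Here is a minimal Python sketch of it, based only on the summary in this article. The function name, the return strings, and the exact handling of the $1 million threshold are illustrative assumptions, not Midjourney's actual terms or API.

```python
# Threshold quoted in the article; the real terms may differ or change.
CORPORATE_REVENUE_THRESHOLD_USD = 1_000_000


def midjourney_usage(is_paying: bool, annual_revenue_usd: float = 0.0) -> str:
    """Return a rough usage category for generated images (illustrative only)."""
    if not is_paying:
        # Non-paying users: non-commercial license, per the article's summary.
        return "non-commercial only"
    if annual_revenue_usd > CORPORATE_REVENUE_THRESHOLD_USD:
        # Work benefiting a company above the threshold needs the corporate package.
        return "commercial, corporate package required"
    # Paying customers below the threshold can use images commercially.
    return "commercial"


if __name__ == "__main__":
    print(midjourney_usage(is_paying=False))
    print(midjourney_usage(is_paying=True, annual_revenue_usd=50_000))
    print(midjourney_usage(is_paying=True, annual_revenue_usd=2_000_000))
```

Under the same reading, OpenAI’s terms would need no branching at all, since the output is assigned to the user in every case.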
The fight for attention goes on
Attention is what all content makers are fighting for. Once you have it, you can share your content and engage and influence people. By attracting the attention of people with a specific profile, you can, among other things, promote relevant products, services, information, and ideas, or run hidden advertising. None of this is new.
Currently, content delivery could be improved. Generative AI cannot distribute its content on platforms and social networks on its own. This is still done by people who have an audience. As a rule, people share content through their channels/pages or on the same platforms where they create content.
Stock photo sites have also responded to creators’ growing interest in Generative AI. Getty Images and Shutterstock have updated their rules and do not accept AI-generated images. Separate platforms and sections are being created for generated content (for example, Shutterstock Generate).
I believe that people will create automated artists and creators that, depending on trends and external factors, will create relevant content and publish it on the appropriate platforms. Luo Tianyi is one example of a digital artist/creator in the entertainment sector who has attracted people’s attention. Her images and style are based on computer graphics, and some of her content is generated. The fact that she is quite well known suggests that this kind of content will likely have no trouble gaining popularity and attention.
Image generation
Image generation was one of the early applications of this technology, and the public was quick to adopt these tools. Companies and projects include Stability AI/Stable Diffusion (open-source), Midjourney, OpenAI/DALL-E, and Google/Muse.
Below are two generated images for comparison. Description: “CHRISTMAS SLEDGES ON SNOW WITH PRESENTS”:
OpenAI/DALL-E
Midjourney
The more detailed the description, the better the image. The following is an example of an image that resembles a professional photo.
The most interesting aspect of the last set of photos is how detailed the prompt must be to generate something like that, and how much knowledge both the trained model and the prompter need to have. The description for it looks like this: “PIXER ANIMATION design, CHRISTIAN SLIDES IN THE SNOW WITH GIFTS, epic beautiful scene, cinematic, post production, depth of field, cinematic photography, cinema, color correction, professional color correction, 55mm lens, exquisite detail, sharp focus, fine detail, long exposure time, f/8, ISO 100, shutter speed 1/125, diffused backlight, award-winning photography, realistic photography, hyper-realistic, unrealistic engine, realistic lens flare, realistic lighting, lettering, hyper-realistic, 8k, detail, photography, cinematic lighting, studio lighting, beautiful lighting, Accent lighting, global lighting, global screen space lighting, ray tracing, global lighting, optics, scattering, glow, shadows, roughness, shimmer, ray tracing, ray reflections, ray reflections, lumen reflections, screen space reflections, diffraction gradation, GB offset, scan lines, ray tracing, Ray Tracing Ambient Occlusion, Anti-Aliasing, FKAA, TXAA, RTX, SSAO, Shaders, OpenGL shaders, GLSL shaders, Post Processing, Post-Production, Cel Shading, Tone Mapping, CGI, VFX, SFX, insanely detailed and complex, hyper-maximalistic, elegant, hyper-realistic, super-detailed -v 4.”
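A prompt like the one above is essentially a subject followed by a long, comma-separated list of style and camera modifiers, with some terms (such as “ray tracing” and “hyper-realistic”) repeated. The Python sketch below shows one way such a prompt could be assembled and de-duplicated. The helper name `build_prompt` and its parameters are hypothetical, and the trailing `-v 4` is copied from the quoted prompt as-is; check the exact option syntax against the tool’s own documentation.

```python
def build_prompt(subject, modifiers, suffix=""):
    """Join a subject and style modifiers into one comma-separated prompt."""
    seen = set()
    parts = []
    for term in [subject, *modifiers]:
        cleaned = term.strip()
        key = cleaned.lower()
        # Skip empty terms and case-insensitive repeats such as "ray tracing".
        if not cleaned or key in seen:
            continue
        seen.add(key)
        parts.append(cleaned)
    prompt = ", ".join(parts)
    # Trailing options (the quoted prompt ends with "-v 4") are appended unchanged.
    return f"{prompt} {suffix}".strip()


if __name__ == "__main__":
    print(build_prompt(
        "Christmas sledges on snow with presents",
        ["cinematic", "55mm lens", "f/8", "ISO 100",
         "ray tracing", "Ray Tracing", "8k"],
        suffix="-v 4",
    ))
```

Whether repeating a term changes the generated image depends on the model, so de-duplication here is a readability choice rather than a claim about how any particular generator weighs repeated words.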
There are also models and applications that can combine different images, edit images, apply masks, and generate a new image based on prompts as you can see in the examples below.
Custom diffusion
Muse
Dreambooth
Video generation
Have you ever seen a video of the man in the screenshot below?
I have seen many videos with him on different topics, with different voices, and different accents. This reminded me of the period when you could see the same Joomla templates being used on different websites on the same day.
On the one hand, the availability of such companies and tools will make video production much cheaper. On the other hand, it makes content created by people, with the participation of people, and for people more unique. I am sure that over time, content created by people will be valued more highly than AI-generated content. For example, if you want to watch a video with real people recorded using takes, scripts, and everything else, you will have to pay more.
Beyond Content
There are several other interesting projects using this kind of technology. One is called Galactica, which works with scientific articles and searches for citations. As for Cisco-related topics, we’re starting to see network engineers and infrastructure automation developers use large language models (LLMs) to accelerate their work on tasks such as troubleshooting, error handling, and config file validation. You can find an example of such a project here.