AI-Generated Imaging – The Next Step in Computational Photography – Medium | Ad On Picture

For creatives, hard work and effort go into a great painting, an award-winning photograph, or a well-directed video. Everything begins in the mind of the creative. This talent is what gets a creative recognized in their industry, whether that is media, arts or entertainment. What if a computer could do the creative thinking and create the image as well?

This is now possible with AI (Artificial Intelligence) software. You can tell an app what you want to render and it will do it automatically, without the user having to do anything else. Imagine just typing or saying what you want to create and having the software do it all for you. Not only would that disrupt the art and photography industries, it could also create a new kind of imaging industry for creatives.

Other techniques include genetic breeding (which uses StyleGAN techniques), random image generation and, soon, image editing in natural language.

Note: At this time, most of these techniques are still experimental or have limited applicability.

Frameworks that offer machine learning and computational imaging allow users to create their own images with an app. It’s now so easy that anyone who has the app installed can do it. One example is the DALL-E 2 system from OpenAI. It lets users speak or type in natural language, which the computer then interprets and converts into a rendered image.

Users simply enter a description of the image they want the software to produce. For example, a user can type or say:

The software using DALL-E 2 produced the following image (Note: This is not a man-made illustration) from this description:

(Source: OpenAI)

The rendering was done by the software without additional human help. OpenAI developed the system using a neural network, a model “trained” on a large set of images that the software draws on. It also uses another AI technique called Natural Language Processing (NLP) to understand the description given by the user (typed or by voice) before generating the image (the original DALL-E was built on a version of OpenAI’s GPT-3 language model).

The amazing thing about it is the accuracy and speed of image generation. What makes it even better is that the user doesn’t have to do anything other than provide a description of what they want. If it becomes a mainstream app built into smartphones, anyone will be able to create their own images by typing or speaking a description. This can be for fun and entertainment. A more serious use of this technology is creating commercial content (e.g. stock photos or images, memes, thumbnails, etc.).

A creator on YouTube could use the app to create a thumbnail for their video. Students can use it to create images for class reports or projects. To get more creative, content creators can use this feature (instead of stock photo sites) to create exactly what they have in mind. This can easily be done via an app installed on a smartphone. This is where the possibilities for many applications begin to open up.

Are you a photographer looking for models for a shoot? What if you could easily create your own model online? In fact, you can. The faces in the following photos are not real people; they were computer generated.

None of the people in these photos are real people (Source: This Person Does Not Exist)

These images were generated online by This Person Does Not Exist. Anyone can create a “fake” person. The app uses a style-based Generative Adversarial Network (GAN) image generator. Having been trained on a set of many faces from around the world (different face types, ethnicities, races, etc.), the software spontaneously generates a face. No parameters are required. The software creates a random face when the user clicks the “Generate” button.

This might be frightening if you fear that a face this realistic could replace a real human. There are even virtual model applications that can display not only the face but also the body (including hands, legs, torso, feet and other details). This means a product retailer can use virtual models in their next advertising campaign to save on the cost of hiring a real model. Whether or not this actually becomes big business could depend on market preferences and the current state of the world.

Virtual Model Zoe (Source: CM The Model Agency)

CM Models has added a number of virtual models to its website. One of its virtual models is called Zoe. You can tell that Zoe isn’t a real person by the fact that “she” looks very computer-generated. The point here is that she can be used as a model in the virtual worlds of the emerging metaverse. Zoe can also appear wearing NFT (Non-Fungible Token) versions of popular fashion brands. Zoe probably won’t be appearing on the runway during Fashion Week, but she will be appearing at a virtual event in the metaverse soon.

Why use virtual models instead of a real person? There are different reasons. One could be the lack of an available human model when an advertising campaign needs one. The campaign can temporarily replace a human with a virtual model. If there are more lockdowns due to health restrictions, producers and creative directors could turn more to virtual photography (e.g. remote photo shoots) and virtual models. The app can even replace a real photographer and model, which can save time and money.

Users of the app ArtBreeder can explore the applications of AI to create unique images of faces, art and characters. The app uses machine learning to generate stunning images. It’s not just people; it can also generate objects and pretty much anything imaginable. The software generates images based on their distinctive traits or “genes” (similar to genetic engineering). It can combine these “genes” to create new unique images.

A generated image (“Eve-0222”) with an ancestry map showing the various “genes” that the image inherited from other users’ images (Source: ArtBreeder).

This app is mainly used for fun. It is a kind of social app as users can use other people’s images to generate new images. Images are credited to their respective creators, and the app makes them available for other users to use. Users who generate a lot of images are offered a paid subscription with many additional benefits (based on tiers). So this is an example of a commercial AI app.

Users can also create characters with illustrations (source ArtBreeder)

Users can combine different images (breeding them like genetic information) when creating their own image. The result can then be traced using an “ancestry map” that shows the “ancestry” of the created image. Users can create as many portraits and styles as they like while networking with other users. It is both creative and collaborative, giving users an interactive experience.
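The idea of breeding images and tracing their ancestry can be sketched in a few lines of Python. This is only an illustration, not how ArtBreeder is actually implemented: the `Image` class, the `breed` and `ancestry` functions, the trait names and the weighted-average mixing rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """A hypothetical generated image described by numeric 'genes'."""
    name: str
    creator: str
    genes: dict  # trait name -> strength between 0.0 and 1.0 (assumed scale)
    parents: list = field(default_factory=list)


def breed(name, creator, a, b, weight=0.5):
    """Mix two parents' genes; `weight` is the share taken from parent `a`."""
    traits = sorted(set(a.genes) | set(b.genes))
    genes = {
        t: weight * a.genes.get(t, 0.0) + (1 - weight) * b.genes.get(t, 0.0)
        for t in traits
    }
    return Image(name, creator, genes, parents=[a, b])


def ancestry(image, depth=0):
    """Return the ancestry map as indented lines, crediting each creator."""
    lines = ["  " * depth + f"{image.name} (by {image.creator})"]
    for parent in image.parents:
        lines.extend(ancestry(parent, depth + 1))
    return lines


# Example with made-up images and creators.
portrait = Image("portrait", "alice", {"age": 0.2, "smile": 0.8})
sketch = Image("sketch", "bob", {"age": 0.6, "line_art": 1.0})
child = breed("child", "carol", portrait, sketch)
print("\n".join(ancestry(child)))
```

Because every image keeps a reference to its parents, the credit for each contributing creator can be recovered by walking the tree.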

When creating a new image or face, users can use existing images created by other users. There are sliders that allow the user to “add genes” by changing the appearance of the image (Source: ArtBreeder).

In art, this can be a way of tracing the digital origins of a creative’s work. It is primarily intended for creating digital content, not physical content from the real world. With this system, artists can be credited and, in the future, even compensated with royalties if someone wants to use their work for commercial purposes.

Retouchers and anyone using a photo app on their smartphone will soon be able to edit their photos using voice commands. This is made possible by integrating natural language with imaging software. For example, a user can simply say:

“Brighten photo”

The app understands the word “brighten” from its trained vocabulary in the language it recognizes. The image is then brightened. For a more advanced user, there needs to be a way to control the granularity of the features. You can say:

“Brighten photo +12”

The app then adjusts the brightness by 12 out of, say, 100 levels. There will still be a manual way to brighten the image, but using AI capabilities that can understand natural language brings more convenience to editing.
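As a rough sketch of how such a command could be handled, the Python below parses the spoken or typed text and shifts the pixel values. The command grammar, the default step of 10, the 100-level scale mapped onto 8-bit values and the function names are all assumptions for illustration; a real app would use a speech recognizer and a proper imaging library.

```python
import re

# Hypothetical grammar: a verb, the word "photo", and an optional signed amount.
COMMAND = re.compile(
    r"^\s*(brighten|darken)\s+photo(?:\s+([+-]?\d+))?\s*$", re.IGNORECASE
)
DEFAULT_STEP = 10  # assumed change when the user gives no number


def parse_command(text):
    """Return a signed brightness change on an assumed -100..100 scale."""
    match = COMMAND.match(text)
    if match is None:
        raise ValueError(f"unrecognized command: {text!r}")
    verb = match.group(1).lower()
    amount = int(match.group(2)) if match.group(2) else DEFAULT_STEP
    if verb == "darken":
        amount = -abs(amount)  # "darken" always lowers brightness
    return max(-100, min(100, amount))


def adjust_brightness(pixels, change):
    """Shift 8-bit grayscale values; 100 levels are spread over 0..255."""
    offset = round(change * 255 / 100)
    return [max(0, min(255, p + offset)) for p in pixels]


change = parse_command("Brighten photo +12")
brighter = adjust_brightness([0, 100, 250], change)
```

Values are clamped at both ends, so a bright pixel cannot overflow past white.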

Users can edit their photos in natural language or with voice commands using an app (Photo credit: Victor Freitas)

In the truest sense of natural language, a user can say what they need however they want. There is no fixed syntax, as there is in computer code. Instead, the software analyzes the vocabulary and the sentence to correctly interpret the command. This is like interacting with a voice assistant to complete a task.
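A very simplified way to picture this is keyword spotting: the software looks for words it knows anywhere in the sentence instead of requiring a fixed command format. The Python sketch below assumes a tiny hand-written vocabulary and a default amount of 10; real natural language systems rely on trained language models and are far more capable than this.

```python
import re

# Hypothetical vocabulary: each known word maps to (setting, direction).
VOCABULARY = {
    "brighten": ("brightness", +1),
    "brighter": ("brightness", +1),
    "lighter": ("brightness", +1),
    "darken": ("brightness", -1),
    "darker": ("brightness", -1),
    "sharpen": ("sharpness", +1),
    "sharper": ("sharpness", +1),
    "softer": ("sharpness", -1),
}


def interpret(sentence, default_amount=10):
    """Find the first known word in a free-form sentence.

    Returns (setting, signed amount), or None if nothing is recognized.
    """
    words = re.findall(r"[a-z]+", sentence.lower())
    number = re.search(r"\d+", sentence)
    amount = int(number.group()) if number else default_amount
    for word in words:
        if word in VOCABULARY:
            setting, direction = VOCABULARY[word]
            return setting, direction * amount
    return None


request = interpret("Could you make this photo a little brighter?")
```

Here “Could you make this photo a little brighter?” and “Brighten photo” would lead to the same adjustment, even though the sentences are structured differently.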

Artists and photographers can think of these new apps as tools to add to their arsenal. They can help with image generation, while the end result still depends on the user’s creativity. An artist can generate an image that they have in mind and then continue working with that image to create their work. Photographers can let the app process their image (in real time) to get the best possible result, the hallmark of computational photography. We are already seeing this in some smartphones (e.g. iPhone) that apply AI techniques in image processing.

You can also view AI as a threat to job security. If an average person can create amazing images in no time, why pay an artist? Who needs a photographer when you can take your own photos with your smartphone? Many people are already taking their own portraits or selfies with advanced features in smartphone cameras. These are disruptions that bring about a paradigm shift that forces people to make adjustments again and again.

The truth is that none of these technologies will instantly replace a good artist, photographer or model. These creatives have skills that are in demand no matter what. AI will be an alternative to the norm, at least until the technology improves. It has an advantage in terms of speed, cost savings and convenience. Despite these advantages, a creative director is more likely to work with a human model and a photographer than to use a virtual model for a campaign. Artists will still be in demand, because no computer can yet replace their vision.

These AI systems are also not at the same level as their human counterparts. They are examples of narrow AI, built for specific tasks, and not something that has surpassed the human ability to be creative. Computers can create stunning images with what’s available in AI, but they’re still prone to error and may not be able to produce the content users are looking for (depending on the dataset they were trained on).
