I am delighted to publish this discussion of the widely popular AI art apps between our award-winning mobile artist and resident tutorial editor, Jerry Jobe, and award-winning mobile artist Meri Walker (iPhoneArtGirl). It is very interesting. Enjoy! (Foreword by Joanne Carter.)
The mobile art world has recently been overtaken by artificial intelligence (AI). Apps like Wombo Dream and StarryAI take an input text prompt and create wonderful images from it. Naturally, after the initial experimentation has taken place, conversations have arisen on the implications of this new method of creating images.
This isn’t a new conversation, of course. AI tools have been in existence for years, helping mobile photographers and digital artists with tasks such as resizing and background erasure. Preset painting styles are a form of AI, and are omnipresent in apps like Deep Art Effects, Prisma, Vinci, Painnt, BeCasso, and ArtCard. Voila! AI Artist deals entirely with faces, allowing you to create a caricature, cartoon, or renaissance portrait. PortraitAI uses AI to reconstruct an input face from various classic paintings. Dreamscope allows you to influence the results by inputting an additional image into the machine to copy its “style” onto your image.
All of these apps have been derided at times by the digital art community, especially those who come from an analog art background. This is understandable – any task you have spent years training yourself in, when performed in minutes or seconds by a machine, seems cheapened. Attitudes towards them soften over time, mostly for two reasons.
First, the results are usually obvious. Styles in all of these apps have canned colouring, textures and “brushstroke” looks. Artists quickly learn that they want to modify the output to make it look more like their own, using filters as a tool towards a finished image rather than the image itself. Second, the underlying image is provided by the artist. A photograph, a texture, a drawing, an abstract collection of shapes. Artists creating from a photograph quickly learn that there is no way to save a bad photo, so you’d better start with a good image.
That brings us back to text-to-image (TTI) apps, and how this conversation changes for these apps. TTI apps collect their own starting images, then warp and collage them according to a style. By taking away the original image from the artist, many issues arise. Those issues are the subject of this discussion.
The plan is to pass this back and forth between me and Meri Walker (iPhoneArtGirl). She, unlike me, has many years’ experience in analog and digital photography and art. We look at things from a very different perspective – our minds work in different ways – and I hope this conversation will be a fruitful one.
I’d like to start my part by pointing to an online conversation I had recently with two different people because it brings up two aspects of the issues with TTI apps. Person A asked about the source of the images – since the AI would do an image search, like Google, what would be the likelihood of it picking up a copyrighted work? Every artist is concerned about their work being used without permission. I explained that any image is cut up and warped beyond recognition, then merged with other sources as well. I proposed an experiment of using “Statue of Liberty” as a prompt. Since the results were only reminiscent of Lady Liberty, there was no chance of matching it to a particular source. The copyright of the source image(s) does not matter in the least.
Person B hit upon my main issue with TTI images: In what way do they belong to us? In what way are they ours? My answer to this is not a legal one (Wombo says while you are able to use their results, so are they). My issue is a personal one.
I am simply not able to call the results from a TTI app my own. I can’t just crop, resize, sharpen, and tone and then present them. I didn’t really DO anything to make them. However, I love the possibilities presented, using them as a background, texture, or even a compositional inspiration. I think that many who are experimenting with TTI apps right now will come to the same conclusion. After the “wow, look at this!” phase, they will settle down to using it as a tool to improve their own creations.
It was mid-December when I first saw Jerry post an image he had made using WOMBOdream on social media. After examining it carefully and thanking him for the heads up about the new app, I did just what I’ve been doing for the decade we’ve been learning together in the mobile art community. I went to the Apple App Store, downloaded WOMBOdream, searched for any other apps that might be found under AI Art, downloaded two others, and started playing with them all.
The WOMBO image outputs were so weird I wasn’t quite sure what I thought or felt about them. They reminded me of a book of poems I was given. They were composed by a young man who couldn’t speak or write. He “dictated” the poems to his mother as she held his hand and slowly moved it across a lapboard covered with the letters of the alphabet. The words of the poem were created when he paused as they reached a letter he wanted to use. The syntax and meter of the poems were quite odd and the punctuation totally irregular. But each one had a plausible “message” and I found them moving, however strangely formed. Having raised a multiply handicapped child, I learned long ago to appreciate a wide range of human differences and to celebrate the efforts of others who couldn’t achieve the standards of performance I held for myself.
This was the mindset I brought to my WOMBO experiments. I described them as naive, wild, and yet somehow plausible. Because they didn’t meet the standards of execution or representation I had been educated to expect from visual imagery, I judged them the way I would have judged my son’s drawings or the poems by the handicapped poet. I cut them a lot of slack.
To begin with, I picked a WOMBO image from dozens it generated using one of my prompts. I chose to edit it into something that met my standards for a competent representation of the title (the prompt) by adding some marks of my own and changing the lighting. When I posted the image on my IG account, I credited WOMBO because it’s a new tool. And, besides, it only seemed right to give my followers a way to understand how an image like that one could end up in my image feed. As different as my mobile images are, one from the next, that one stood out like a sore thumb to me and I figured others would see it that way, too. I wanted to share my excitement about the new tool without misleading viewers to think I made it myself.
As I generated more images using WOMBOdream and NeuralPaint, I found myself wanting, more and more, to edit the images I received from the synthetic generator, if only to increase my personal self-expression. Anything else seemed like image theft. Admittedly, it’s one thing to steal from a human creator (by using a “free” stock image generated by another human creator in an image I say I authored). It’s quite another to appropriate a visual image generated by a machine algorithm. Nevertheless, I still see it as theft to take a synthetic image mathematically generated by a machine and call it my “artwork.” Regardless of the emotional outcries of people claiming they created the images they’re now posting after feeding their “unique text prompts” into the machine, it’s still true that the machine made the image based on all the other images, made by other human beings, that have been put into its database. It’s not an artwork executed by the text prompter. (The simple definition of a “visual artwork” I use is an image that results from work done by a visual artist.)
Within ten days, I found myself getting increasingly frustrated waiting for WOMBODream and other synthetic image generators to complete images that satisfied my curiosity. I got bored quickly with the limited colors and styles. I became less and less tolerant of the incompetent drawing, painting, and compositions. I felt more and more like I was trying to enjoy what I was doing instead of actually enjoying it. So I stopped.
At this point, after my last thirty days of experimentation, my participation in conversations in AI art groups, and a half-dozen serious chats with friends who are longtime mobile artists, I have to say that I see the current round of mobile “AI art apps” the way I see video games. I say this based on the simple fact that they are both types of human-computer interactions defined by the program running on the computer (desktop or mobile) and not the human being. The human experience of visual image creation is defined and confined by the machine’s terms of engagement. Users put in the prompts and then we wait. Users “win” when we get the results of the machine’s program. Not the other way around. Many people enjoy these kinds of games. I’m just not one of them. It’s too much like playing slots.
I also see AI image-generators the way I see dog training. In both instances, a competent human being engages with another party (a dog or a machine) for the purpose of getting the other party to perform what the human uses language to direct them to do. When training a dog, the prompt is oral, but the engagement is much the same. When we direct Rover to “roll-over” on command, what we get is a series of successive approximations of the behavior we want. At first, most, if not all, are wrong. With increased training, the dog learns to get closer to doing what we want. Rewards help speed up the process. We “win” if and when Rover actually rolls over instead of landing half-way on his side or just lying on his back to get a treat AND Rover recognises that’s what you wanted from your prompt. You know this because he can do it again when you prompt him the same way.
Human beings can and do glean a sense of accomplishment and satisfaction from playing this way with our pets. The successive approximations can be funny and endearing. But our prompts don’t make the dog roll-over. We don’t do it. They do it themselves, usually to get our reward. It’s a one-sided power game.
In the case of a text-to-image AI app, the human user inputs a series of text prompts and may – depending on how the database was built and the algorithm constructed – receive a satisfying approximation of what we expect to see in the image output, based on the elements we specified. We may repeat this process many times, attempting to get better approximations of what we imagined when we chose the words to input. But, even with repetition, we don’t make the machine “understand” us or “express our intentions.” The machine simply does what it was programmed to do: It copies and assembles visual elements in a defined space based on its training to recognise the language we used and composes the elements based on images previously made by other human beings that were fed into its database. Then it rolls its image out to the user.
Again, this is a fun game. For a while, I found it a pretty good way to generate some dopamine for myself by thinking up some language, typing it into the engine, and waiting to be surprised by what popped out. But it wasn’t me drawing, painting, photographing, and composing elements I made or surprising myself with my creative process. It didn’t take long for me to tire of the game and go back to using my own skills and processes, which net me a lot more consistent satisfaction.
I want to say I am feeling sadder and sadder as I see more people posting and claiming authorship of “artistic” images on social media when all they did was accept and celebrate a relatively incompetent drawing made by WOMBOdream. I have to say I’m confounded when I see people raving about how therapeutic it is that a synthetic image-generator took their creative intentions and generated just the image they needed. This is personifying and anthropomorphizing a human interacting with a machine. It’s fantasy. It’s not human image-making.
At this point, all the AI art apps I’ve investigated appear to me to do the opposite of liberating human creativity. They are not creativity tools. They turn human users into passive viewers, not active makers in an image-making process.
At best, the current round of AI apps offers users free use of mostly incompetently rendered images and allows them to present the images as their personal “artistic” effort, since the license to use the apps allows them to “own” the machine output. This is just the opposite of what the best of our mobile imaging apps do. I think it’s dangerous play and I am not excited to see the app developers exploiting users’ time to train their machines while leading them to see what they are doing as participating in a “collaborative imaging process.”
From where I sit, now 14 years into this new movement, I don’t see these AI apps helping users to build their skills at drawing, painting, photographing, collaging or composing meaningful visual images. In fact, I see them putting users into what they, themselves, have celebrated as “obsessive” and “addictive” states. Like we need any more addictions right now. If visual communicators use their limited time on Earth to put keywords into search or image-generation engines, grab images they didn’t make, lay filters on them and call them their own, they have just lost a ton of time they could have been using to develop their own skills.
We don’t all have to be Leonardo to make meaningful, communicative visual images. But if we want to derive sustained or increasing satisfaction from our creativity, we need all the time we have to practice using tools that help us make our own images.
I can certainly understand Meri’s consternation at how TTI apps are currently being used. It does seem that users are replacing their artwork with machine-generated images rather than enhancing their artwork. This really hits home when users are entering words to instruct the AI about camera settings like depth of field. When an AI is mashing shapes together to get an approximation of a hydrangea, it is not competent enough to put the correct portion of the image into focus and blur the rest.
At the same time, it isn’t as though AI is totally valueless. It needs to be approached as a tool, able to perform tasks that you need more quickly and easily than you could yourself. If I’m wanting to draw a perfect circle, I reach for a compass. If I’m digitally creating on a computer, I reach for the Shape tool. If I ask a TTI app for a “perfect red circle on a blue field”, it can’t do it. So we need to find the uses for which AI apps are the correct tool, and use the apps as tools to achieve our own artwork.
Here’s an example: while experimenting with WomboDream, after seeing an image of Venice, I entered the words “submerged gondolier”. I got a result that implied the figure was only partially submerged. It was only a very vague implication, and it really wasn’t a good image overall. But I thought, can I fake a “partially submerged” look using my other apps that I actually control? To map out the necessary steps, I used a “forest path” image from WomboDream. So I used WomboDream in two different ways: as a source of inspiration and as a source for a mock-up of a technique. Now when I encounter one of my own images suitable for partial submersion, I’ll know how to go about it.
Another example arose from entering the prompt “lady minimalist”. I got back a mostly grayscale image with a few inky shapes that certainly suggested a woman’s face in profile. To me, a grayscale automatically suggests using it as a mask. By masking in colors and textures, as well as an additional circle, I created “She Loves the Sunsets”. I used the image as a basis for my own work, but only as a stencil. In a way, the machine cut the stencil, but I was the one who gave meaning to it.
I am seeing what Meri pointed out: that a few artists are becoming obsessed with text entry as art. This is dangerous, both to your vision as an artist and to your pocketbook. Dreamscope, which not only restyled images but stored the results on their servers, is quietly going out of business while still taking monthly fees. Users will lose years of work.
But if you treat AI as a tool, only using it to help construct your own work, it’s probably okay. Look at it as using an air hammer rather than a cold chisel – both have their place, and neither is creating art.