The AI Image that has me Confused, Befuddled… and Scared
Stable Diffusion, Attack on Titan and Junji Ito
I need help!
I have been trying to understand an image for many days now. My investigation has come to nothing, and I have found no leads. However, the rabbit hole this image led me down tells a worthwhile story about the challenges AI poses to our perception of reality, our understanding of our own memory and our interaction with the outside world.
Some time ago, I created an image using Stable Diffusion, an AI image generator. Stable Diffusion has been trained on billions of captioned images from the internet and uses natural language processing (NLP) and neural networks to generate new images. Some results are great while others, such as those that ask the AI to recreate people’s faces or draw distinct body parts like fingers and toes, are not so great. There is still great controversy in art circles about AI art generators competing with human artists, especially since the AI has been trained on images made by human artists without their consent.
The AI Art Dilemma:
One major concern artists have is that if an AI is trained on enough of an artist’s images, it can copy their style and technique. An artist’s style and technique are what they sell and what enable them to make a living. After a thirty-second description of each artist’s style, if I put three images from Greg Rutkowski, Dan Mumford and Junji Ito in front of you, you would easily be able to tell which artist drew which. Rutkowski’s images will be the ones with medieval themes, full of dragons, wizards and giants, painted in a classical, painterly style with small, latticework strokes. Dan Mumford’s images will be the ones with neon colors and harsh, distinct lines, and Junji Ito’s art will be black-and-white horror manga.
These artists have spent a lifetime crafting their style and use their technique to painstakingly create single pieces of art after many hours (or days) of toil. You can imagine their frustration when some AI system trains on their artwork and can then recreate their images to a reasonable degree within seconds.
A Picture Worth a Thousand Words:
This time, I asked Stable Diffusion to create a scary image in the style of the famous horror manga artist Junji Ito. I left it to the system to interpret what scary meant. Within a few seconds, Stable Diffusion presented me with some horror artwork in the style of Junji Ito. This wasn’t a Google search where the software trawled the internet and pulled up relevant images. Here it was generating artwork from scratch, tailored specifically to the parameters I had set for it.
The images had themes that were consistent with the violent and skin-crawling artwork that is the hallmark of Junji Ito. I scrolled through the images of worms crawling out of people’s faces and eyeballs popping out of sockets. The themes of visible and sometimes supernatural violence seemed a bit mainstream to me and so I started looking for something different. That is when I came across a couple of images with characters smiling wickedly. I think one reason these images caught my eye was that I had recently watched the horror movie Smile and the idea of taking something as innocuous as a smile and making it into something evil really resonated with me.
This is the image I liked best:
I liked the understated horror that the simplicity of this image conveyed. It was only after some time that the actual reason it struck a chord with me became clear. I had seen the character before, and it had been engaged in very violent and frightening behavior.
I had seen this character in the Japanese animated show Attack on Titan. I have started watching Attack on Titan many times and dropped it after a few episodes each time. This character showed up somewhere in the first three episodes. It’s what the show calls a ‘Titan.’ A Titan is a giant, smiling humanoid creature that eats humans whole. This particular Titan shows up in one of the earlier episodes, when Titans break into a walled human sanctuary, the event that sets the story of the show in motion.
An image is burned into my mind of the creature from the AI image ambling into the human sanctuary with ominous music in the background, a giant smile plastered across its face. To me, it seems that what Stable Diffusion did was take a frame of Attack on Titan that was stored in its database and tagged as scary and wicked. It then rendered this frame in manga style and colors and presented the result to me.
That seems like an efficient mechanical solution for processing queries of this sort. The problem is that now I cannot find this creature in Attack on Titan. My mind tells me it has to be there. Beyond my recollection, the character in the image perfectly fits the profile of a Titan from the show. However, I have rewatched episodes of Attack on Titan and I cannot find this creature.
When I couldn’t find this character in Attack on Titan, I decided to reverse engineer the AI image. I ran it through a visual search in Google to see what results I would get. Interestingly, the search results directed me to Junji Ito, the artist whose style I had asked Stable Diffusion to emulate. Google concluded that this image most closely resembled his work but could not direct me to any particular drawing that the human eye could discern as being similar.
That is where we stand. There is an image my memory tells me is from Attack on Titan. It fits the mold of the show. It’s not there. I then ran it through Google, and Google says this image is likely to be Junji Ito’s. There’s nothing similar to this artwork in Ito’s galleries though.
Could it be a bit of both? Or could it be a completely new, perfectly rendered, original AI image? I don’t know. That is the great mystery of AI art and it will become more profound with time. AI art will make us doubt our memories; it will make us question our perception of our surroundings to the extent that it will warp our understanding of reality. As the algorithm becomes more powerful, the human will pale further. And further.
Bonus:
There was another smiling image the art generator made. Here it is:
I decided to forgo this image. While I think the horror quality of this image is greater than that of the first one, I think the lack of eyeballs lends it a sense of the supernatural that reduces the suspense factor.