With technology continuing to move on at a swift pace, there’s been plenty of recent discussion as to whether digital renders can ever truly replace product photography. Taking this one step further, is it possible that one day, artificial intelligence could simply create images without needing any input from a photographer or digital artist at all?
As photographers, we often marvel at how amazing modern technology can be, how magical that new “must-have” camera feature is, or how smart the image-processing software has become. I don’t consider myself to be especially old, but when I think back to using a manual-focusing 35mm SLR (because that’s all we had to use, not because I’m a hipster) and compare that experience to the incredible face detection or eye detection autofocus on modern mirrorless cameras, it’s hard to believe these huge technological advances have happened within my lifetime. Sitting in my living room and controlling the lighting and home entertainment with my voice, or video-calling a friend in another country on an iPad, are things that my child self would have considered science fiction. My smartphone is significantly more powerful than any computer I had access to before I was well into adulthood. In the grand scheme of human history, the time it’s taken us to get from the first commercially available camera for the general public to smartphones with very efficient digital cameras in the pockets of most people in the Western world is amazingly short.
DALL·E mini
This week I saw some funny images posted on social media from a project called DALL·E mini. They were crude images of very random things like Joe Biden eating a hot dog or spiders wearing sombreros. Silly images on the internet are nothing new, but these were supposedly produced by artificial intelligence. Some looked like simple drawings or cartoons, while others looked like renders lifted from an early-2000s video game. None were what I would consider realistic. Naturally, I wanted to find out where these images had come from, and a short Google search took me to DALL·E mini. This Transformer-based text-to-image generation model was designed by Boris Dayma, Suraj Patil, Pedro Cuenca, Khalid Saifullah, Tanishq Abraham, Phúc Lê, Luke, Luke Melas, and Ritobrata Ghosh.
DALL·E mini is very simple. You type in a short text prompt, then the AI, which has been trained on unfiltered data from the internet, gets to work and produces nine images based on that prompt. These images usually vary quite a lot from each other, but each represents the AI’s interpretation of your input, based on data from the internet. Right now, it’s not especially fast, taking between two and four minutes to produce images that are of questionable quality at best. After playing for far longer than I should have, I can see that it’s nothing more than a meme goldmine right now, but as a concept, it’s fascinating, with exciting future possibilities.
The model is intended to be used to generate images based on text prompts for research and personal consumption. Intended uses include supporting creativity, creating humorous content, and providing generations for people curious about the model’s behaviour. Intended uses exclude those described in the Misuse and Out-of-Scope Use section.
It’s worth noting that these images are created by an artificial intelligence trained on unfiltered data found on the internet, and that each set of basic images is its own interpretation of the text prompt a user gives it. It’s also worth considering that people on the internet are using their own creativity and imagination in asking this AI to create things for comedic effect. If you’re planning to look at the discussion board or try the image generator yourself, read the bias and limitations text provided by the dev team, and be aware that some people on the internet are jerks who will find it funny when AI produces questionable or offensive images.
While the capabilities of image generation models are impressive, they may also reinforce or exacerbate societal biases. While the extent and nature of the biases of the DALL·E mini model have yet to be fully documented, given the fact that the model was trained on unfiltered data from the Internet, it may generate images that contain stereotypes against minority groups. Work to analyze the nature and extent of these limitations is ongoing.
The Future of AI-Generated Images
It’s probably safe to say that no photographers will be losing their jobs to AI any time soon. This technology does, however, raise some questions about what the future of imaging might look like. We now live in a world where stock images are available online in seconds for anyone who needs a generic image. Sure, stock images were taken by a creative professional who will make some income from them, but what will happen when machine learning gets to the point where some generic images can be created by AI? Who owns the rights to those images? Could this one day replace a large portion of the stock image industry, to the detriment of stock library photographers? Could we one day see renders of products or places produced entirely by a machine algorithm being used for commercial purposes?
Memes and silly images aside, I wanted to see how close this system is to creating a lifelike landscape, so I gave DALL·E mini the simple text prompt “beautiful landscape” to see what it would make of it. Here’s the image it produced this morning. Watch out, landscape photographers! The machines are coming for your jobs!
I appreciate that from the look of the images produced today, it seems like a stretch to think it could ever replace a professional photographer, but 30 years ago, an iPad and FaceTime were the stuff of science fiction, yet now we all carry tiny powerful computers with high-megapixel digital cameras in our pockets every day. The possibilities for the future are exciting or terrifying, depending on your point of view.
Renders, which are still created by human beings, are taking over from product photography in some places. Is it only a matter of time before digital images are so lifelike that we won’t need real-life photographers in as many situations? Is it possible that there won’t be a need for commercial photography at all one day?
What do you think about renders or AI replacing photography? Is this technology exciting or worrying? How far off might a legitimate commercial use for this technology be? Let me know in the comments.