Even if I didn’t work in technology, I would have gotten wind of how fast AI is evolving by now. Things are moving fast; no week goes by without another news story about an AI breakthrough. AI is… everywhere. From assisting medical professionals in interpreting scans for signs of cancer, to detecting hairline cracks in concrete for engineering maintenance, to the latest hit, the chat phenomenon that is ChatGPT: a construct of pattern learning and slick interaction skills that was even able to get a B- on a Wharton Business School MBA exam, and a gift to students and professionals everywhere. The growth and adoption curve of this kind of technology is often described as exponential. Even if it doesn’t fit the definition of exponential to the letter, the progress is remarkable and impressive.
Intelligence, or not?
The I in AI stands for Intelligence, and I don’t think that is exactly the right term for what AI is, but I’ll stick with it in this post. The Artificial part, of course, is spot on, because that is exactly what it is. These AI systems have no clue what they are actually looking at, classifying, or assessing. But AI does more than that. Several companies have applied AI technology to image and video creation. One of them is Dall-E, the cousin of the text-based ChatGPT technology. Dall-E is a system that can create realistic images and art from a description in natural language. No code needed: just type what you want to see, and Dall-E creates it for you. The emphasis is on creating. This is not a search engine that finds an existing image among the gazillions of images that float around the internet. It creates one for you, from scratch. Type in ‘a cat playing a guitar, on a beach with cotton candy’, and it will give you exactly that.
You can tell it to create it in the style of Monet, van Gogh, or give it any other instructions to get the image to your liking. No photons have been captured in this experience. The image does not exist, and yet it does.
In a way, we are already very used to this, as so much of our entertainment contains artificially created elements. Full-blown worlds like those in Avatar have been artificially created, and we find them perfectly acceptable. We also don’t blink when we see overlays of data on top of sports fields when we are watching TV and cheering on our favorite team. The one we are truly oblivious to is face filters in social media apps. In the world of photography, we can replace the sky with a different one with just a mouse click. AI-generated content is already everywhere. And yet, the era we are about to step into is fundamentally different. In the examples above, artificial content is added by others, or we select from a series of options. But we mere mortals do not create anything ourselves. This is about to change. The implications of this wave of AI technology can hardly be overstated. I believe we are entering a phase that will redefine what visual creation means. The same applies to text, but since the topic of this blog is photography, I will focus on the visual element.
AI inside our cameras
Before I go further into why the world of photography is about to change, we need to talk about the AI inside the cameras we use. The most obvious example is auto-focus. Even basic cameras today have some form of facial-recognition software guiding their auto-focus. High-end cameras can detect several different kinds of subjects for focusing purposes: not just the faces of humans, but those of cats and dogs as well. You can even assign an eye preference when taking photos of birds. But it doesn’t stop there. Cars can be tracked by the auto-focus this way, and different sports can be selected for auto-focus assistance. It feels like magic, which is often the case with new and groundbreaking technology.
And then there is of course the technology that is present in smartphones. I will call out two examples. Photos taken with smartphones, while still taken in the old-fashioned way by pressing a shutter, undergo a large number of enhancements to get the most out of the tiny lens and sensor that sit inside. Large portions of the code that powers the operating systems of our smartphones are dedicated to creating the best possible image. We hardly notice it anymore. Some people purposely shoot RAW photos on a smartphone, but 99% of people will stick with the enhancements made by the phone software. The other smartphone example is the same kind of recognition magic that also powers the auto-focus in newer cameras. It also works when searching your collection of photos on your phone or in the cloud. Type ‘cat’, and all photos with a cat in them appear.
Some of this technology has been around for a while, so why do I believe that we are entering a new phase where AI will alter photography as we know it? The AI technology described above falls into two categories: image creation without photons, and photo-taking enhancements. Let’s start with the second category, as it is more of an extension of something we are familiar with, before dealing with photon-less photography, which is far more disruptive. A great real-life example is just how good Sony's bird auto-focus is. Or Canon, with their 'eye-control' auto-focus. This works by tracking the eye of the photographer, and the camera will focus on whatever the photographer is looking at through the viewfinder. In the words of professional sports photographer Lukasz Skwiot: "A typical example is where you have several players in the frame and want to follow one of them," he explains. "All you have to do is look at the one you want, and the camera will focus on that player and track him even if other players appear from the side or come between the player and the camera. Problem solved!"
Let's take this one step further. I want you to think about a Roomba, one of those autonomous robot vacuum cleaners. (Or even better, think about one of those robot dogs from Boston Dynamics.) Now imagine some sort of tripod attached to its back so that it stands upright. On top of the tripod sits one of the next generation of Sony A1 cameras, or any other top-of-the-line model from any of the major brands, 3-4 years from now. Not only will this camera have all of the magical auto-focus capabilities I mentioned above, it will also have AI that recognizes memorable human moments. Project the image of the Roomba with a tripod and camera onto the scene of a birthday party. The camera will know whose birthday it is, and as it moves around, it will look for angles to capture the event with the birthday person in most photos. It moves around the room to set up the angles for composition and the depth-of-field effects that generate the most likes on social media. It doesn’t get tired. It doesn’t need to go to the toilet. It captures the event flawlessly. This sounds far-fetched, but it really isn’t. We already have robots delivering things in hotels and restaurants. And if you think about it, the combination of facial-recognition software and photography ‘smarts’ based on AI-learned patterns of good photography is not that far away.
The blue pill, or the red?
The way we capture events is about to change. For photographers, the experience of taking the photograph and the photograph as the outcome are about to be split. Just like so many other things that are or will be automated, in most cases robot cameras with AI will be better at capturing birthdays, weddings, and corporate events. For these events, outside of the photographer, nobody really cares how the photos get taken. People just want to have nice photos. Think about self-driving cars. To get from A to B (the outcome), they are perfectly fine. For people who enjoy the driving experience, not so much. For photographers, this is very similar. We enjoy the experience of taking a photo. We like to think about focal lengths, composition, and light. But in many cases, the people being photographed don’t care about these things. They just want the outcome.
Even in a slightly less automated scenario than the birthday example I described above, we will soon get to a point where the automation in cameras is so fixed on the outcome that it takes the joy out of the experience. The threshold will be different for different people. Myself, I am already quite bored with the help I get from most modern mirrorless systems today, because a very large part of what I enjoy about photography is the manual experience, not just the outcome. I can imagine all this automation is a gift for sports photographers, for example: they have a job to be done, and this helps them do it. But even here, I would think twice about career options, as I think robotized, AI-powered cameras will be able to do this much, much better in the near future.
This will become a very interesting challenge for professional photographers. For intimate settings like weddings, I can imagine people preferring a human being instead of a Roomba on steroids. But we’ll see where this goes. For hobby photographers, the question is different. It will be about how much of the experience they value. Casual photographers who like to take snaps of family on vacations and so on will likely embrace the capabilities. But for some more serious hobbyists, this might alter the way they photograph. Very much like the resurgence of vinyl records, hobby photographers might look for gear that puts the experience front and center. This can be done by looking at analogue gear, a trend already very much underway and pushing the prices of second-hand film cameras and new film rolls ever higher. This is also why Leica recently re-issued the classic M6, and why Pentax has announced it is looking into developing a new film camera. There is clearly a desire in the market for photographic tools with very little assistance. It is one of the reasons I enjoy my Leica M so much. It is also behind my recent ‘step back’ in digital cameras by picking up a Pentax K-1. I might even look at shooting film myself. I think we will see more camera manufacturers develop cameras focused on manual operation with few aids, whether digital or analogue.
As digital cameras become more and more advanced, they become more and more utilitarian. Which is great for the outcome, but not great for the experience. This is the counterintuitive aspect of modern humans: we go searching for experiences that are less utilitarian. Seen through the lens of Homo Economicus, there is no reason whatsoever to bake your own bread, since the economic machine provides mass-produced bread on almost every street corner. And yet, millions of people bake their own bread, because they enjoy the artisanal, hands-on experience of taking simple ingredients and working them into a loaf. The recipe, pun intended, for enjoyment has a large chunk of experience in it, not just the outcome of the bread. Photography is no different. I think Leica and Pentax are brands that understand this very well.
Do we need photons?
Let’s return to the first category of photographic change: photon-less photography. This will create an even bigger rift between the experience of taking a photo and the photo itself, the outcome. Here, technology will evolve incredibly fast and put tools into the hands of the public with which they can describe any scene they want and get a perfect photo back. From a photography perspective, the technology is still in its infancy, but I can already tell Dall-E to create a ‘mountain landscape with rugged peaks, some bits covered in snow, mirrored in a lake with flowers in the foreground’. You can see the result below. This is hardly something that would find its way onto a wall in my house any time soon, but just imagine 3-5 years of the incredibly rapid development in this field. It will become photo-realistic, and you will not be able to tell whether an image was AI-generated or shot outside.
Let me reiterate the point about photon-less photography. These images created by AI do not exist in terms of captured light. Not a single photon is involved in their creation. The ‘art’ created this way is already more than acceptable. The photo-realistic output is still quite coarse, as you can see from the example, but landscape photos created by AI will reach the same level as what the best landscape photographers can produce. But without the travel, the waiting, and the enduring of the elements. You just type what you want to see, and AI will create it. This brings me back to the point about experience vs. outcome. If you are interested in a beautiful rendering of a landscape to hang on your wall, or for a magazine to accompany an article, or as a background for advertising, you might be better off using AI-generated images. They will be cheaper, and better tailored to your needs, because you can create them completely unrestrained by the limits of weather conditions and light. But if you enjoy landscape photography as an experience, you will still want to head out into nature and get that shot that is unique and that only you could have made, because you were truly there at that time. Landscape photography will be less and less about the actual image; it will be about enjoying the experience of the trek to the destination and capturing the scene in front of you, while the image itself can be generated by AI, without limitations, in a couple of clicks if needed.
The fork ahead
It will be very exciting to see how all this develops over the next few years. If camera manufacturers are paying attention, and I am sure they are, they will build a generation of cameras designed for the experience first. The base level of photographic capability in cameras today is already more than good enough for 95% of purposes. It only makes sense to develop a set of cameras that emphasizes the experience side more than the outcome side. But it will take bravery to step away from the AI-powered ‘arms race’ that is underway. The disruption is already happening; the question is which camera manufacturers will embrace it and differentiate themselves via the experience, and which ones will be swallowed up by AI. The fork in the road will come. Which road will you take?