White Paper

AI and Machine Learning for photography: what does it offer to consumers?

March 5, 2020  Amani Sharma

First, let’s position Artificial Intelligence (AI), Machine Learning, and Deep Neural Networks. These terms are often used interchangeably, but in essence they are nested building blocks. AI is the umbrella term for any computer software that does something “intelligent,” as its name suggests. Machine Learning is a subset of AI in which you define success criteria and the machine learns from data how to complete a task. Deeper still (pun intended!) are Deep Neural Networks (or deep learning), a subset of Machine Learning. For photographic imaging, these networks are a set of algorithms that have set new accuracy records in areas such as image recognition, editing, and processing. In practice, AI means multiple things, and it is usually a means to an end rather than an end in itself.

Figure 2: What are the leading AI use cases in your business? (% of respondents)

Today, AI technology is making a splash in many sectors, including consumer goods and retail. Most sectors report that their AI use cases are linked to customer care and support: building connectivity and responding to customer needs more efficiently.

When it comes to the creation of photo products, there are a few areas where AI is indeed an excellent tool to achieve value and enhance the customer experience and other areas where AI still requires major funding and intricate development considerations. Here are some examples of AI’s value in our domain:

  • Context extraction and face/content recognition make it possible to identify people, determine what is in a photo and how people feel, and detect zones of interest for automatic cropping. Google Photos is a great example of machine learning at work.
  • Assessing the quality and pertinence of photos, and automatically improving images, to make automatic suggestions after photos are uploaded. Products like Perfectly Clear by EyeQ or Adobe Lightroom reflect this.
  • Grouping related photos allows for better storytelling by avoiding placing photos from two different events on the same page. (Important to note: this is usually better and more cheaply served by grouping on date taken and time delta than by AI.)
  • Geo-tagging images by analysing the content of photos (i.e. the pixels, not the metadata), even when the camera has no built-in GPS, to make travel projects easy to compose.
  • Filters such as automatic “beautification” and artsy effects, which most manufacturers now offer in camera software and many apps.
  • Chatbots and automated support that provide help based on frequently asked questions.
  • Automatic project creation, such as automatic layout. Here AI is not yet the winning candidate compared with existing coded scripts or a human designer, and in the simplest cases traditional algorithms can often do a reasonable job for less money.
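The "date taken and delta" grouping mentioned in the list above can be sketched in a few lines. This is only a minimal illustration, not Mediaclip's implementation: the function name `group_by_time_gap` and the three-hour threshold are assumptions chosen for the example, and the capture times would normally come from each photo's metadata.

```python
from datetime import datetime, timedelta


def group_by_time_gap(timestamps, max_gap=timedelta(hours=3)):
    """Split capture times into events wherever the gap between
    two consecutive photos exceeds max_gap (an assumed threshold)."""
    groups = []
    for ts in sorted(timestamps):
        if groups and ts - groups[-1][-1] <= max_gap:
            groups[-1].append(ts)   # close enough in time: same event
        else:
            groups.append([ts])     # large gap: start a new event
    return groups


# Hypothetical capture times: a morning outing and a dinner six days later.
photos = [
    datetime(2020, 3, 1, 10, 0),
    datetime(2020, 3, 1, 10, 20),
    datetime(2020, 3, 1, 11, 5),
    datetime(2020, 3, 7, 18, 30),
    datetime(2020, 3, 7, 19, 0),
]
events = group_by_time_gap(photos)
print([len(event) for event in events])  # [3, 2]
```

No image analysis is involved here, which is why this kind of grouping tends to be cheap compared with content-based approaches.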

In summary, currently in photography, AI performance is optimized for image creation and editing; the algorithms are principally designed to simplify and enhance the image quality and editing process, per image or as a group/category. 

At present, Mediaclip’s Research & Development team is investigating credible and economical approaches for selecting large sets of images from multiple devices (phones, cameras) and archival environments (Google Photos, Facebook, computer) and creating logical, complex content, like a photobook. Our investigations aim to uncover how AI can create real, measurable value for business owners and consumers by improving the photo selection process and providing better context for automated product creation, while still being guided by design decisions and a variety of experts. AI still has significant financial and environmental costs, and the most important metrics to track are how it affects conversion rates and user satisfaction.

Can AI interpret human perception for photo selections and projects?

The AI editing software contains sets of instructions about the optimal quality of an image, in terms of content, reproduction, density, lighting and even cropping. These instructions have been assembled under a specific set of algorithms; these sets of instructions rely on 150 years of photographic experience and quality monitoring.

For example, the Luminar AI team indicates they collaborated with top photographers to train their software neural network, so it is almost like each image gets analysed by this team of photo experts. However, the final result may not be pleasing to the user and the final decision to accept the correction needs to be made per image, as the software may misinterpret the image content and its desired impact in the storyline. 

Selecting photos and completing a personalized project is the name of the game for retailers and consumers alike. At the same time, we want to ensure the best experience for our users, and ensure that they have the highest degree of control, transparency and efficiency while working on more complicated high-engagement projects like photobooks. AI and Machine Learning should eventually help the creative process, but they do not yet.

Why? Because inserting these high-level and complex algorithms for best image selection, pleasing content and accurate storytelling requires special computer analysis, multiple scans and classification of each image in a selected set. These efforts are time consuming, and require costly computing resources at present.

A substantial set of diversified instructions needs to be integrated (timeline, metadata, face recognition, good behavior judgement, distinguishing useful from useless images), all of which must be evaluated against a large set of consumer image data to propose a pleasing and efficient “one-click” automated image selection. Currently, most eCommerce sites do not have these resources. Nick Burns, a Data Scientist, contends that no matter how great AI models are, they are only as good as the data available to them.

Therefore, using AI and Machine Learning to select, from a small or even large set of images, the content that conveys the exact intended emotion and story can still be quite challenging. For example, in a series of portraits, face recognition is important without question. But is the subject winking or blinking? Does the subject have good posture? Is everyone smiling in a group photo? If not, which image is the best one to select? In other circumstances, the set criteria that provide the “knowledge” to the AI may not be optimal for the actual memory. For example, a photo with low light may carry the right feeling and should not be rejected or auto-fixed, or certain “imperfections” on the subject give it its character/individuality and should not be softened. Because of nuances like these, proposed selections and image corrections still need to be confirmed by the user.

As part of our research and development efforts, we are experimenting with integration possibilities of these sophisticated algorithms from an economical and technological perspective. Our process to achieve optimal one-click image selection is regularly under review to ensure it offers the best possible scenario for our customers, their consumers and the overall process. Currently, our focus lies in improving the algorithms that simplify and optimize the content of complex page layouts. In our analysis, this yields a higher ROI for the business and a more intuitive end-user experience.

In summary, AI Deep Neural Networks would require years of data analysis about the user’s perceptions of images or memories, in addition to their intent and behavior, to be efficient enough for one-click complex image and project suggestions. The layers and depth of the human mind when making decisions (among other things) are unquestionably more sophisticated than AI and unlikely to be replicated easily in the near future.

Artificial intelligence versus traditional software development to achieve automated one-click photo projects (like photobooks) in 2021

As we discussed previously, AI and machine learning do a great job in helping to manage and enhance images (i.e. automatic corrections, subject grouping, facial/object recognition, etc.). The features offered by AI-backed software are getting better and can tackle more problems at a staggering speed. Many product personalization solutions use it to enrich user experiences. In situations where images are already available for analysis, such as in archival solutions like Google Photos or mobile apps where photos are already on a single device, AI provides extraordinary value in cataloging user image collections in meaningful and useful ways. In some cases, these solutions can even use that cataloging to propose products with limited design options.

The limited design option is the core issue here. AI is an excellent tool to extract a meaningful subset of images from a larger collection, and can help to guess a sequence when there is a common context such as a wedding or a photo shoot. However, AI is not used to create a quality design or to make the best use out of the images. 

Why shouldn’t we use AI for product design too?

To answer this question, let’s compare artificial intelligence versus “normal” software development by experienced developers. Both approaches provide a framework of tools that can potentially be used to build designs, but neither knows anything about design. Artificial intelligence, more specifically machine learning, can figure out how to do something based on large data sets and success criteria. Building meaningful data and tweaking what is considered a “successful design” is extremely costly and time consuming. Where AI shines is in scenarios where it’s non-trivial to codify what constitutes a good implementation. In the case of designing a book, however, we can codify what constitutes good design decisions. Safe and normalized margins, as well as aesthetic and content-driven layout rules, can be provided by design experts and implemented for a fraction of the cost, while still providing variety. Sure, the black theme with one photo per page works, but continuing to feed new, interesting, and relevant content to users is still much more cost effective with traditional tools.
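To illustrate what "codifying" a design decision can look like, here is a minimal sketch of a safe-margin rule. The `Rect` type, the function name and the page dimensions are all hypothetical and chosen for the example; a real design system would combine many such rules supplied by design experts.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """A photo frame on a page; units are arbitrary (e.g. millimetres)."""
    x: float
    y: float
    width: float
    height: float


def within_safe_margin(frame, page_width, page_height, margin):
    """Return True if the frame stays at least `margin` away from every page edge."""
    return (
        frame.x >= margin
        and frame.y >= margin
        and frame.x + frame.width <= page_width - margin
        and frame.y + frame.height <= page_height - margin
    )


# Hypothetical 200 x 200 page with a safe margin of 10.
ok_frame = Rect(10, 10, 180, 180)
bad_frame = Rect(5, 20, 100, 100)   # starts too close to the left edge
print(within_safe_margin(ok_frame, 200, 200, 10))   # True
print(within_safe_margin(bad_frame, 200, 200, 10))  # False
```

A rule like this is explicit and easy to adjust, which is the property the paragraph above contrasts with a learned model.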

But in the end, we don’t have to choose between artificial intelligence and traditional software development. Both can be used in building great solutions. For example, we can use today’s artificial intelligence strengths to feed traditional algorithms with meaningful information, such as whether a photo has a higher pertinence than another, without necessarily having it determine where that photo should be on a page.
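As a rough sketch of that division of labour, the example below assumes each photo already carries a pertinence score between 0 and 1 produced by some AI service (the score, the `Photo` type and `plan_pages` are hypothetical). The layout itself is decided by plain, rule-based code: the score only tells the rules which photos to drop and which one deserves its own page.

```python
from dataclasses import dataclass


@dataclass
class Photo:
    name: str
    pertinence: float  # hypothetical 0..1 score supplied by an AI service


def plan_pages(photos, min_pertinence=0.3, max_per_page=3):
    """Rule-based layout: drop low-scoring photos, give the best one its own
    page, then fill the remaining pages in the original order."""
    kept = [p for p in photos if p.pertinence >= min_pertinence]
    if not kept:
        return []
    hero = max(kept, key=lambda p: p.pertinence)
    rest = [p for p in kept if p is not hero]
    pages = [[hero.name]]
    for start in range(0, len(rest), max_per_page):
        pages.append([p.name for p in rest[start:start + max_per_page]])
    return pages


photos = [
    Photo("a.jpg", 0.6), Photo("b.jpg", 0.1), Photo("c.jpg", 0.9),
    Photo("d.jpg", 0.5), Photo("e.jpg", 0.4), Photo("f.jpg", 0.7),
]
pages = plan_pages(photos)
print(pages)  # [['c.jpg'], ['a.jpg', 'd.jpg', 'e.jpg'], ['f.jpg']]
```

The thresholds and the "hero page" rule stay in the hands of the design team and can be changed without retraining anything.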

Using artificial intelligence to its full potential is currently restricted by its total cost of ownership. It is a challenge to economically connect all the required building blocks to make the most out of all this data. Licensing costs, computing requirements, poorer performance, and a decrease in bottom-line revenue are key barriers to large-scale adoption.

A down-to-earth question that we should ask ourselves as an industry is whether all of the features that artificial intelligence brings are actually needed. In theory, having photos grouped by colors, discarding images based on quality and using advanced metadata to make design decisions are great and make for amazing presentations, but will they actually drive the sale? What will make the customer excited, and will they even understand what’s going on? Should we educate end users on what AI can bring to them, or should we just make things simpler for them?

We still have to make the case for individual artificial intelligence features, so that investing in them at this point in time can generate revenue and improve the user experience today. Given the speed at which artificial intelligence innovations happen, we can expect the “financially viable” possibilities to continue growing. We can also expect more data on what works and what doesn’t so that investments can be justified.

Let’s also explore why the aesthetics of AI-proposed projects are important and pose a key barrier to widespread adoption. Imagine that a current AI system had to create a photobook from your photos. Would you let it choose which photos make it into the book (or not) without looking at them yourself? Would you trust it to guess if this is for a gift, a wedding ceremony or a simple family souvenir? Would you settle for only black pages and centered images? Would you like to choose how much you want to pay for that product and the level of quality of the paper, binding and so on? And, perhaps most importantly, do you believe those answers could be the same for everyone?

Fully automated product building requires pre-determined discrimination algorithms, which can either be based on “what usually works” or on carefully selected input from the user. So, can AI actually choose a logical aesthetic sequence without some contextual instructions from the consumer? Well, yes, as long as you’re okay with a generic book. And even then, making a generic book with traditional software development will be much cheaper and easier to improve as you learn about your specific market.

AI can aid in managing large data sets and making automated decisions, like triaging images and understanding what’s in an image, but again, at a high cost and with additional delays from the user’s point of view because of the required computer workload, especially in the context of an e-commerce store. Analyzing a large set of uploaded photos to regroup, categorize, and put into a pleasing sequence on websites can easily add a few dollars per created book, regardless of whether it was ordered or not. Is the additional revenue worth the financial risk? Consider the following:

  1. When shoppers upload large numbers of pictures for their projects and, on average, 20% end up not purchasing, the business incurs computing costs that are not all offset by sales.
  2. Analysing voluminous loads of photos can take several minutes. Consumers perceive this “crunch” as a significant production lag. Today’s speed-driven consumers don’t accept delays, even when warranted by complex analysis, and these delays affect conversion rates.
  3. Stringing together and initializing these AI building blocks to create auto-suggested layouts is very expensive in engineering terms. Because the required resources are constantly changing, the necessary knowledge, technology and engineering skills are scarce and pricey.
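The first point can be made concrete with some simple arithmetic. In the sketch below, the $2.00 analysis cost per created book is an assumed figure standing in for "a few dollars", the 20% abandonment rate is the one quoted above, and the function name is hypothetical. Because analysis is paid for every created book, ordered or not, the cost carried by each order is higher than the per-book figure.

```python
def analysis_cost_per_order(cost_per_created_book, abandonment_rate):
    """Spread the analysis cost of every created book over the books actually ordered."""
    if not 0 <= abandonment_rate < 1:
        raise ValueError("abandonment_rate must be in [0, 1)")
    return cost_per_created_book / (1 - abandonment_rate)


# Assumed $2.00 of analysis per created book, 20% of projects never ordered.
cost = analysis_cost_per_order(cost_per_created_book=2.00, abandonment_rate=0.20)
print(f"${cost:.2f} of analysis cost per ordered book")  # $2.50 of analysis cost per ordered book
```

The same function shows how quickly the burden grows as abandonment rises: at 50% abandonment, the cost per ordered book doubles.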

While Mediaclip is constantly appraising AI and other progressive technologies, we found it is more flexible and cost efficient to write algorithms that perform intelligent tasks like regrouping photos, generating pleasing product themes, optimizing book layout scenarios, and interestingly placing one or many images while respecting their image content ratio on each individual page/surface, all at a reasonable price and with the support of expert design, while retaining the variety consumers are expecting.

Why Mediaclip will not jump on the AI “hype train”… yet

Mediaclip’s R&D team constantly experiments with different options to achieve rich photobooks that tell a story by enabling users to express themselves while balancing the time it takes to create them.  We see many techniques today that will help us use AI in a lot of very useful ways, but we did not find a solution that can replace carefully crafted designs and algorithms in a cost effective way.

There are specific things we have yet to see before we recommend AI as a core engine for product personalization:

Creating compelling designs and layouts – Our smart design system already allows for rich and interesting designs that cater to all sorts of life events and product styles. Users can easily reorganize and recompose their pages based on the best possible layouts for their needs. Existing image intelligence services can help to automatically enhance images or to automatically crop images when the design requires it, for example when using licensed content. This is all currently available without using AI-based tools.

Cost effectiveness – There are currently two methods of providing AI-based services: client AI (usually available on mobile devices to scan a user’s library of images) and server-based AI. There are some great solutions for providing product recommendations on a mobile device; however, both methods can be much more expensive than their non-AI counterparts. Note: we believe that (specifically) for mobile apps’ “photo discovery” and curation capabilities, AI offers very strong options to consider today.

Control over the result. AI, more specifically machine learning, is excellent at guessing a method that will reproduce what you feed it. However, this also greatly reduces your ability to tweak and adapt the algorithm for your needs. If you believe an image should be handled slightly differently, or if your design team would like to make a specific design decision based on an image orientation, they’ll be severely limited.

We were able to create impressive demos using existing solutions and prototyping ideas using proprietary and open-source software. However, our development efforts focus on conversion rates, user satisfaction and increasing revenue for our customers and partners, not making “cool” showcases. We are not satisfied with the current solutions and cannot recommend jumping on the AI ship just yet. We do believe, however, that this is a field worth our time and investment as the associated costs decline and the offerings get better.

In conclusion, we believe AI is an extraordinary, albeit pricey, tool to help photo curation on mobile devices and on photo hosting sites. It can also provide meaningful insights into user content and help drive how you present and communicate with users on your site. However, these insights and benefits come at a cost that doesn’t yet make it a ‘no-brainer’ decision. Keep in mind, Artificial Intelligence is not a feature; it doesn’t do anything by itself. It is a method that can solve specific categories of problems efficiently. Remember the saying, “when you have a hammer, everything looks like a nail”? Well, in this conversation, AI is a demolition hammer; it’s an extremely helpful tool to have in your shed but it won’t help you decorate.

The limitations of current AI solutions are not negligible. Sacrificing choice and style in favor of layout intelligence, when that intelligence is already available at a fraction of the cost with traditional coding, makes AI a less compelling option for now. Of course, there may be early tech adopters who are not deterred by certain levels of business risk when using AI-based solutions for layout intelligence. We, too, believe in AI, and it is merely a matter of time until we discover a cost-effective way to utilize its capabilities in a way that is beneficial to both our customers and their shoppers.

Now, don’t get us wrong, we are thrilled with the myriad of possibilities of artificial intelligence. We are amazed by the constant new developments and can only envision positive opportunities moving forward. It would, however, be irresponsible to recommend using AI-based solutions at this time, as they’re not yet economically viable options to use for large-scale projects compared to traditional approaches that yield the same results. We are eager to discover more data-driven research and experiments on specific AI-based features that our customers can apply to their own business sector and user personas. We will continue working on assessing solutions and investigating the effect on the important metrics like conversion rates and user satisfaction.

By all means, the choice is yours to decide if AI answers the needs of your business and/or the demands of your shoppers despite its current high cost. However, please make it outside of the marketing hype – due diligence is as crucial here as in any other business venture.