By Oliver Tan, Co-Founder & CEO, ViSenze
The rise of visual and voice search has transformed the way users discover, search for and buy products online. Shoppers no longer rely solely on keyword-based searches or on typing endless queries into search engines. Today, we have clever ways to discover products, from using our voice to searching with images and sampling products in 3D or in augmented reality. These cutting-edge options are available on most major marketplaces such as Amazon, Flipkart, Shopee, Rakuten and Walmart. In fashion and lifestyle shopping, for instance, the natural ease and convenience of visual search have captivated users, especially younger shoppers in countries with mobile-first experiences. In web search, image search on Google drives about 19% of all search queries, according to Google Trends.
While these AI advancements in vision, text and voice recognition have greatly improved the shopping experience, the AI models behind these search methods typically operate independently of each other, with no cross-modality. Enter multi-search, a groundbreaking vector-based approach that brings together the power of natural language processing and computer vision to decipher intent and seamlessly connect it with desired products. By mimicking the personalized guidance we receive from in-store associates, multi-search leads shoppers into a more sensory-driven journey that overcomes the boundaries of traditional text search, bringing their shopping experiences closer to the real world.
A new era of search: the multi-modal paradigm
A present-day example of multi-search in action is Google Lens, a powerful tool that combines different modalities, including text and images, in a single search query. This text-image combination makes it easier than ever to find products, even when we do not know a product’s exact name or where to find it. According to Google Trends, Google Lens processed over 12 billion searches a month as of February 2023, up from 8 billion searches in 2022 and 3 billion searches in 2021. Besides Google Lens, several AI start-ups are developing similar multi-search capabilities today.
Consider this: you’re looking for a light summer floral dress with puff sleeves. With multi-search, you can capture an image of an inspiring outfit that caught your eye for its attractive patterns, prompting the multi-search engine to find either an exact match or visually similar dresses, depending on your choice. Faced with different size ranges, you swiftly refine your search results by specifying petites, so that you see only petite sizes that are in stock and available for purchase. Bingo: you have found two available choices that fit your budget.
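To make the dress scenario concrete, here is a minimal sketch in Python of how an image query and text refinements might work together. It is an illustration under stated assumptions, not how Google Lens or any particular vendor implements multi-search: the three-number vectors are hand-made stand-ins for the embeddings a real vision model would produce, and the names `Product`, `multi_search` and `CATALOG` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    name: str
    vector: List[float]  # hypothetical stand-in for an image embedding
    size_range: str
    in_stock: bool
    price: float


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: close to 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def multi_search(query_vector, catalog, size_range, max_price, top_k=2):
    """Apply the shopper's refinements first, then rank by visual similarity."""
    candidates = [
        p for p in catalog
        if p.size_range == size_range and p.in_stock and p.price <= max_price
    ]
    candidates.sort(key=lambda p: cosine(query_vector, p.vector), reverse=True)
    return candidates[:top_k]


# Toy catalog; every value here is invented for illustration.
CATALOG = [
    Product("Floral puff-sleeve dress", [0.9, 0.8, 0.1], "petite", True, 59.0),
    Product("Floral wrap dress", [0.8, 0.3, 0.2], "petite", True, 45.0),
    Product("Floral puff-sleeve dress (regular)", [0.9, 0.8, 0.1], "regular", True, 59.0),
    Product("Striped shirt dress", [0.1, 0.2, 0.9], "petite", True, 40.0),
    Product("Floral maxi dress", [0.85, 0.4, 0.1], "petite", False, 70.0),
]

# Assumed embedding of the photo the shopper captured.
query = [0.9, 0.7, 0.1]
results = multi_search(query, CATALOG, size_range="petite", max_price=60.0)
for product in results:
    print(product.name, product.price)
```

In this toy run the out-of-stock and regular-size dresses are filtered out before ranking, and the two floral petite dresses come out on top. A production system would typically replace the linear scan with an approximate nearest-neighbor index, but the division of labor is the same: the image supplies the "looks like this" signal and the text supplies the constraints.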
The impact and advantages of multi-search in retail:
- Improved convenience: By combining text and images, multi-search significantly enhances the accuracy of product search results. Users can search with a product name, an image or both, so typing a query is no longer a requirement.
- Enhanced discovery: Multi-search facilitates the discovery of new products by seamlessly converting text and images into “vectors” that machine learning models can interpret and compare. Users can now stumble upon products they may not have been aware of or might not have considered before, expanding their horizons and exposing them to new possibilities.
- Inclusive product search: Current product search engines often exhibit biases toward well-known or heavily advertised products. Multi-search helps level the playing field by enabling users to search for products based on their unique or preferred criteria, such as color, pattern, style and/or price, ensuring a more diverse and inclusive shopping experience.
- Personalized search: By incorporating text and images as vectors, multi-search can retain context and learn about users’ preferences and interests. This valuable information is then leveraged to personalize product search results or recommendations, increasing the likelihood of users finding products tailored to their tastes and needs.
- Visual engagement: Images are a powerful medium for conveying information, and multi-search capitalizes on this by presenting users with visually matched products. Visually driven search strategies such as Shop-The-Look or Complete-The-Look in fashion shopping factor the user’s intent into the equation. This enhances the overall product search experience, enabling users to make quicker and more informed decisions and to engage more deeply with the products they intend to purchase. Such visual engagement can also extend to video content to drive even more shopping transactions.
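The "vectors" mentioned under enhanced discovery and personalized search can be illustrated with a small Python sketch. It assumes, purely for illustration, that text, images and a shopper's history have already been mapped into the same three-dimensional space, with hand-picked axes standing for "floral", "puff sleeves" and "dark color". Real embeddings have hundreds of dimensions and no such readable axes, and the helper names `normalize`, `blend` and `rank` are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length so that only its direction matters."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def blend(weighted_vectors):
    """Weighted sum of unit vectors: one simple way to fuse several signals."""
    dim = len(weighted_vectors[0][0])
    out = [0.0] * dim
    for vec, weight in weighted_vectors:
        unit = normalize(vec)
        for i in range(dim):
            out[i] += weight * unit[i]
    return normalize(out)


def rank(query, catalog):
    """Return product names ordered by cosine similarity to the query."""
    q = normalize(query)
    scored = [
        (sum(a * b for a, b in zip(q, normalize(vec))), name)
        for name, vec in catalog.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


# Toy axes (assumed): [floral, puff sleeves, dark color]
CATALOG = {
    "pastel floral puff-sleeve dress": [0.9, 0.9, 0.1],
    "dark floral puff-sleeve dress": [0.9, 0.9, 0.9],
    "plain black dress": [0.1, 0.1, 0.9],
}

image_vec = [1.0, 0.2, 0.3]   # photo of a floral outfit
text_vec = [0.1, 1.0, 0.1]    # the typed words "puff sleeves"
user_pref = [0.2, 0.1, 1.0]   # shopper who has mostly bought dark colors

# Image and text fused into one query, without any personal history.
query_only = blend([(image_vec, 0.5), (text_vec, 0.5)])
# The same query, nudged toward the shopper's past preferences.
personalized = blend([(image_vec, 0.35), (text_vec, 0.35), (user_pref, 0.3)])

print(rank(query_only, CATALOG))
print(rank(personalized, CATALOG))
```

With these invented numbers, the unpersonalized query ranks the pastel dress first, while the personalized one moves the dark floral dress to the top. The weights (0.5, 0.35, 0.3) are arbitrary choices for the sketch; in practice they would be tuned, or the fusion would be learned by a model rather than set by hand.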
Challenges to implementing multi-search
While the potential of multi-search is huge, developing and implementing multi-search capabilities is not without its challenges.
- Data collection: Retailers need to collect and store vast amounts of customer preference and behavior data to train the multi-search algorithms effectively.
- User-friendly interface: Developing a user-friendly interface that allows customers to search for products through voice, images or text is essential. The interface should be intuitive and accessible, even to those less familiar with technology.
- Accuracy and reliability: Retailers must ensure that their multi-search algorithms are accurate and reliable to provide customers with the best possible shopping experience. This necessitates continuous improvement and fine-tuning of the algorithms.
The AI challenges in multi-search are real, but the good news for retailers and brands is that several AI startups, including ViSenze, are developing multi-search AI solutions that they can deploy on their own platforms, giving them capabilities similar to those of Google Lens.
Search is becoming more natural, intuitive and multi-purpose. At the intersection of content, commerce and consumers lies the future of product search and discovery. What shapes this future greatly is multi-modal AI – including the rise of generative AI with ChatGPT. Together with personalization strategies, multi-search empowers shoppers to discover, search for and find their desired products through their preferred means of conveying intent, be it in any language, human expression, keywords, images or even voice. While multi-search is still in its early stages, it has definite potential to revolutionize the way we shop.
A better search experience
As we witness the evolution of product search, powered by the convergence of AI and multi-modal capabilities, the shopping experience will continue to evolve and improve. The amalgamation of text, images and semantics will redefine the way users explore and find products online, leading to greater convenience, enhanced discovery, inclusivity, personalization and visual engagement. By embracing the potential of multi-search, retailers can stay at the forefront of this transformative shift, providing shoppers with seamless, intuitive and highly satisfying shopping experiences.
About Rakuten Optimism: On August 2-4, Rakuten Optimism 2023 will bring together participants from Japan and around the world with some of the world’s top luminaries and business leaders to reflect on the myriad ways in which the world is changing around us, and how our lives can be enriched as a result.
Register for free for Rakuten’s biggest-ever business conference today: https://optimism.rakuten.co.jp/en/