A Multimodal Approach for Measuring Item Similarity
Multimodal AI that fuses computer vision and NLP to measure item similarity the way humans do — powering smarter recommendations in tourism, e-commerce, and real estate.
When humans judge similarity, they consider appearance, function, atmosphere, and context simultaneously. Simple text matching or category tags miss most of this. We develop multimodal AI that fuses computer vision and NLP to measure item similarity the way people actually experience it.
The Challenge
Two hotels can look similar in photos but feel completely different in reviews. Two tourist destinations can share atmosphere despite different geographies. Traditional similarity metrics capture one dimension at a time — and fail at the nuanced comparisons that drive real user decisions.
Our Approach
- Visual Feature Extraction — Deep convolutional networks and vision transformers extract semantic features (architectural style, landscape type, activity level, color palette) directly from images without manual annotation.
- Textual Semantic Analysis — Large language models encode review text and descriptions to capture cultural character, climate, offerings, and subjective user sentiment as dense semantic vectors.
- Multimodal Fusion — Cross-attention mechanisms align visual and textual embeddings into a unified similarity space, validated against human expert judgments across thousands of item pairs.
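As a rough illustration of the fusion step, the sketch below lets each modality's tokens attend over the other's, pools the results, and compares items by cosine similarity. It is a minimal sketch under stated assumptions, not the method used in this project: the attention is single-head with no learned projections, the function names (`cross_attend`, `fuse`, `similarity`) are hypothetical, and the three-dimensional token vectors are made-up stand-ins for real encoder outputs.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, keys_values):
    """Each query token attends over the other modality's tokens.

    Assumption: single head, no learned projections, keys double as values.
    """
    d = len(queries[0])
    attended = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in keys_values])
        attended.append(
            [sum(w * kv[i] for w, kv in zip(weights, keys_values)) for i in range(d)]
        )
    return attended

def mean_pool(tokens):
    n = len(tokens)
    return [sum(t[i] for t in tokens) / n for i in range(len(tokens[0]))]

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else v

def fuse(image_tokens, text_tokens):
    """Build one unit-length vector per item from both modalities."""
    text_ctx = mean_pool(cross_attend(text_tokens, image_tokens))
    image_ctx = mean_pool(cross_attend(image_tokens, text_tokens))
    return l2_normalize(text_ctx + image_ctx)  # list concatenation, length 2d

def similarity(item_a, item_b):
    """Cosine similarity of the fused vectors (both are unit length)."""
    return dot(fuse(*item_a), fuse(*item_b))

# Hypothetical (image_tokens, text_tokens) pairs; values are illustrative only.
beach_hotel = ([[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]], [[0.7, 0.3, 0.0], [0.6, 0.1, 0.2]])
beach_resort = ([[0.85, 0.15, 0.05], [0.9, 0.05, 0.1]], [[0.65, 0.25, 0.1], [0.7, 0.2, 0.0]])
city_hotel = ([[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]], [[0.2, 0.7, 0.1], [0.1, 0.9, 0.0]])

sim_close = similarity(beach_hotel, beach_resort)
sim_far = similarity(beach_hotel, city_hotel)
```

With these toy inputs, the two beach properties score higher with each other than the beach hotel does with the city hotel. In a trained system the projections and pooling would be learned, and the resulting scores would be checked against human similarity judgments as described above.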
Applications
- Tourism & Travel Recommendations — Suggesting alternative destinations when a user's top choice is unavailable, matched on experience rather than category. Validated against real booking patterns.
- E-commerce & Real Estate — Finding visually and contextually similar products or properties at different price points, improving discovery and reducing search abandonment.
Our AI analyzes visual and textual features of destinations to build similarity judgments that match human intuition across diverse travel contexts.
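The alternative-destination use case can be sketched as a nearest-neighbor lookup restricted to what is currently bookable. This is only an illustrative sketch: `suggest_alternatives` is a hypothetical helper, the destination vectors are invented placeholders for fused embeddings, and a production system would likely use an approximate nearest-neighbor index rather than a full scan.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def suggest_alternatives(target, embeddings, available, k=2):
    """Rank available items by similarity to the (unavailable) target."""
    scored = [
        (name, cosine(embeddings[target], vec))
        for name, vec in embeddings.items()
        if name != target and name in available
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical fused embeddings; the numbers are illustrative only.
embeddings = {
    "Santorini": [0.9, 0.1, 0.3],
    "Amalfi Coast": [0.85, 0.15, 0.35],
    "Reykjavik": [0.1, 0.9, 0.2],
    "Mykonos": [0.88, 0.05, 0.4],
}
available = {"Amalfi Coast", "Reykjavik"}  # Mykonos is also unavailable here

alternatives = suggest_alternatives("Santorini", embeddings, available)
for name, score in alternatives:
    print(f"{name}: {score:.3f}")
```

Filtering by availability before ranking keeps unavailable items out of the result list regardless of how similar they are to the target.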
Related Publications
2024
- Warm Recommendation: Enhancing Cold Start Recommendations Using Multimodal Product Representations. 2024. International Conference on Information Systems (ICIS).
- Measuring Flight-Destination Similarity: A Multidimensional Approach. 2024. Expert Systems with Applications.