How We Controlled AI Hallucinations in a Luxury Travel MVP

Andrzej Błądek - Senior Python Developer
7-minute read

What happens when an AI confidently recommends experiences that don’t exist? In luxury travel, that’s a fast way to lose user trust. Here’s how we designed a system that prevents hallucinations before they happen.

Introduction: Why hallucinations matter in premium digital products

Large language models are powerful tools for building conversational interfaces, but they come with a well-known limitation: they sometimes generate answers that sound convincing but are factually incorrect.

In many consumer applications, a minor factual error may be inconvenient but not critical. In a luxury travel marketplace, however, inaccuracies directly affect user trust. Suggesting unavailable experiences, misrepresenting destinations, or fabricating details about partners would undermine both the product and the brand behind it.

When building The Occasionist Studio, a curated, AI-assisted travel platform (see the full case study), we treated hallucination control not as an optimization task but as a foundational architectural requirement.


Common sources of hallucinations in LLM-based systems

Before integrating the model into the product, we analyzed where hallucinations typically originate in applied LLM systems. In practice, they often stem from:

  • Overly broad or ambiguous prompts

  • Allowing the model to generate operational data

  • Embedding business logic directly into prompt design

  • Providing excessive context without constraints

  • Failing to design structured fallbacks for uncertain outputs

During testing, we observed the model recommending specific destinations and experiences even though it had no access to our internal database of active offers at the time. In other words, it hallucinated attractive travel options that the application did not support and could not fulfill. This was a significant product issue: it misled customers about our actual inventory and set expectations we had no way to meet.

Understanding these patterns helped us define clear boundaries for the system from the outset.

Project context: Conversational onboarding in a curated marketplace

The Occasionist Studio replaces traditional form-based onboarding with a conversational experience. Like many early-stage digital products, the platform began as an MVP designed to validate the concept quickly before expanding functionality. Preparing for that stage often starts with the right product discovery process, something we describe in “First Time MVP Meeting? Here’s What You Need to Know”. Instead of completing multi-step forms, users interact with an AI-powered concierge that gathers preferences and introduces relevant travel concepts.

However, the conversational layer was designed as an interface to structured business logic and verified data, not as an autonomous decision-making component.

The platform integrates:

  • Verified destination management companies (DMCs)

  • Structured travel data stored in Supabase

  • Backend rules governing associations and constraints

  • Orchestrated AI flows managed through n8n

n8n serves as a direct bridge to our backend, where the chat agent extracts key insights from conversations, formats them, and commits them to the database. These stored data points are subsequently used to fuel the dialogue and generate highly personalized travel recommendations. By leveraging the n8n Structured Output Parser, we ensured that all extracted data strictly adheres to our required schema, maintaining consistency as information is captured and retrieved throughout different stages of the conversation.

This separation between conversational output and operational logic became the primary mechanism for reducing hallucination risk.

Design principles for controlling model behavior

Before writing production prompts, we established several architectural principles that guided implementation. Forming clear architectural principles early is essential when building scalable MVPs, especially when AI components are involved. We discuss this approach in more detail in “How to Bring Your MVP Project to Life Successfully”.

1. The model does not generate operational data

The LLM was not allowed to create new destinations, pricing ranges, partner names, or availability information. All such data had to originate from validated database entries.

Instead of asking the model to invent or estimate, we injected structured variables into prompt templates.
To maintain a stateful conversation, we inject real-time database records directly into our prompt templates. This ensures the LLM has a clear understanding of the current "work-in-progress" data and the full dialogue context required to trigger the next step.

Prompt data structure (simplified n8n template):

```
### INPUT DATA
**History:** {{ $('db-data').item.json.conversation }}
**Session Data:**
{
  "countries": {{ $('db-data').item.json.countries }},
  "excluded_countries": {{ $('db-data').item.json.excluded_countries }},
  "seasonality_range": {{ $('db-data').item.json.seasonality_range }}
}
```

This ensured that the model could only describe or reframe existing information.
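Outside n8n, the same variable-injection pattern can be sketched in a few lines of plain Python. The field names below are hypothetical and mirror the template; the point is that every value comes from a validated database record, never from the model:

```python
import json
from string import Template

# Hypothetical template mirroring the n8n prompt; the placeholders are
# filled only from validated database records, never by the model.
PROMPT_TEMPLATE = Template(
    "### INPUT DATA\n"
    "**History:** $history\n"
    "**Session Data:**\n"
    "$session"
)

def build_prompt(record: dict) -> str:
    # Serialize only the whitelisted session fields into the prompt.
    session = {
        "countries": record["countries"],
        "excluded_countries": record["excluded_countries"],
        "seasonality_range": record["seasonality_range"],
    }
    return PROMPT_TEMPLATE.substitute(
        history=record["conversation"],
        session=json.dumps(session, indent=2),
    )

prompt = build_prompt({
    "conversation": "User asked about autumn trips.",
    "countries": ["Japan"],
    "excluded_countries": [],
    "seasonality_range": "Sep-Nov",
})
```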

2. Business logic remains outside the model

We intentionally avoided embedding decision-making rules inside prompts. Matching users to relevant proposals, validating constraints, and determining eligibility were handled on the backend.

The model’s role was limited to:

  • Interpreting user intent

  • Structuring conversational flow

  • Generating refined narrative responses based on verified inputs

This architectural separation reduced the risk of unintended outputs.
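A minimal sketch of this separation, with a hypothetical in-memory inventory standing in for the Supabase tables, might look like:

```python
from dataclasses import dataclass

@dataclass
class Offer:
    name: str
    country: str
    season: str

# Hypothetical inventory; in production these rows come from Supabase.
OFFERS = [
    Offer("Kyoto foliage retreat", "Japan", "autumn"),
    Offer("Provence vineyard stay", "France", "summer"),
]

def eligible_offers(countries: set, season: str) -> list:
    # Matching and eligibility checks run in backend code, not in a prompt.
    return [o for o in OFFERS if o.country in countries and o.season == season]

# The model only receives and narrates this verified result:
matches = eligible_offers({"Japan"}, "autumn")
```

Because eligibility is computed in code, a prompt change can never accidentally alter which offers a user is shown.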

3. Context was tightly scoped

Large context windows can increase creative flexibility, but they also increase the likelihood of speculative responses.

Instead of passing broad contextual information into each request, we provided only the structured data necessary for that step in the conversation.

To optimize both operational costs and response latency, we avoided passing the entire raw dataset to the LLM. Instead, we implemented a hybrid selection process: the model first extracts user preferences during the initial chat phases, which are then used by our backend logic to filter and narrow down the travel options. Only after this programmatic pruning is the LLM consulted to perform the final ranking, selecting the most relevant choices based on the key context of the conversation.

By narrowing the scope of each interaction, we reduced opportunities for the model to “fill gaps” with fabricated details.
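A simplified sketch of the hybrid selection, with the final LLM ranking step stubbed out and all names hypothetical:

```python
# Deterministic backend pruning first, LLM ranking (stubbed here) second.
CATALOG = [
    {"name": "Kyoto foliage retreat", "country": "Japan"},
    {"name": "Tokyo design tour", "country": "Japan"},
    {"name": "Provence vineyard stay", "country": "France"},
]

def prune(catalog, preferences):
    # Cheap, deterministic filtering; no tokens are spent on options
    # that can never match the user's stated preferences.
    return [o for o in catalog if o["country"] in preferences["countries"]]

def llm_rank(shortlist, context):
    # Stand-in for the final LLM call; only the pruned shortlist is sent,
    # which keeps the context window small and the latency low.
    return sorted(shortlist, key=lambda o: o["name"])  # placeholder ordering

shortlist = prune(CATALOG, {"countries": {"Japan"}})
ranked = llm_rank(shortlist, context="autumn foliage trip")
```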

4. Fallback behavior was explicitly designed

Hallucination control is not only about preventing incorrect outputs; it is also about handling uncertainty correctly.

We implemented structured fallback mechanisms for situations in which:

  • User intent was ambiguous

  • Required data was unavailable

  • The system could not confidently generate a response

In such cases, the AI did not attempt to approximate an answer. Instead, it either asked clarifying questions or redirected the flow appropriately.
We implemented fallback handling within our prompts using a decision-tree logic to streamline the interaction. If the user provided the expected information, the model was instructed to capture the data and terminate further processing to save resources. However, if the input was ambiguous or unexpected, the prompt logic diverted to alternative branches—such as handling requests for human intervention or providing clarifying instructions to guide the user back on track.
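The same branching can be sketched as a small backend routing function (the field names are assumptions, not our actual schema):

```python
from enum import Enum, auto

class Route(Enum):
    CAPTURE = auto()  # expected slot filled: store the data and stop
    CLARIFY = auto()  # ambiguous or unexpected input: ask a follow-up
    HUMAN = auto()    # explicit request for a person: hand off the flow

def route_turn(parsed: dict) -> Route:
    # Decision-tree fallback mirroring the prompt-level branching.
    if parsed.get("wants_human"):
        return Route.HUMAN
    if parsed.get("countries"):
        return Route.CAPTURE
    return Route.CLARIFY
```

The crucial property is that there is no branch in which the system guesses: every path ends in captured data, a clarifying question, or a human.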

Orchestration and validation layer

We used n8n to orchestrate prompt flows and manage interactions between the LLM and backend services. This allowed us to introduce checkpoints and validation steps before responses were displayed to users.

Simplified interaction flow:

1. User message
2. n8n workflow
3. Intent interpretation
4. Structured data retrieval (Supabase)
5. Prompt template with injected variables
6. LLM response generation
7. Output validation
8. Response displayed in UI

The orchestration layer included:

  • Structured prompt templates

  • Controlled variable injection

  • Output formatting constraints

  • Backend validation before rendering responses
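Put together, the round trip might look roughly like this, with every n8n, Supabase, and LLM call replaced by a hypothetical stub:

```python
def interpret_intent(message):
    # Stub: in production, an LLM call extracts structured intent.
    return {"countries": ["Japan"]}

def fetch_records(intent):
    # Stub: in production, a Supabase query keyed on the intent.
    return {"countries": intent["countries"], "conversation": []}

def render_prompt(data):
    # Controlled variable injection into the prompt template.
    return f"Session countries: {data['countries']}"

def call_llm(prompt):
    # Stub: in production, the LLM response generation step.
    return {"response": "Noted: Japan.", "is_complete": False}

def is_valid(output):
    # Checkpoint: malformed output never reaches the UI.
    return isinstance(output.get("response"), str) and "is_complete" in output

def handle(message):
    intent = interpret_intent(message)
    data = fetch_records(intent)
    output = call_llm(render_prompt(data))
    if not is_valid(output):
        raise ValueError("LLM output failed validation")
    return output
```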

We utilized the Structured Output Parser to validate that the LLM's output consistently matched our required JSON schema. This ensured that the data was perfectly formatted for our next steps, as illustrated in the following validation example:

```json
{
  "type": "object",
  "properties": {
    "response": {
      "type": "string",
      "description": "Your next question or confirmation message"
    },
    "is_complete": {
      "type": "boolean",
      "description": "Whether the conversation is complete and ready for proposals"
    },
    "countries": {
      "type": "string",
      "enum": ["Poland", "USA", "France", "Japan"]
    }
  },
  "required": ["response", "is_complete", "countries"]
}
```
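n8n's Structured Output Parser performs these checks for us; an equivalent stdlib-only sketch, using the same field names and enum as the schema above, could look like:

```python
# Stdlib-only sketch of the validation step; field names and allowed
# values match the JSON schema shown above.
ALLOWED_COUNTRIES = {"Poland", "USA", "France", "Japan"}

def validate_output(payload: dict) -> list:
    """Return a list of schema violations; an empty list means valid."""
    errors = []
    if not isinstance(payload.get("response"), str):
        errors.append("response must be a string")
    if not isinstance(payload.get("is_complete"), bool):
        errors.append("is_complete must be a boolean")
    if payload.get("countries") not in ALLOWED_COUNTRIES:
        errors.append("countries must be one of the allowed values")
    return errors
```

The enum check is what stops a hallucinated destination cold: a country outside the allowed set fails validation before the response is ever rendered.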

Importantly, the model had no direct access to the database and could not independently trigger operational actions.

Human oversight in a luxury context

Luxury travel remains a human-centered service. While the AI streamlined onboarding and structured preferences, destination experts remained responsible for reviewing and finalizing proposals.

This hybrid approach ensured that:

  • Personalization remained meaningful

  • Brand integrity was preserved

  • Operational risk was minimized

AI accelerated the process without replacing domain expertise.

Deliberate constraints and trade-offs

An important part of the system design involved defining what the AI would not do.

We deliberately prevented the model from:

  • Estimating dynamic pricing

  • Making availability guarantees

  • Suggesting unverified partners

  • Generating experiential details not supplied by experts

These constraints limited certain forms of personalization but significantly increased reliability.

We intentionally implemented a strict-scope architecture, explicitly defining the LLM's boundaries to ensure it only performs authorized tasks. This approach significantly enhances security by making the system more resilient to prompt injection and 'jailbreaking' attempts. By forcing the model to focus on a single objective—extracting specific user data for our predefined product offerings—we not only improved response latency and output quality but also prevented the model from making unauthorized promises or suggesting external travel options. This guardrail is crucial for maintaining brand credibility and ensuring that our service quality remains consistent.

Lessons for building reliable AI-powered MVPs

For founders and product teams integrating LLMs into early-stage products, especially in trust-sensitive industries, several lessons stand out:

  • Separate conversational language generation from operational decision-making

  • Treat the model as a layer within a broader architecture, not as the system itself

  • Define clear boundaries for what the AI is allowed to generate

  • Design fallback flows as part of the initial system design

  • Prioritize validation over fluency

Reliability in AI systems is rarely achieved solely through prompt adjustments. It requires architectural clarity and disciplined separation of responsibilities. 

Reliability is particularly important when building investor-facing products. In AI-driven MVPs, technical control is directly linked to credibility.

Once an MVP proves its value, the next challenge is deciding how to evolve the product and scale its architecture. We explore that phase in “MVP: What’s Next?”

Conclusion

Luxury travel is only one example of a domain where hallucinations pose a significant risk. The same architectural considerations apply to fintech, healthtech, legal platforms, and investor-facing tools.

As AI becomes more embedded in digital products, long-term differentiation will depend less on generative capabilities and more on system reliability, transparency, and control.

Designing those constraints early makes the difference between a conversational demo and a production-ready AI system.

