Frequently Asked Questions

Find answers to common questions about AskBnB's Retrieval-Augmented Generation (RAG) platform, which transforms knowledge about your short-term rental into an AI chat experience, powered by leading LLMs such as ChatGPT, that gives your guests instant answers about your home, rules, checkout, neighborhood, and more.

Can I choose which LLM model is used?

Yes. In Settings > Preferences, you can set any LLM model you would like. The default is "Auto", where AskBnB makes the selection for you, but you can change the default to your preferred model, including the latest from ChatGPT, Gemini, Grok, and more. New models are added as they are released.

How does the Google Drive / OneDrive integration work?

This integration makes it easy to keep files in your Google Drive or MS OneDrive auto-synced with collections on the AskBnB platform, so any change to a document is synced automatically every 24 hours.

  • Auto-Sync: When choosing a file from Google Drive or MS OneDrive, be sure to click the "Auto-sync" checkbox before clicking the "Vectorize" button. AskBnB will monitor the synced file for changes and automatically update your AskBnB collection every 24 hours. This ensures your AI always has the latest information and you never have to manually upload the same document again.
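
For illustration, here is a minimal sketch of what such a 24-hour sync job could look like. The drive listing, vectorize() function, and file name are hypothetical stand-ins, not AskBnB's actual internals:

    import time

    SYNC_INTERVAL_SECONDS = 24 * 60 * 60  # re-check synced files every 24 hours

    # Hypothetical stand-in for a drive API listing: file id -> last-modified time.
    remote_files = {"house-rules.docx": 1_700_000_000.0}
    last_seen: dict[str, float] = {}  # state carried over from the previous pass

    def vectorize(file_id: str) -> None:
        # Placeholder for re-parsing, chunking, and re-embedding the document.
        print(f"re-vectorizing {file_id}")

    def sync_pass() -> None:
        for file_id, mtime in remote_files.items():
            if mtime > last_seen.get(file_id, 0.0):  # changed since last pass
                vectorize(file_id)
                last_seen[file_id] = mtime

    # Daemon loop: one pass per day.
    while True:
        sync_pass()
        time.sleep(SYNC_INTERVAL_SECONDS)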

Overview

Retrieval-Augmented Generation (RAG) is a technique that enhances large language models (LLMs) by allowing them to retrieve and use information that was not included in their training data. While an LLM’s knowledge is fixed at its training cutoff, RAG enables access to personal, proprietary, or up-to-date data at query time.

How RAG Works

RAG processes your documents by splitting them into chunks and converting each chunk into a vector embedding, a numerical representation that captures semantic meaning. When a user asks a question, the system retrieves the most relevant document chunks and supplies them to the LLM as context before generating a response.
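
As a rough illustration (not AskBnB's internal code), the sketch below embeds a few chunks, scores them against a query by cosine similarity, and hands the best match to the LLM as context. The embed() function is a toy stand-in for a real embedding model:

    import math

    def embed(text: str) -> list[float]:
        # Toy stand-in: counts of a few keywords. A real system would call an
        # embedding model that returns a high-dimensional vector.
        vocab = ["wifi", "checkout", "parking", "pool"]
        return [float(text.lower().count(w)) for w in vocab]

    def cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    chunks = [
        "The wifi password is posted on the fridge.",
        "Checkout is at 11am; please start the dishwasher.",
        "Street parking is free after 6pm.",
    ]
    vectors = [embed(c) for c in chunks]  # done once, at vectorization time

    query = "What time is checkout?"
    q = embed(query)
    top = max(range(len(chunks)), key=lambda i: cosine(q, vectors[i]))
    prompt = f"Answer using ONLY this context:\n{chunks[top]}\n\nQuestion: {query}"
    print(prompt)  # context handed to the LLM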

When a RAG Platform Is Useful

A dedicated RAG platform is especially valuable when you:

  1. Need the AI model to know information it was not trained on
  2. Work with large documents or many documents that exceed the model’s context window
  3. Require long-term memory over large knowledge bases
  4. Use file formats not natively supported by LLMs (e.g., Google Docs, MS Office)
  5. Want to expose your domain knowledge through a queryable AI interface

Why RAG Is Powerful

RAG excels at precision by:

  • Performing semantic similarity searches across vectorized documents
  • Identifying the most relevant content in high-dimensional vector space
  • Supplying only the most pertinent information to the LLM
  • Reducing hallucinations by grounding responses in YOUR data

The Result

RAG enables highly targeted, accurate responses—even across massive document collections. Well-defined and specific questions produce the best results, as they more effectively map to the most relevant vectors containing the answer.

What file formats does AskBnB support?

AskBnB offers the broadest file format support of any RAG platform on the market, with special handling for each type. Supported formats include, but are not limited to, Google files (Docs, Sheets, Slides), MS Office files (.docx, .xlsx, .pptx), images, video files, audio files, CSV, multi-tab XLSX files, PDF, Markdown, and more.
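
Conceptually, each format is routed to its own handler before the text reaches the chunking and embedding steps. The sketch below is illustrative only; the parser names are hypothetical, not AskBnB's actual pipeline:

    from pathlib import Path

    # Hypothetical per-format handlers; real ones would extract plain text.
    def parse_pdf(path: Path) -> str:
        return "text extracted from PDF"       # placeholder

    def parse_docx(path: Path) -> str:
        return "text extracted from Word doc"  # placeholder

    def transcribe_audio(path: Path) -> str:
        return "transcript of audio file"      # placeholder

    PARSERS = {".pdf": parse_pdf, ".docx": parse_docx, ".mp3": transcribe_audio}

    def to_plain_text(path: Path) -> str:
        # Dispatch on file extension; unknown types are rejected early.
        parser = PARSERS.get(path.suffix.lower())
        if parser is None:
            raise ValueError(f"unsupported format: {path.suffix}")
        return parser(path)

    print(to_plain_text(Path("house-manual.pdf")))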

Will the AI hallucinate answers?

No. Agentic safeguards are put in place to prevent hallucinations. If the AI cannot find the answer in the provided context from your collection, rather than hallucinate an answer, it will respond with "The answer is not contained in the provided context."
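
In practice, this kind of safeguard starts with a grounding instruction in the prompt. The sketch below is our paraphrase of the idea, not AskBnB's exact prompt:

    # The model is instructed to refuse rather than guess when the
    # retrieved context does not contain the answer.
    GUARDED_PROMPT = """You may use ONLY the context below to answer.
    If the answer is not in the context, reply exactly:
    "The answer is not contained in the provided context."

    Context:
    {context}

    Question: {question}"""

    def build_prompt(context: str, question: str) -> str:
        return GUARDED_PROMPT.format(context=context, question=question)

    print(build_prompt("Checkout is at 11am.", "Is late checkout possible?"))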

Same Models, Different Knowledge Sources

They are the same models, but they draw on different knowledge sources. A "regular" LLM like ChatGPT, used directly, relies on the public internet data it was trained on. The AskBnB platform augments that knowledge with YOUR data: the documents you vectorized to a collection on the AskBnB platform. In this case, the LLM MUST ground its answer in the source data you provided.

What Makes this Unique

  • This is what turns a model like ChatGPT into your own personal LLM: an expert on anything and everything you want it to know about your short-term rental and neighborhood.

  • When you ask the LLM a question about any data you vectorized to your "AI library" on the AskBnB platform, the LLM does not attempt to fetch the answer from its generalized pretraining data set. Instead, it fetches the answer from the source material you vectorized to a collection on the AskBnB platform. If the answer does not exist in your "AI library", it will say so instead of hallucinating an answer.

  • The real power of the AskBnB platform is in the vector embeddings, vector database, and retrieval techniques that fetch only the relevant parts of any document to the LLM as context to answer the query. This is the "magic" of a RAG system: retrieving only the relevant context the AI needs to respond to your query, even if that information is buried across hundreds or thousands of documents in a collection.

Platform Capabilities

The AskBnB platform enables you to:

  • Curate your "AI library" from almost any source and format: hundreds to thousands of documents, videos, audio files, web URLs, social media posts, images, and more. You (or anyone else you share with) can then use a leading LLM to query that data and perform jobs.

  • When AI answers your query, it provides links to the source material, where you can see the plain-text file that was parsed from the original format, or the URL if you sourced the information from the web or social media. This feature is turned OFF for any collection you share to a dedicated URL, so guests cannot access the source material in your collection.

Do guests need to download an app or register?

No. AskBnB is browser-based and mobile-friendly, so when guests scan your QR code, they do not have to download an app or register. They can simply start speaking with the AI immediately to get answers.

The Secret Sauce of High-Performance RAG

Pre-processing and post-processing are the secret sauce of a high-performing RAG system.

Pre-processing Elements

Before we vectorize your data, we clean it up and structure it in a way that maximizes retrieval accuracy. This involves things like:

  • Context Continuity: Breaking documents into chunks that each become their own vector embedding can leave large holes in context. We use sophisticated techniques and agentic workflows to ensure these chunks maintain context continuity across vectors. This also means special handling for different data sources (e.g., chunking a video transcript is different from chunking a spreadsheet).

  • Structure Retention: We preserve the original document's structure (headings, lists, tables) in a format the LLM can understand (Markdown). This is crucial for retrieval to capture the relationships between different parts of a document.

  • Metadata Addition: We add metadata (like the source URL, document type, date, etc.) to each chunk. This allows for powerful filtering during retrieval. A simplified sketch of these steps appears after this list.
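
The sketch below illustrates two of these ideas, overlap-based chunking and metadata tagging, in simplified form (illustrative only, not AskBnB's internals):

    def chunk_with_overlap(text: str, size: int = 200, overlap: int = 40) -> list[str]:
        """Split text into fixed-size chunks that share `overlap` characters,
        so a sentence cut at a boundary still appears intact in one chunk."""
        step = size - overlap
        return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

    def with_metadata(chunks: list[str], source: str, doc_type: str) -> list[dict]:
        # Attach metadata to every chunk so retrieval can filter by source/type.
        return [
            {"text": c, "source": source, "doc_type": doc_type, "chunk_index": i}
            for i, c in enumerate(chunks)
        ]

    doc = "House rules: no smoking indoors. Quiet hours start at 10pm. " * 20
    records = with_metadata(chunk_with_overlap(doc), source="rules.docx", doc_type="docx")
    print(len(records), records[0]["text"][:60])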

Post-processing Techniques

After we retrieve the most relevant chunks, we further refine them before sending them to the LLM. This includes:

  • Re-ranking: We use additional algorithms to re-order the retrieved chunks based on relevance (a toy sketch of several of these post-processing steps follows this list).

  • Vector Overlap Removal: We eliminate redundant information between chunks.

  • Chunk Stitching: We intelligently combine related chunks to provide cohesive context for the LLM.

  • Contextual Irrelevance QA: We use agentic workflows to identify and discard any retrieved chunks that, despite seeming relevant based on vector similarity, are actually irrelevant to the user's query.

  • Metadata Injections: We can include relevant metadata in the context provided to the LLM, further improving its understanding.
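
Here is a toy sketch of re-ranking, overlap removal, and chunk stitching. The word-overlap scoring is deliberately simplistic; a production system would use a re-ranking model, and this is not AskBnB's actual algorithm:

    def rerank(chunks: list[str], query: str) -> list[str]:
        # Stand-in relevance score: words shared with the query.
        q = set(query.lower().split())
        return sorted(chunks, key=lambda c: len(q & set(c.lower().split())), reverse=True)

    def drop_overlaps(chunks: list[str]) -> list[str]:
        kept: list[str] = []
        for c in chunks:
            # Discard a chunk if it is contained in (or contains) one we kept.
            if not any(c in k or k in c for k in kept):
                kept.append(c)
        return kept

    def stitch(chunks: list[str]) -> str:
        # Join the surviving chunks into one cohesive context block.
        return "\n---\n".join(chunks)

    retrieved = [
        "Checkout is at 11am.",
        "Checkout is at 11am.",  # redundant duplicate
        "Please leave the key in the lockbox.",
    ]
    context = stitch(drop_overlaps(rerank(retrieved, "when is checkout")))
    print(context)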

Why It Matters

Without proper pre- and post-processing, RAG systems often return irrelevant or incomplete results. Our expertise in these areas is what makes AskBnB's retrieval so accurate and reliable.

How does AskBnB handle spreadsheets and tables?

AskBnB uses specialized handling for tabular data, including spreadsheets with multiple tabs.

  • Structure Preservation: AskBnB retains the table structure with structured markdown prior to the vector embeddings step, so the LLM understands the relationships between rows and columns.
  • Multi-Tab Support: AskBnB handles multi-tab XLSX files, preserving the structure of each tab. This is crucial for complex spreadsheets where data is organized across multiple sheets.
  • Querying Tables: You can ask questions that require the LLM to analyze and synthesize information from tables, just like you would with any other data type. A simplified sketch of the structure-preservation step follows this list.
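
For illustration, here is one way a multi-tab workbook could be rendered as Markdown tables before embedding, so row and column relationships survive. The tab names and data are made up, and this is a sketch rather than AskBnB's actual converter:

    def sheet_to_markdown(name: str, rows: list[list[str]]) -> str:
        # Render one tab as a Markdown section with a pipe table.
        header, *body = rows
        lines = [f"## Tab: {name}",
                 "| " + " | ".join(header) + " |",
                 "|" + "---|" * len(header)]
        lines += ["| " + " | ".join(r) + " |" for r in body]
        return "\n".join(lines)

    workbook = {
        "Amenities": [["Item", "Location"], ["Wifi router", "Living room"]],
        "Contacts": [["Role", "Phone"], ["Plumber", "555-0100"]],
    }
    markdown = "\n\n".join(sheet_to_markdown(n, r) for n, r in workbook.items())
    print(markdown)  # this Markdown, one section per tab, is what gets chunked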

How secure is my data?

AskBnB utilizes the highest security standards and is SOC 2 Type II, HIPAA, and GDPR compliant.

  • End-to-End Encryption: Any data in motion is encrypted end-to-end.
  • Google Cloud Storage: For data at rest, volumes are stored on Google Cloud Storage using Google's security protocols.
  • Data Privacy: AskBnB does not sell your data or use your data for any purpose other than providing the AskBnB service. AskBnB does not train any AI models on your data.
  • Access Control: You can optionally require guests to enter an access code before chatting with a collection you have shared publicly to a dedicated URL.

Still have questions?

Our team is here to help. Reach out to us and we'll get back to you as soon as possible.