Which method should I choose for a chatbot that answers questions about a specific service?

Let’s imagine that there is Bank X. I want to create a chatbot that should answer user questions about this bank. How to get a loan, what are the conditions, why is my card blocked, and so on.
What’s the best approach?

  1. Prepare a dataset in advance with detailed information about Bank X. Split the documents into chunks, embed each chunk, and store the embeddings in a vector database. When a user sends a request, use semantic search to find the most relevant chunks, then generate a response from them with a language model (for example, Mistral or Llama).

  2. From the pre-prepared dataset, generate a new dataset consisting of question–answer pairs. For example, for document 1, generate 5 questions and answers that refer to that document, producing records in the format context (document) : question : answer.
    Then fine-tune Mistral on this dataset, so the resulting model generates answers to user questions directly.

Please advise me on which option is best in terms of cost, accuracy, response speed, scalability, and so on.
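To make option 2 concrete, here is a rough sketch of the dataset format I have in mind, in plain Python. The `generate_qa_pairs` helper is hypothetical and stubbed out; a real version would prompt an LLM such as Mistral to write grounded questions and answers:

```python
import json

# Hypothetical synthetic-QA builder for option 2. The names here are
# illustrative, not from any specific library.

def generate_qa_pairs(document: str, n: int = 5) -> list[tuple[str, str]]:
    # Stub: a real implementation would prompt an LLM to write n
    # question/answer pairs grounded in `document`.
    return [(f"Question {i} about the document?", f"Answer {i}.")
            for i in range(1, n + 1)]

def build_dataset(documents: list[str]) -> list[dict]:
    # One training record per generated pair, in the
    # context : question : answer format described above.
    records = []
    for doc in documents:
        for question, answer in generate_qa_pairs(doc):
            records.append({"context": doc, "question": question, "answer": answer})
    return records

docs = ["Bank X charges a 1% fee on early loan repayment."]
dataset = build_dataset(docs)
print(json.dumps(dataset[0], indent=2))
```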


Hi there,
Based on your proposed approaches, I assume your data is fixed and you just need a chatbot that answers user questions by extracting information from a predefined dataset.
Your first approach is good for a quick start: low cost and OK-ish answers. If you're not ready to spend much money on the chatbot, it's a good fit.
The second approach costs more up front but should be better in accuracy.

Response time and scalability are another story, to discuss once you've decided to invest more in the chatbot.


For a Bank X chatbot, the best default is option 1, retrieval-augmented generation (RAG); only fine-tune if you find a clear gap.

Option 1: RAG over bank docs
Cost: low to medium. You pay for embeddings plus LLM calls, but there's no training loop.
Accuracy: highest for policy and product questions, because answers stay tied to the source text.
Speed: usually fast enough if you keep chunks small and cache top results.
Scalability: strong. Update the docs and re-embed; no retraining needed.
Risk: you must do good chunking and citation-style prompting, otherwise it can still hallucinate.
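To make the RAG flow concrete, here is a minimal retrieval sketch in plain Python. The toy bag-of-words "embedding" and in-memory list stand in for a real embedding model and vector database, so the shape of the pipeline (chunk, embed, index, retrieve, prompt) is visible:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(doc: str, size: int = 40) -> list[str]:
    # Split into fixed-size word windows; real systems also overlap chunks.
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# "Index": embed each chunk once, store it alongside the text.
docs = [
    "To get a loan at Bank X you must provide proof of income and a valid ID.",
    "Cards are blocked automatically after three incorrect PIN attempts.",
]
index = [(c, embed(c)) for d in docs for c in chunk(d)]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# The retrieved chunk becomes the grounding context for the LLM prompt.
context = retrieve("why is my card blocked")[0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: why is my card blocked?"
print(context)
```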

Option 2: generate Q&A pairs and fine-tune
Cost: higher. You pay for generating the Q&A pairs, cleaning them, training, and retraining whenever policies change.
Accuracy: can be good for tone and standard flows, but risky for factual policy details unless the dataset is constantly updated.
Speed: can be faster at runtime because there's no retrieval step, but accuracy drift is the real problem.
Scalability: weaker for changing information, because every update requires another training cycle.

Best practical setup in banking:
Start with RAG as the source of truth.
Add a small fine-tune only for style, intent routing, and common workflows (card blocked, loan eligibility steps, escalation).
Use a fallback rule: if retrieval confidence is low, ask a clarifying question or route to a human.

If you share two things, I can recommend an exact configuration:
How often do bank policies change?
Does the bot need to quote sources, or just answer conversationally?
