Build vs. Buy AI for banks: 3 hidden risks of in-house LLM projects in payments (and what to do)
Thought Leadership
Across the banking industry, as in many other sectors, the ambition is clear: "adopt AI." Innovation teams are spinning up impressive Proofs of Concept (PoCs), and internal product teams naturally say, "We could build something for this ourselves."
On the surface, a standardised payment process involving complex documents and rules, such as payment network compliance, looks like a perfect starting point for an AI project.
The ambition is absolutely understandable, but this particular use case is more complex and high-risk than it first appears. Such a niche, standardised but high-stakes domain is a minefield for LLMs. As research by the likes of MIT, reported by Forbes, has shown, many AI projects get stuck in "PoC testing". They look good on a slide deck, but never survive the operational reality.
Before an organisation commits nine months and several engineers and experts to an internal PoC, it's worth looking at the three critical risks we see in in-house AI projects that involve niche standardised processes. In this article, we use the case of payment network compliance to elaborate on those challenges and what should be done about them.
Risk 1: The "Garbage in, garbage out" data-quality trap
The first, and most significant, hurdle is the data itself. A generic AI model is only as good as the data you feed it. In the case of payment network compliance, that data is a chaotic mix of PDFs, technical letters, and bulletins.
While modern AI is great at parsing text, it struggles when it encounters complex tables. Think of Visa's tables or Mastercard's technical specifications, full of merged cells, footnotes, and conditional ticks. A generic LLM-based PDF-to-text parser will fail to parse such structures correctly and, as our own experience has shown, it will even hallucinate: if the table has a complex structure, it is not uncommon for today's LLMs to place ticks in the wrong cells!
This leads to early-stage PoCs where the accuracy of the responses lies between 70% and 80%. And while an 80% accuracy rate sounds fantastic at first, in the world of compliance, a 20% error rate isn't a minor flaw; it's a systemic failure. You will get answers that look correct but are fundamentally wrong, and there is no way of knowing until a compliance auditor finds them.
A purpose-built AI solution, like Kajo, tackles this before the AI is even involved. A team of domain experts (in this case, scheme compliance managers) manually parses, checks, and structures these complex documents into a clean, "LLM-compatible" format. The result is consistently high-quality, curated data for the model. Achieving the same level of reliability with generic, off-the-shelf parsers alone is extremely difficult and currently requires significant additional effort.
Risk 2: The "Document temple" vs. the "Process-aware" expert
Let's assume you solve the data quality problem. The next failure point is context.
Many generic, in-house AI tools end up as what we call a "document temple": you upload a bulletin and "chat" with it. That can be useful, but it's very different from how compliance actually works in day-to-day operations.
The tool doesn't know who you are (Issuer, Acquirer, or both). It doesn't know your specific impact categories, your list of stakeholders, or the status of your open tasks related to that bulletin. A product owner asking, "What does this mandate mean for me?" will get the same generic answer as an IT engineer responsible for updating the authorisation system. This lack of context significantly undermines operational efficiency.
A deeply integrated AI solution, like Kajo Intelligence, is process-aware. It knows who you are when you ask the question. It knows your role and configuration. If a bulletin has different effective dates for issuers and acquirers or different regions, its answer will be tailored to you.
It can even connect to your compliance workflow, allowing you to ask, "What are my most critical open tasks related to this mandate?" A generic tool typically cannot do this without deep integration into your systems and/or knowledge of your processes.
In our experience, the strongest results come when internal teams bring their process knowledge and systems access, and specialised tools bring the domain models and workflows to match.
Risk 3: The hidden iceberg of validation costs
This is the hidden cost that kills internal projects: validation. How do you know the AI's answers are correct, complete, and safe to act on?
You need to build an expert-grade validation dataset. At one European card issuer, an internal AI project involved a dedicated team of 30 call-centre agents spending an hour every day just writing and rating questions and answers to evaluate the model's quality.
Now, look at a typical 4 or 5-person scheme compliance team. They are already operating at full capacity, managing their daily workload. Realistically, how much time can they dedicate to becoming part-time AI trainers and data curators?
This is the core value of specialised solutions: validation is their core business, and they have the expert teams and resources to do it right. Domain experts, in this case people who live and breathe payment network rules, are the ones building the validation sets, judging the answers, and fine-tuning the models continuously. Validation is never "a complete task"; it has to evolve with every AI model change and every new rule and mandate.
The 'Outlook' analogy
Your internal team is brilliant, but their time is best spent on your bank's unique value proposition. A helpful thought experiment: would you ask your team to build a custom AI chatbot for Microsoft Outlook from scratch? Could you outperform the software's provider on quality and features? And, most importantly, are you willing to spend the effort, time, and money to maintain that solution in the future?
Most banks wouldn't, because it's not their core business. The same logic applies to highly standardised, regulation-driven areas like payment network compliance and other payment operations, such as dispute management. Your engineers' time is far better spent building AI systems that differentiate your core value proposition in the market.
The temptation to build is strong, and in some cases, it may absolutely make sense. But in cases like payment network compliance or other standardised processes, the risk of delayed value, budget overrun, and non-compliance exposure is materially higher than it first appears.
Every organisation has to invest in AI to future-proof its business, but when it comes to the build vs. buy decision, it's important to spend time and budget on initiatives that differentiate you from the competition. For standardised processes where there is no competitive advantage, partnering with companies specialised in building purpose-built AI products for those niche, standardised domains can be a faster path to adopting AI and bringing efficiency to your team.
Kajo is our purpose-built AI solution for payment network compliance, built and validated by payment experts and designed to plug into your existing operations. In many organisations, Kajo sits alongside internal initiatives: the bank focuses its build efforts on truly differentiating use cases, while Kajo handles the heavy, standardised compliance work.