Have you been struggling to implement enterprise-ready AI? There’s a secret you need to know in order to get there.

If you want to implement enterprise-ready AI, you must learn the secret behind why today’s models are more powerful than ever yet hallucinate much, much more.

As The New York Times reported last week, OpenAI and Google “don’t know why” their latest models perform remarkably better on math and science benchmarks while simultaneously hallucinating far more often.

This article explains exactly why this is happening. Most importantly, it also explains exactly how to fix it.

Essential AI Vocabulary

ChatGPT captured the world’s attention in November 2022—two and a half years ago. Its success introduced a new vocabulary: generative AI, hallucinations, Retrieval Augmented Generation (RAG), etc.

However, the technology has been evolving so fast that we still need new words to conceptualize it. For example, with today’s vocabulary it’s hard to understand how “better” AI can produce more “hallucinations.” But once we introduce some new vocabulary, the mystery of why “better” AI hallucinates more becomes self-evident.

The tremendous issue that’s perplexing OpenAI, Google, and others becomes apparent the moment you understand three categories of generative AI:

  • Recombinant AI
  • Recall AI
  • Reasoning AI

Once you understand these three types of generative AI, you will be ready to implement 100% accurate, enterprise-ready chatbots. I promise.

Executive Summary

In short, there are three types of generative AI. Each has its own purpose, and each needs to be trained differently.

  • Recombinant AI: Involves generatively mixing learnings into new patterns, from ideation and essay writing to coding and even recombinant summaries.
  • Recall AI: This involves providing accurate answers that are extracted from one or more learned sources (i.e. answering questions through recall).
  • Reasoning AI: This involves making accurate deductions.

ChatGPT launched as a recombinant AI service. It wasn’t intended to be used for factual recall. In fact, the way that it was trained literally makes 100% accurate recall impossible. (More on this below.)

What more and more enterprises are looking for is a different type of generative AI: Recall AI. However, OpenAI and others leapfrogged over training Recall AI in order to pursue Reasoning AI instead.

Training Recall AI requires a precise set of criteria. The farther an LLM’s training diverges from these criteria, the more hallucinations the LLM will produce. And that’s precisely what has been happening. The newer models diverge further from these training criteria; therefore, they hallucinate much more often.

This fact bears repeating:

Training Recall AI requires a precise set of criteria. The farther an LLM’s training diverges from these criteria, the more hallucinations the LLM will produce. And that’s precisely what has been happening. The newer models diverge further from these training criteria; therefore, they hallucinate much more often.

Fortunately, once these criteria are adhered to, the result is 100% accurate Recall AI: the AI that enterprises have been looking for all along.

ChatGPT: Dawn of Recombinant LLMs

Ilya Sutskever, cofounder of OpenAI, was surprised by ChatGPT’s success.

Sutskever thought people would find ChatGPT “boring” because it was designed to be recombinant (not useful for accurate recall). As Sutskever stated:

When you asked it a factual question, it gave you a wrong answer. I thought it was going to be so unimpressive that people would say, ‘Why are you doing this? This is so boring!’ — Ilya Sutskever regarding the launch of ChatGPT

[Image: Ilya Sutskever, AI-generated cartoon depiction]

If you have been struggling to implement enterprise AI, it’s important to know the following: ChatGPT was not originally intended to provide factual answers to questions. It was intended to offer recombinant processing of source materials, not accurate recall of them.

Nevertheless, ChatGPT revolutionized AI. After all, there is tremendous value in Recombinant AI for ideation, coding, and more. It is also spectacular for image and video generation. All of these areas have been truly transformed by Recombinant AI.

However, Recombinant AI must be trained stochastically in order to fulfill its purpose. In other words, it must be trained to be probabilistic (the antithesis of deterministic).

Hallucinations Are Inherent In Recombinant AI

Recall AI must be deterministic. Every question must be answered directly from the provided sources without deviation.

The stochastic training used in recombinant AI produces degrees of deviation. Such deviation inherently results in hallucinations. In fact, you can think of hallucinations as deviation errors.
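
To make “deviation” concrete, here is a minimal, self-contained sketch (using toy next-token scores, not any vendor’s actual decoder) contrasting deterministic greedy decoding with the stochastic temperature sampling used by Recombinant AI:

```python
# Minimal sketch: deterministic vs. stochastic token selection.
# The logits below are hypothetical next-token scores, where token 0
# is the "faithful" continuation of the source text.
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # for numerical stability
    p = np.exp(z)
    return p / p.sum()

def greedy_decode(logits):
    """Deterministic: always picks the single highest-scoring token."""
    return int(np.argmax(logits))

def sample_decode(logits, temperature=1.0, rng=None):
    """Stochastic: can pick a lower-probability token on any given run."""
    rng = rng or np.random.default_rng()
    p = softmax(logits, temperature)
    return int(rng.choice(len(p), p=p))

logits = [4.0, 2.5, 1.0]
print([greedy_decode(logits) for _ in range(5)])  # always [0, 0, 0, 0, 0]
print([sample_decode(logits) for _ in range(5)])  # occasionally returns 1 or 2
```

Every time sampling picks token 1 or 2 instead of token 0, the output drifts away from the source. Over the course of a full answer, those small drifts are what surface as hallucinations.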

Square Peg, Meet Round Hole

OpenAI and other LLM makers quickly realized the mass interest in using chatbots for question answering (Q&A). Therefore, they sought to add this capability to their models. The problem is that these LLM makers tried to force stochastic Recombinant AI to perform the deterministic recall task.

Where deviation errors do not occur, the results of the models are stunning. However, when deviation errors occur, the models’ results can be utterly ridiculous. That’s why ChatGPT can be both stunning and ridiculous at the same time.

At first, LLM makers kept trying to force the recombinant architectures to produce accurate recall. But they hit an inevitable ceiling, as demonstrated by the disappointing release of GPT-4.5.

In view of GPT-4.5’s dismal performance, OpenAI officially declared that GPT-4.5 is the last of its recombinant models. OpenAI is now exclusively pursuing reasoning models instead.

o1: Dawn of Reasoning LLMs

OpenAI has officially shifted its focus to reasoning models, which are designed for deduction. Training Reasoning AI involves techniques like the following (a minimal sketch appears after the list):

  • Chain-of-thought (CoT) prompting or training.
  • Intermediate step supervision (e.g., supervising intermediate thoughts, not just final answers).
  • Private chain of thought (as in o3): the model reasons internally before generating an answer.
  • Enhanced tool use, planning modules, or scratchpads for intermediate computation.
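
For concreteness, here is a minimal chain-of-thought prompting sketch. The `call_llm` function is a hypothetical placeholder for whatever client your provider exposes; the prompt pattern, not the API, is the point:

```python
# Minimal chain-of-thought (CoT) prompting sketch.
# `call_llm` is a hypothetical placeholder -- wire it to your provider.

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call (e.g., an HTTP request)."""
    raise NotImplementedError("connect this to your LLM provider")

def answer_with_cot(question: str) -> str:
    # Asking the model to produce intermediate reasoning steps before
    # the final answer is the essence of CoT prompting.
    prompt = (
        "Answer the question below. First reason step by step, then "
        "give the final answer on its own line prefixed with 'Answer:'.\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```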

While such techniques do indeed improve deduction, they cause training to diverge further from the criteria needed to train Recall AI. This increased divergence causes increased hallucinations. This is the answer to the tremendous issue that is currently perplexing OpenAI, Google, and other LLM makers.

This bears repeating:

Such techniques do indeed improve deduction. However, these techniques cause training to diverge further from the criteria needed to train Recall AI. This increased divergence causes increased hallucinations.

Yes, recall hallucinations are indeed “worse than ever.” However, now that we know the cause, we also know the solution.

BSD: Dawn of Recall LLMs

I work at a company called Acurai Inc. Acurai has taken the road less travelled. At Acurai, we focus on the boring side of AI—100% accurate recall.

I have received permission from Acurai’s CEO (Adam Forbes) to publish every detail of our company’s proprietary Bounded-Scope Deterministic (BSD) Models—the first models in the category of Recall AI.

You can read the entire series here: “100% Accurate AI Step-by-Step (Part One): BSD Neural Networks.”

In short, BSD introduces deterministic training to natural language models, thereby producing 100% consistent results. Everything is disclosed in the series linked above.

Why RAG Fails

Retrieval Augmented Generation (RAG) is the most popular approach to addressing hallucinations. However, it routinely fails to eliminate hallucinations.

On the surface, the intuition for RAG seems sound: If I send the facts to the LLM, then it cannot hallucinate when providing the answer.
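
Here is that intuition in code form: a schematic RAG pipeline with a toy keyword retriever and a hypothetical `call_llm` placeholder (both are illustrative assumptions, not any particular product):

```python
# Schematic RAG pipeline: retrieve facts, then hand them to the LLM.
# DOCS, the retriever, and `call_llm` are all illustrative stand-ins.

DOCS = [
    "Acme Corp was founded in 1999.",
    "Acme Corp is headquartered in Toronto.",
    "Acme Corp's flagship product is the Widget Pro.",
]

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Toy retriever: rank documents by word overlap with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(q_words & set(d.lower().split())))
    return ranked[:top_k]

def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM client call."""
    raise NotImplementedError("connect this to your LLM provider")

def answer(question: str) -> str:
    context = "\n".join(retrieve(question))
    # The facts are right there in the prompt...
    prompt = f"Answer using ONLY this context:\n{context}\n\nQuestion: {question}"
    # ...but the model that reads them is still a stochastic decoder.
    return call_llm(prompt)
```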

So why does the LLM still hallucinate? The answer is that you are sending the facts to a recombinant processor that deviates from the provided facts. Such deviation is inherent in its stochastic training.

This is why LLMs fail to produce accurate answers even when you provide the answers using RAG.

Build Enterprise-Ready AI… Today

With BSD, enterprise-ready AI is finally available. Recall AI is what enterprises have been looking for.

If you want to know every step in building 100% accurate Recall AI, I encourage you to read the entire series. Enterprise-ready AI is already here. You just have to know where to look. 🙂