Nov 22, 2024

When did RAG stop being about traffic lights? An AI explainer

What is RAG AI, and why should you care? It combines the strengths of different AI innovations to make them much more useful: more reliable and more efficient. Find out why.


RAG AI, or retrieval-augmented generation AI, is like many technological advances we have quickly accepted – like using a search engine to find information or booking a ride through Uber. Just because we can’t immediately see the complexity of the algorithms behind it doesn’t mean it’s not a big jump forward. It’s easy – too easy – to overlook the nuances.

In the AI innovation world, it’s much easier to get excited about things like robots. Better yet: agents* – because you’re picturing James Bond and because the tech is more analogous to a job we kind of understand. Although, arguably, robots and agents both take us back toward the ‘AI is sinister AF’ PR image problem.

We’ll get on to agents next week. First, I’m going to try to make you care about RAG, despite it having a name that is neither cool nor immediately helpful, especially if you’ve been red/amber/greening your risk register for your whole career.

I acknowledge this is like asking my cat to care about our Hermes – sorry, Evri – driver: Christmas (the cat) is not interested. From the cat’s point of view it’s just a bit of magic. The driver brings the kibble, but Christmas isn’t interested in the complex manufacturing and distribution network behind it that took years to develop.

How to handle a foul-mouthed gift

Imagine you have a very clever parrot that can repeat anything you say and easily fools you into thinking he knows what he’s saying. We will call him Claud. He’s going to be our analogy for a large language model (LLM), which can generate text based on the data it has been trained on.

The parrot is impressive, but sometimes it gets things wrong or misses the context. Think of him as your chatty friend who knows a lot but occasionally gets their facts mixed up.

Now imagine the parrot has moved on from living at your Nan’s house, where he picked up her bad habits and took to unleashing profanities at strangers. You get him a skilled trainer who teaches the parrot to say useful phrases. This trainer is like a generative pre-trained transformer (GPT), fine-tuning the LLM to produce relevant and coherent responses.

The trainer helps the parrot become more accurate and effective. Imagine the trainer as your friend who gently corrects the parrot.

But even the cleverest parrot and the best trainer can make mistakes. They get carried away. Enter RAG in the form of a wiser, more experienced avian sensei: an African Grey who oversees the whole process and makes sure everything runs smoothly. This parrot is our RAG AI, which brings out the best of the LLM and GPT, but with some much-needed oversight and management skills.

Our RAG sensei – who I will call Sandi Toksvig – is still pretty relaxed about colourful language as long as the audience is, but stops the excessive time-wasting and makes sure that Claud plays the role he was born to play. Everyone is still having fun, but the guardrails save Claud and his trainer a lot of time and effort because they are more focused on the job in hand. And Claud's audience is better served.

How does it actually work?

That’s enough parrot analogies for now; we’re adults. Here’s what’s happening:

RAG is an AI framework that enhances the capabilities of large language models (LLMs) by incorporating defined knowledge sources. The process has four core technical components, sketched in code after the list:

  1. Embedding model: Converts documents into vector representations so they can be compared and searched efficiently.
  2. Retriever: Acts as a search engine, using the embedding model to fetch relevant document vectors based on the query.
  3. Re-ranker (optional): Evaluates retrieved documents for relevance, assigning scores to each.
  4. Language model: Generates a response using the top retrieved documents and the original query.
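
If you’d like to see those four components as moving parts, here’s a minimal, illustrative Python sketch. Everything in it is a stand-in: the word-count ‘embedding’, the tiny in-memory document list and the stubbed generate() call are hypothetical placeholders for a real embedding model, a vector database and an LLM API – but the shape of the pipeline is the same.

```python
from collections import Counter
import math

# 1. Embedding model (toy stand-in): turn text into a sparse word-count vector.
#    A real system would use a trained embedding model here.
def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# A tiny curated knowledge source, embedded up front (a vector database in real life).
DOCUMENTS = [
    "Safeguarding referrals must be logged within 24 hours.",
    "Annual leave requests go to your line manager first.",
    "RAG combines retrieval from trusted documents with text generation.",
]
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

# 2. Retriever: fetch the documents whose vectors best match the query.
def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# 3. Re-ranker (optional): re-score the retrieved documents for relevance.
def rerank(query: str, docs: list[str]) -> list[str]:
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)

# 4. Language model (stubbed): generate a response from the query plus the top documents.
def generate(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in docs)
    # In practice this prompt is sent to an LLM API.
    return f"Answer '{query}' using only these sources:\n{context}"

query = "What is RAG?"
print(generate(query, rerank(query, retrieve(query))))
```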

It operates in two main phases:

  1. Retrieval: Algorithms search for and retrieve relevant information snippets based on the user's prompt.
  2. Generation: The LLM uses the augmented prompt (original query + retrieved information) to synthesise a response – see the sketch below.
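
To make that ‘augmented prompt’ concrete, here’s a hedged sketch of what typically gets sent to the language model in the generation phase. The hard-coded snippets, the source labels and the call_llm() placeholder are assumptions for illustration; the point is that the model is asked to answer only from the supplied sources, to cite them, and to say when they don’t contain the answer.

```python
# Phase 1 output: snippets the retriever pulled back for the user's question
# (hard-coded here for illustration; in practice they come from the retrieval step).
retrieved = [
    ("Staff handbook, s.4", "Annual leave requests go to your line manager first."),
    ("HR policy 2024", "Leave must be requested at least two weeks in advance."),
]
question = "How do I book annual leave?"

# Phase 2: build the augmented prompt – the original query plus retrieved context.
sources = "\n".join(f"[{ref}] {text}" for ref, text in retrieved)
augmented_prompt = (
    "Answer the question using ONLY the sources below. "
    "Cite the source reference for each claim. "
    "If the sources don't contain the answer, say so.\n\n"
    f"Sources:\n{sources}\n\n"
    f"Question: {question}"
)

def call_llm(prompt: str) -> str:
    # Placeholder for whichever LLM API the system uses.
    return "(model response would appear here)"

print(call_llm(augmented_prompt))
```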

This approach allows LLMs to access up-to-date information beyond their training data, improving accuracy and reducing hallucinations. It also enables citation of sources, building transparency and trust in the model's outputs.

That’s why RAG AI improves the usefulness of LLMs and GPTs – tools like ChatGPT – by pulling in the most relevant information from curated sources, rather than just manufacturing a plausible answer from a big pot of word soup.

Together, they create a powerful (and trustworthy, and efficient) AI system:

  • LLM (the parrot): Generates text based on training data, but sometimes misses the mark.
  • GPT (the trainer): Fine-tunes the LLM to produce coherent responses.
  • RAG (the wise old bird): Makes sure the system retrieves relevant information from trusted sources, so that what you get back is accurate and contextually appropriate. Like a librarian who stops you wandering off and getting distracted by the comics, and finds you the reference you actually needed, from a reliable source.

Why this matters

If you want to use AI for things that you need to rely on, the benefits of RAG AI are immense. We build RAG AI solutions that help social workers navigate guidance and regulations, help companies analyse data, help college staff follow correct procedures… None of those people need a tool that might make stuff up.

RAG AI can also help reduce the computational resources needed for generating accurate responses – right-sizing your solution to your problem and limiting environmental impact. But crucially, by retrieving relevant information from a database, the system can avoid generating unnecessary or incorrect data, saving your precious time and energy.

RAG AI fundamentally improves the usefulness of LLMs and GPTs. It means you can give the AI a specific, useful job to do, moving on from ‘make stuff’ to ‘make this kind of stuff, using these resources.’

RAG in practice

RAG AI has potential applications across almost every industry. In customer service, it provides accurate responses at scale, improving satisfaction and efficiency. Writers and marketers can generate high-quality, relevant content, while researchers and analysts can extract valuable insights from large datasets.

We’re all learning as we go, but RAG assistants like the engines we build for customers aren’t prototypes any more – they’re rapidly becoming everyday tools. You can always play with our schools engine for free to get an idea, or watch some of our free demos, then drop us a line.

*Agents will be the topic next week but in the over-worked parrot metaphor they’re probably the ringmaster: they coordinate and manage the different components to achieve a cohesive performance.

Engine builds private AI tools that keep things simple and honest: a private GPT trained on content you curate and your brand guidelines, providing verified answers. Email us at info@engine-ai.co.uk.
