RAG Explained (Retrieval Augmented Generation) | Gen AI / LLMs | Tech in Two Ep1
Author: HandyAndy Tech Tips
Uploaded: 2025-09-17
Views: 373
In this video, I'll explain how RAG (Retrieval Augmented Generation) works in the context of Gen AI - in only two minutes! You'll learn how RAG allows information to be retrieved from accurate source documents and then ‘injected’ into the generated response, improving accuracy.
--
If you enjoyed this video, please SUBSCRIBE to HandyAndy Tech Tips!
--
If you want more in-depth information, here are the sources I used for this video:
https://cloud.google.com/discover/wha...
https://arstechnica.com/information-t...
https://www.ai-bites.net/retrieval-au...
https://learn.microsoft.com/en-us/azu...
/ introduction-to-rag-retrieval-augmented-ge...
https://weaviate.io/blog/vector-embed...
--
Image sources:
Rag on grass - https://picryl.com/media/rags-cloth-r...
Library - Roman Eisele, CC BY-SA 4.0, via Wikimedia Commons. https://upload.wikimedia.org/wikipedi...
Vector embedding diagram - https://weaviate.io/blog/vector-embed...
--
OK, so what is a RAG? It stands for Retrieval Augmented Generation, and it’s a technique used in conjunction with LLMs, or large language models.
Basically, the problem with current LLMs is that they can hallucinate, or make things up. Why does this happen? Well, they’re trained on a set of data, like a bunch of websites or e-books, and then they find patterns in this data which allow them to create new text. But if you ask them about something that didn’t appear very often, if at all, in their training dataset – like a recent news story, or specific information about your company – they won’t be able to give a good answer.
This is where RAG comes in. It allows information to be retrieved from accurate source documents and then ‘injected’ into the generated response, improving accuracy.
This is how it works. It starts with a corpus, or collection, of documents. These documents are divided into smaller sections called chunks, so that each piece is small enough for the LLM to process. Then a vector embedding is generated for each chunk. A vector embedding is a numeric representation of text which captures relationships between words and lets you calculate how similar one piece of text is to another. These embeddings are stored in a special kind of database called a vector database.
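To make that concrete, here's a minimal Python sketch of the indexing side. The chunk size, the toy bag-of-words "embedding" (a stand-in for a real neural embedding model), and the plain list acting as the "vector database" are all illustrative assumptions, not any particular product's API.

```python
from collections import Counter

def chunk(document: str, size: int = 40) -> list[str]:
    # Split a document into fixed-size word chunks so each piece
    # stays small enough for the model to process.
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(text: str) -> Counter:
    # Toy "embedding": a sparse bag-of-words count vector.
    # A real system would call a neural embedding model here.
    return Counter(text.lower().split())

# The "vector database": each entry pairs a chunk with its embedding.
corpus = [
    "RAG retrieves relevant chunks from a corpus of source documents.",
    "Vector embeddings turn text into numbers so similarity can be computed.",
]
index = [(c, embed(c)) for doc in corpus for c in chunk(doc)]
```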
Now, what happens when you ask a question of the LLM? Well, your text query is also converted into a vector embedding, and then the RAG system searches the vector database to find the chunks that are the most similar to the query, and fetches the top results. This is the ‘retrieval’ part of RAG. The most relevant chunks are then added to your query as additional context, in order to ‘augment’ it. And then the model will ‘generate’ an answer that includes information from the chunks. An additional benefit of this approach is that, unlike the training data of the model, which is often a bit of a black box, systems using RAG can actually provide citations directly to the source documents that they use.
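Continuing the sketch above (reusing embed and index from it), the retrieve/augment/generate loop might look like this. Cosine similarity over the toy vectors stands in for the vector database's similarity search, and the prompt template is just one illustrative way of injecting the retrieved context.

```python
import math

def cosine(a, b) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, index, k: int = 2) -> list[str]:
    # 'Retrieval': embed the query, rank the stored chunks by
    # similarity to it, and return the top-k matches.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def augment(query: str, chunks: list[str]) -> str:
    # 'Augmentation': prepend the retrieved chunks to the question as
    # extra context before it is sent to the LLM for 'generation'.
    context = "\n".join(f"- {c}" for c in chunks)
    return f"Answer using this context:\n{context}\n\nQuestion: {query}"

question = "How does RAG improve accuracy?"
prompt = augment(question, retrieve(question, index))
print(prompt)  # this augmented prompt is what the LLM actually sees
```

Because the retrieved chunks travel with the prompt, the system also knows exactly which source documents it drew on, which is what makes the citations mentioned above possible.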
So that’s how retrieval augmented generation can improve the accuracy of AI models.