
Attention for Neural Networks, Clearly Explained!!!

Tags: StatQuest, Josh Starmer, Machine Learning, Statistics, Data Science

Author: StatQuest with Josh Starmer

Uploaded: Jun 5, 2023

Views: 345,823

Description:

Attention is one of the most important concepts behind Transformers and Large Language Models, like ChatGPT. However, it's not that complicated. In this StatQuest, we add Attention to a basic Sequence-to-Sequence (Seq2Seq or Encoder-Decoder) model and walk through how it works and is calculated, one step at a time. BAM!!!

NOTE: This StatQuest is based on two manuscripts. 1) The manuscript that originally introduced Attention to Encoder-Decoder Models: Neural Machine Translation by Jointly Learning to Align and Translate: https://arxiv.org/abs/1409.0473 and 2) The manuscript that first used the Dot-Product similarity for Attention in a similar context: Effective Approaches to Attention-based Neural Machine Translation: https://arxiv.org/abs/1508.04025
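
As a rough sketch of the dot-product attention those manuscripts describe (a minimal NumPy illustration, not code from the video; the function name, shapes, and toy numbers are my own assumptions):

import numpy as np

def dot_product_attention(decoder_state, encoder_states):
    """Attend over encoder states for one decoder step.

    decoder_state:  shape (d,)   -- current decoder output vector
    encoder_states: shape (n, d) -- one vector per input token
    """
    # 1) Similarity scores: dot product between the decoder state
    #    and each encoder state (one score per input token).
    scores = encoder_states @ decoder_state            # shape (n,)

    # 2) Softmax turns the scores into attention values that sum to 1.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                           # shape (n,)

    # 3) Output: attention-weighted sum of the encoder states.
    context = weights @ encoder_states                 # shape (d,)
    return context, weights

# Toy example: two input tokens, 2-dimensional states.
encoder_states = np.array([[1.0, 0.5],
                           [0.2, 0.9]])
decoder_state = np.array([0.8, 0.3])
context, weights = dot_product_attention(decoder_state, encoder_states)
print(weights)   # how much each input token matters for this step
print(context)   # the attention output fed to the decoder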

NOTE: This StatQuest assumes that you are already familiar with basic Encoder-Decoder neural networks. If not, check out the 'Quest: Sequence-to-Sequence (seq2seq) Encoder-Decoder Neural Networks, Clearly Explained!!!

For a complete index of all the StatQuest videos, check out:
https://statquest.org/video-index/

If you'd like to support StatQuest, please consider...

Patreon: /statquest
...or...
YouTube Membership: /@statquest

...buying one of my books, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
https://statquest.org/statquest-store/

...or just donating to StatQuest!
https://www.paypal.me/statquest

Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/joshuastarmer

0:00 Awesome song and introduction
3:14 The Main Idea of Attention
5:34 A worked out example of Attention
10:18 The Dot Product Similarity
11:52 Using similarity scores to calculate Attention values
13:27 Using Attention values to predict an output word
14:22 Summary of Attention
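
In equation form (my own generic notation, not necessarily the video's symbols: h_j are the encoder states, s_i is the decoder state at step i), the steps from 10:18 to 13:27 amount to:

\begin{align*}
  e_{ij}      &= s_i^{\top} h_j                              && \text{dot-product similarity score} \\
  \alpha_{ij} &= \frac{\exp(e_{ij})}{\sum_{k} \exp(e_{ik})}  && \text{softmax turns scores into attention values} \\
  c_i         &= \sum_{j} \alpha_{ij}\, h_j                  && \text{attention-weighted sum of encoder states}
\end{align*}

The context vector c_i is then combined with the decoder state s_i to predict the output word.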

#StatQuest #neuralnetwork #attention


Related videos

Transformer Neural Networks, ChatGPT's foundation, Clearly Explained!!!

Sequence-to-Sequence (seq2seq) Encoder-Decoder Neural Networks, Clearly Explained!!!

The math behind Attention: Keys, Queries, and Values matrices

How did the Attention Mechanism start an AI frenzy? | LM3

Recurrent Neural Networks (RNNs), Clearly Explained!!!

RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models

Long Short-Term Memory (LSTM), Clearly Explained

Visualizing attention, the heart of the transformer | Chapter 6, Deep Learning

LLMs and GPT: how do large language models work? A visual introduction to transformers

RAG vs. CAG: Solving Knowledge Gaps in AI Models
