Intuition Behind Self-Attention Mechanism in Transformer Networks

Author: Ark (ark)

Uploaded: 2020-10-16

Views: 218446

Description:

This is the first part of the Transformer Series. Here, I present an intuitive understanding of the self-attention mechanism in transformer networks.

[Paper] Attention Is All You Need: https://papers.nips.cc/paper/7181-att...

Other Resources:
Video Lecture on Word2Vec: Lecture 2 | Word Vector Representations: w...
Great article on Word2Vec: https://jalammar.github.io/illustrate...
