The Secret Sauce Behind GPT4 and ChatGPT, Fully Explained
Author: AI with Alex
Uploaded: Apr 3, 2023
Views: 3,920
We explore how the GPT (Generative Pre-trained Transformer) model works under the hood, and why it's so effective at generating realistic text relevant to your prompt. We dive into how a Transformer works, what the attention mechanism is, and the optimization steps the architecture takes to scale training on large datasets.
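The attention mechanism the video covers can be sketched in a few lines. Below is a minimal, illustrative implementation of scaled dot-product attention as defined in "Attention Is All You Need" (linked in the references); the toy shapes and random inputs are assumptions for the demo, not anything from the video.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Scores: how strongly each query position attends to each key position.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the key axis turns scores into attention weights
    # (rows sum to 1); subtracting the max is for numerical stability.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: each position's result is a weighted average of the values.
    return weights @ V

# Toy example: a sequence of 3 tokens with embedding dimension 4.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # one 4-dimensional output vector per token
```

In a real Transformer, Q, K, and V are learned linear projections of the token embeddings, and GPT additionally masks future positions so each token attends only to what came before it.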
0:00 Intro
1:29 What's a Transformer?
1:55 GPT
2:35 How GPT is trained
4:08 Attention Mechanism: The Secret Behind GPT
10:15 Scaling Transformers and GPT
11:35 GPT and AGI
*Note: I use "character" and "word" when I mean "token" for explainability.
*Correction: when I say GPT-4 has 100 trillion parameters, I mean 1 trillion. Also, the AGI part of the video is speculative and just my opinion.
References:
Andrej Karpathy's video:
• Let's build GPT: from scratch, in cod...
Note: This video is like an abridged version of the technical side of Karpathy's 2 hour long tutorial
My GPT Colab Notebook:
https://colab.research.google.com/dri...
GPT-4:
https://cdn.openai.com/papers/gpt-4.pdf
Training GPT:
https://openai.com/blog/chatgpt
MIT's explanation of RNN & Transformers:
• MIT 6.S191 (2023): Recurrent Neural N...
Attention Is All You Need:
https://arxiv.org/pdf/1706.03762.pdf
Lex Fridman Podcast clip:
• Sam Altman: OpenAI CEO on GPT-4, Chat...
GPT AGI:
https://arxiv.org/pdf/2303.12712.pdf
