Building LLM Attribution Classifier: ChatGPT/Claude/Gemini (GPU Setup, LLM2Vec, LoRA)
Author: syed amaan
Uploaded: 2025-12-13
Views: 69
In this video I walk through implementing an LLM attribution classifier that can detect which AI model (ChatGPT, Claude, Gemini, etc.) generated a piece of text. The approach is based on the recent "Idiosyncrasies in Large Language Models" paper and involves converting Llama 3 into an encoder using LLM2Vec, then training a classifier head with LoRA.
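At a high level, the classifier boils down to this kind of pipeline (a minimal sketch, assuming the public llm2vec package and the McGill-NLP Llama 3 MNTP checkpoint; the model id, label set, and hidden size here are illustrative, not the exact code from the video):

    # Sketch: LLM2Vec turns Llama 3 into a text encoder; a linear head maps the
    # pooled embedding to per-model probabilities. Model id and labels are assumptions.
    import torch
    from llm2vec import LLM2Vec

    LABELS = ["ChatGPT", "Claude", "Gemini", "Grok", "DeepSeek"]   # example label set

    encoder = LLM2Vec.from_pretrained(
        "McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp",        # bidirectional Llama 3 + MNTP weights
        device_map="cuda",
        torch_dtype=torch.bfloat16,
    )
    head = torch.nn.Linear(4096, len(LABELS))                      # 4096 = Llama 3 8B hidden size

    def classify(text: str) -> dict:
        emb = encoder.encode([text])                               # (1, 4096) mean-pooled embedding
        probs = torch.softmax(head(emb.float()), dim=-1)[0]
        return {label: float(p) for label, p in zip(LABELS, probs)}

In the video the head is actually trained (and the adapters merged), so the real thing loads trained weights rather than a fresh Linear layer.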
I start from scratch by setting up a GPU instance on RunPod, go through the theory of how LLM2Vec converts decoder-only models into encoders via bidirectional attention and masked next-token prediction, implement the training loop with LoRA for memory efficiency, and debug the issues that come up with multi-GPU setups and quantization. I show both a fast implementation that uses pre-merged model weights and a low-bandwidth version that assembles everything from the base components.
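For the memory-efficiency part, the general shape is 4-bit loading plus LoRA adapters via Hugging Face peft and bitsandbytes. This is a sketch under assumed hyperparameters and model id, not the exact config used in the video:

    import torch
    from transformers import AutoModel, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # Load the base Llama in 4-bit so it fits comfortably in GPU memory.
    bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4",
                             bnb_4bit_compute_dtype=torch.bfloat16)
    base = AutoModel.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct",
                                     quantization_config=bnb, device_map="auto")

    # Attach LoRA adapters to the attention projections; only these small matrices train.
    base = prepare_model_for_kbit_training(base)
    lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                      target_modules=["q_proj", "k_proj", "v_proj", "o_proj"])
    model = get_peft_model(base, lora)
    model.print_trainable_parameters()   # typically well under 1% of the full parameter count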
The video is long and fairly detailed. I tried to make it useful for anyone wanting to understand model attribution at a deeper level or reproduce this kind of work themselves. Hopefully the timestamps below help you jump to relevant sections.
Timestamps:
0:00:00 - intro
0:01:00 - demo of the end product we are going to build in this video
0:02:02 - idiosyncrasies in large language models research paper and repo
0:05:00 - kinds of idiosyncratic patterns, referencing a blog post on syedamaan.com
0:08:50 - original inference repo at https://github.com/syedamaann/llm-idi...
0:09:50 - getting gpu instances from runpod, lambda labs
0:10:13 - gpu setup on runpod, crash course on gpu architectures (ampere, hopper, blackwell)
0:14:36 - ssh into instance + inspect using htop, nvidia-smi, nvtop + ssh with cursor/vscode
0:20:45 - install dependencies and a word of caution to avoid dependency issues
0:26:00 - using persistent volume mounts (at /workspace) with runpod instances
0:27:58 - inspecting yida/classifier_chat on huggingface
0:30:24 - inspecting bf16 vs fp32 in model architecture with low rank adaptation adapters
0:34:51 - [important segment] building the main classifier, i.e. load_classifier()
0:51:00 - quick run through of what's done so far, concise recap 1
0:55:25 - instance hung! diagnosing and fixing (culprit: memory issues and cache)
1:10:02 - diving deep into LLM2Vec (bidirectional attention, masked next token prediction, contrastive learning)
1:32:36 - plugging my blog on syedamaan.com to help build intuition, hah!
1:33:16 - continuing the LLM2Vec deep dive (connecting the dots all the way to "mean pooling")
1:37:13 - walking through the complete classification architecture, all the way to attaching the linear classification head and getting softmax probabilities (sketched below, after the timestamps)
1:41:10 - low-rank adaptation (LoRA) adapters, with the research paper and some math intuition building (also sketched below)
1:47:32 - live inspecting a llama model with LoRA adapters inside attention block
1:50:10 - summarising LoRA and sharing some thoughts on wrestling with ideas/concepts
1:52:50 - quick run through of what's done so far, concise recap 2
1:54:45 - building the inference function, i.e. predict_text() (a rough sketch follows the timestamps)
1:59:17 - gradio UI
2:00:35 - testing the final product, yay!
2:03:18 - short thoughts on the implications, the evolving nature of LLM idiosyncrasies, and future needs
2:06:13 - we are not done yet, time to implement this the hard way in low-bandwidth mode
2:20:40 - outro :)
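For the classification-architecture segments (1:33:16 and 1:37:13), here is the mean pooling + linear head + softmax step in plain PyTorch; hidden size, sequence length, and label count are illustrative:

    import torch

    hidden = torch.randn(1, 12, 4096)            # (batch, tokens, hidden) from the bidirectional encoder
    mask = torch.ones(1, 12)                     # attention mask, 1 = real token, 0 = padding

    # Mean pooling: average the token embeddings, ignoring padding
    pooled = (hidden * mask.unsqueeze(-1)).sum(1) / mask.sum(1, keepdim=True)   # (1, 4096)

    head = torch.nn.Linear(4096, 5)              # one logit per candidate LLM
    probs = torch.softmax(head(pooled), dim=-1)  # probabilities over ChatGPT/Claude/Gemini/...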
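For the LoRA segment (1:41:10), the low-rank update in a few lines: the pretrained weight W stays frozen and only the small A and B matrices are trained (dimensions are illustrative):

    import torch

    d, r, alpha = 4096, 16, 32                   # hidden size, adapter rank, scaling numerator
    W = torch.randn(d, d)                        # frozen pretrained weight (e.g. an attention projection)
    A = torch.randn(r, d) * 0.01                 # trainable, r x d
    B = torch.zeros(d, r)                        # trainable, d x r (zero init, so the update starts at 0)

    x = torch.randn(1, d)
    y = x @ (W + (alpha / r) * (B @ A)).T        # effective weight W' = W + (alpha/r) * B @ A
    # trainable params: 2*d*r = 131k vs d*d ~ 16.8M for the full matrix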
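For predict_text() and the gradio UI (1:54:45 and 1:59:17), a minimal gradio sketch with a stubbed-out classifier; the real predict_text() runs the LLM2Vec encoder plus the trained head instead of returning fixed numbers:

    import gradio as gr

    def predict_text(text: str) -> dict:
        # Stand-in for the real classifier call (encoder + linear head + softmax).
        return {"ChatGPT": 0.7, "Claude": 0.2, "Gemini": 0.1}

    demo = gr.Interface(fn=predict_text,
                        inputs=gr.Textbox(lines=8, label="Paste AI-generated text"),
                        outputs=gr.Label(num_top_classes=3))
    demo.launch()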
Links:
Research paper: https://eric-mingjie.github.io/llm-id...
arXiv: https://arxiv.org/abs/2502.12150
My blog posts on this:
https://www.syedamaan.com/writing/llm...
https://www.syedamaan.com/writing/llm...
https://www.syedamaan.com/writing/llm...
Code: https://github.com/syedamaann/llm-idi...
feel free to text me at https://x.com/syedamaann