Neural Networks Are Elastic Origami! [Prof. Randall Balestriero]

Author: Machine Learning Street Talk

Uploaded: 2025-02-08

Views: 14944

Description:

Professor Randall Balestriero joins us to discuss neural network geometry, spline theory, and emerging phenomena in deep learning, based on research presented at ICML. Topics include the delayed emergence of adversarial robustness in neural networks ("grokking"), geometric interpretations of neural networks via spline theory, and challenges in reconstruction learning. We also cover geometric analysis of Large Language Models (LLMs) for toxicity detection and the relationship between intrinsic dimensionality and model control in RLHF.

SPONSOR MESSAGES:
***
CentML offers competitive pricing for GenAI model deployment, with flexible options to suit a wide range of models, from small to large-scale deployments.
https://centml.ai/pricing/

Tufa AI Labs is a brand-new research lab in Zurich, started by Benjamin Crouzier, focused on o-series-style reasoning and AGI. Are you interested in working on reasoning or getting involved in their events?

Go to https://tufalabs.ai/
***

Show notes and transcript: https://www.dropbox.com/scl/fi/3lufge...

TOC:
[00:00:00] Introduction

1. Neural Network Geometry and Spline Theory
[00:01:41] 1.1 Neural Network Geometry and Spline Theory
[00:07:41] 1.2 Deep Networks Always Grok
[00:11:39] 1.3 Grokking and Adversarial Robustness
[00:16:09] 1.4 Double Descent and Catastrophic Forgetting

2. Reconstruction Learning
[00:18:49] 2.1 Reconstruction Learning
[00:24:15] 2.2 Frequency Bias in Neural Networks

3. Geometric Analysis of Neural Networks
[00:29:02] 3.1 Geometric Analysis of Neural Networks
[00:34:41] 3.2 Adversarial Examples and Region Concentration

4. LLM Safety and Geometric Analysis
[00:40:05] 4.1 LLM Safety and Geometric Analysis
[00:46:11] 4.2 Toxicity Detection in LLMs
[00:52:24] 4.3 Intrinsic Dimensionality and Model Control
[00:58:07] 4.4 RLHF and High-Dimensional Spaces

5. Conclusion
[01:02:13] 5.1 Neural Tangent Kernel
[01:08:07] 5.2 Conclusion

REFS:
[00:01:35] Balestriero/Humayun – Deep network geometry & input space partitioning
https://arxiv.org/html/2408.04809v1

[00:03:55] Balestriero & Baraniuk – Linking deep networks to adaptive spline operators
https://proceedings.mlr.press/v80/bal...

[00:13:55] Song et al. – Gradient-based white-box adversarial attacks
https://arxiv.org/abs/2012.14965

[00:16:05] Humayun, Balestriero & Baraniuk – Grokking phenomenon & emergent robustness
https://arxiv.org/abs/2402.15555

[00:18:25] Humayun – Training dynamics & double descent via linear region evolution
https://arxiv.org/abs/2310.12977

[00:20:15] Balestriero – Power diagram partitions in DNN decision boundaries
https://arxiv.org/abs/1905.08443

[00:23:00] Frankle & Carbin – Lottery Ticket Hypothesis for network pruning
https://arxiv.org/abs/1803.03635

[00:24:00] Belkin et al. – Double descent phenomenon in modern ML
https://arxiv.org/abs/1812.11118

[00:25:55] Balestriero et al. – Batch normalization’s regularization effects
https://arxiv.org/pdf/2209.14778

[00:29:35] EU – EU AI Act 2024 with compute restrictions
https://www.lw.com/admin/upload/SiteA...

[00:39:30] Humayun, Balestriero & Baraniuk – SplineCam: Visualizing deep network geometry
https://openaccess.thecvf.com/content...

[00:40:40] Carlini – Trade-offs between adversarial robustness and accuracy
https://arxiv.org/abs/1902.06705

[00:44:55] Balestriero & LeCun – Limitations of reconstruction-based learning methods
https://raw.githubusercontent.com/mlr...

[00:47:20] Balestriero & LeCun – Spectral analysis of neural network learning
https://proceedings.neurips.cc/paper_...

[00:49:45] He et al. – MAE: Masked Autoencoders for self-supervised learning
https://arxiv.org/abs/2111.06377

[00:54:50] Balestriero et al. – Geometric analysis of LLM layers for toxicity detection
https://arxiv.org/abs/2309.12312

[00:59:35] Balestriero et al. – Superior toxicity detection via geometric features
https://arxiv.org/html/2312.01648v2

[01:04:45] UofT ML – Self-attention control & context length effects
https://arxiv.org/abs/2310.04444

[01:11:55] Roberts – Foundations of deep learning theory
https://arxiv.org/abs/2106.10165

[01:15:40] Balestriero & Cha – Kolmogorov GAM Networks via spline partition theory
https://arxiv.org/pdf/2501.00704

[01:16:40] Various – Graph Kolmogorov-Arnold Networks (GKAN) extension
https://www.nature.com/articles/s4159...
