Efficiently Modeling Long Sequences with Structured State Spaces - Albert Gu | Stanford MLSys #46

Author: Stanford MLSys Seminars

Uploaded: 2021-11-11

Views: 21,788

Description:

Episode 46 of the Stanford MLSys Seminar Series!

Efficiently Modeling Long Sequences with Structured State Spaces
Speaker: Albert Gu

Abstract:
A central goal of sequence modeling in machine learning is designing a single principled model that can address sequence data across a range of modalities and tasks, particularly on long-range dependencies. Although conventional models including RNNs, CNNs, and Transformers have specialized variants for capturing long dependencies, they still struggle to scale to very long sequences of 10000 or more steps. We introduce a simple sequence model based on the fundamental state space representation $x'(t) = Ax(t) + Bu(t), y(t) = Cx(t) + Du(t)$ and show that it combines the strengths of several model families. Furthermore, we show that the HiPPO theory of continuous-time memorization can be incorporated into the state matrix $A$, producing a class of structured models that handles long-range dependencies mathematically and can be computed very efficiently. The Structured State Space (S3) model achieves strong empirical results across a diverse range of established benchmarks, including (i) 91% accuracy on sequential CIFAR-10 with no data augmentation or auxiliary losses, on par with a larger 2-D ResNet, (ii) substantially closing the gap to Transformers on image and language modeling tasks, while performing generation 60X faster, (iii) SotA on every task from the Long Range Arena benchmark, including solving the challenging Path-X task of length 16k that all prior work fails on, while being as efficient as all competitors.
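The state space representation in the abstract can be made concrete with a toy recurrence. The sketch below is not the S4 implementation (it omits the HiPPO-structured $A$ and the efficient convolutional computation the talk describes); it only illustrates the underlying model: discretize the continuous system $x'(t) = Ax(t) + Bu(t)$, $y(t) = Cx(t) + Du(t)$ with a bilinear transform, then unroll it as a linear recurrence over a sequence. The matrices and step size here are arbitrary illustrative values.

```python
import numpy as np

def discretize(A, B, dt):
    """Bilinear (Tustin) discretization of x'(t) = A x(t) + B u(t)."""
    n = A.shape[0]
    I = np.eye(n)
    inv = np.linalg.inv(I - (dt / 2) * A)
    Ad = inv @ (I + (dt / 2) * A)   # discrete state matrix
    Bd = inv @ (dt * B)             # discrete input matrix
    return Ad, Bd

def ssm_scan(Ad, Bd, C, D, u):
    """Run the recurrence x_k = Ad x_{k-1} + Bd u_k, y_k = C x_k + D u_k."""
    x = np.zeros(Ad.shape[0])
    ys = []
    for uk in u:
        x = Ad @ x + (Bd * uk).ravel()
        ys.append(float(C @ x + D * uk))
    return np.array(ys)

# Toy example: a 2-state SSM filtering a random input sequence.
rng = np.random.default_rng(0)
A = np.array([[-1.0, 1.0], [0.0, -2.0]])  # in S4 this would be HiPPO-structured
B = np.array([[1.0], [1.0]])
C = np.array([1.0, 0.0])
D = 0.0
Ad, Bd = discretize(A, B, dt=0.1)
y = ssm_scan(Ad, Bd, C, D, rng.standard_normal(100))
print(y.shape)  # (100,)
```

The point of the structured parameterization discussed in the talk is precisely to avoid this step-by-step scan at training time: the same linear recurrence can be computed as a long convolution.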

Bio:
Albert Gu is a PhD student in the Stanford CS department, advised by Chris Ré. His research interests include algorithms for structured linear algebra and theoretical principles of deep sequence models.

--

0:00 Presentation
26:53 Discussion

Stanford MLSys Seminar hosts: Dan Fu, Karan Goel, Fiodar Kazhamiaka, and Piero Molino
Executive Producers: Matei Zaharia, Chris Ré

Twitter:
  /realdanfu
  /krandiash
  /w4nderlus7

Intro music:
"Heading Home" by Nekzlo (@nekzlo)
Music provided by Free Music for Vlogs: Nekzlo - Heading Home

--

Check out our website for the schedule: http://mlsys.stanford.edu
Join our mailing list to get weekly updates: https://groups.google.com/forum/#!for...

#machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford #longsequences #hippo #s4 #structuredstatespaces


Related videos:

Data Science for Infrastructure w Pixie CEO Zain Asgar | Stanford MLSys #47
Hardware-aware Algorithms for Sequence Modeling - Tri Dao | Stanford MLSys #87
Simple Steps for Finding the Volume of a Cylinder Using Triple Integration and Spherical...
Notes on AI Hardware - Benjamin Spector | Stanford MLSys #88
Causal Mechanistic Interpretability (Stanford lecture 1) - Atticus Geiger
Statistical Mechanics Lecture 1
Ferran Breaks the Code, and Yamal Turns Off the Lights! Will They Ever Lose Again? | Highlights
Monarch Mixer: Making Foundation Models More Efficient - Dan Fu | Stanford MLSys #86
Stanford AI Club: Jeff Dean on Important AI Trends
Text2SQL: The Dream versus Reality - Laurel Orr | Stanford MLSys #89
Lecture 1 | String Theory and M-Theory
How Fine-Tuning Open-Source LLMs Solves the Problem of GenAI Adoption in...
Poetiq - Ian Fischer (CEO) | Stanford Hidden Layer Podcast #104
The Next 100x - Gavin Uberti | Stanford MLSys #92
EVO: DNA Foundation Models - Eric Nguyen | Stanford MLSys #96
Stanford CME295 Transformers & LLMs | Autumn 2025 | Lecture 1 - Transformer
Large Language Models for Program Optimization - Osbert Bastani | Stanford MLSys #91
Stanford CS230 | Autumn 2025 | Lecture 9: Career Advice in AI
Lecture 1 | The Theoretical Minimum
Scaling Up "Vibe Checks" for LLMs - Shreya Shankar | Stanford MLSys #97
