
Superintelligent Agents Pose Catastrophic Risks — ... | Richard M. Karp Distinguished Lecture

Author: Simons Institute for the Theory of Computing

Uploaded: 2025-04-17

Views: 8860

Description:

Yoshua Bengio (IVADO - Mila - Université de Montréal)
https://simons.berkeley.edu/talks/yos...
Safety-Guaranteed LLMs

The leading AI companies are increasingly focused on building generalist AI agents — systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. In this talk, Yoshua Bengio will discuss how these risks arise from current AI training methods.

Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, Bengio and his colleagues see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, they propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which they call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions.
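The two-part structure described above can be illustrated with a toy sketch. This is not Bengio's implementation; it is a minimal conceptual analogy in which the "world model" weighs candidate theories by how well they explain observed data, and the "inference machine" answers a question by averaging over those theories, so the answer carries explicit uncertainty rather than a single overconfident prediction. The coin-flip domain, the candidate theories, and all names here are hypothetical.

```python
# Hypothetical toy domain: observations are coin flips (1 = heads);
# "theories" are candidate heads-probabilities under a uniform prior.
theories = [0.1, 0.3, 0.5, 0.7, 0.9]
observations = [1, 1, 0, 1, 1]  # toy data, four heads and one tail

def likelihood(theory, data):
    """P(data | theory) for independent Bernoulli observations."""
    p = 1.0
    for x in data:
        p *= theory if x == 1 else (1.0 - theory)
    return p

# "World model": posterior weight over theories given the data.
weights = [likelihood(t, observations) for t in theories]
total = sum(weights)
posterior = [w / total for w in weights]

# "Inference machine": answer "will the next flip be heads?" by
# marginalizing over theories, so residual uncertainty about which
# theory is correct is reflected in the reported probability.
p_heads = sum(p * t for p, t in zip(posterior, theories))
print(f"P(next flip is heads) = {p_heads:.3f}")
```

The key property mirrored here is that the system only explains data and answers questions probabilistically; nothing in it selects or executes actions in the world.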

In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, this system could be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. Bengio and his colleagues hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.

Yoshua Bengio is a full professor in the Department of Computer Science and Operations Research at Université de Montréal, as well as the founder and scientific director of Mila and the scientific director of IVADO. He also holds a Canada CIFAR AI chair. Considered one of the world’s leaders in artificial intelligence and deep learning, he is the recipient of the 2018 A.M. Turing Award, often called the “Nobel Prize of computing.”

He is a fellow of both the U.K.’s Royal Society and the Royal Society of Canada, an officer of the Order of Canada, a knight of the Legion of Honor of France, and a member of the U.N.’s Scientific Advisory Board for Independent Advice on Breakthroughs in Science and Technology.


Related videos

Yoshua Bengio - AI Catastrophic Risks & Scientist AI Solution [Alignment Workshop]

Future Directions In AI Safety Research

Why particles might not exist | Sabine Hossenfelder, Hilary Lawson, Tim Maudlin

Register Tiling for Unstructured Sparsity in Neural Network Inference

AI at a Defining Moment: Ensuring Safety Through Technical & Societal Safeguards with Yoshua Bengio

Formal Reasoning Meets LLMs: Toward AI for Mathematics and Verification

Why String Theory Is Not Real Physics | Roger Penrose, Brian Greene, and Eric Weinstein

Andrej Karpathy: Software Is Changing (Again)

Richard Sutton: The OaK Architecture – A Vision of Superintelligence from Experience | AGI-25

The Parallel Batch-Dynamic Model with Asynchronous Reads

Yann LeCun | Self-Supervised Learning, JEPA, World Models, and the future of AI

Richard Sutton – Father of RL thinks LLMs are a dead end

Roger Penrose – Why Intelligence Is Not a Computational Process: Breakthrough Discuss 2025

The Catastrophic Risks of AI — and a Safer Path | Yoshua Bengio | TED

Josh Tenenbaum - Scaling Intelligence the Human Way - IPAM at UCLA

The Era of Experience & The Age of Design: Richard S. Sutton, Upper Bound 2025

MIGHT THE ROBOTS TAKE OVER? [Prof. Yoshua Bengio]

Stanford CS230 | Autumn 2025 | Lecture 1: Introduction to Deep Learning

“We will change so much that we will be a different variant of Homo sapiens” — psychologist Alexander Asmolov

Sir Demis Hassabis on The Future of Knowledge | Institute for Advanced Study
