What are AI guardrails? How do they work?
Author: Probably Private
Uploaded: 2025-07-28
Views: 676
In this video, you'll investigate different approaches to AI guardrails and examine whether they address privacy problems in machine learning/AI. I'm curious to hear what you liked and learned, and how you approach threat analysis/modeling and testing in your AI products/systems. Let me know in the comments!
To learn more, feel free to check out the articles and series on memorization in AI models.
Software-based guardrails: https://blog.kjamistan.com/blocking-a...
External algorithmic and internal alignment (i.e. training) guardrails:
Read all articles in the series: https://blog.kjamistan.com/a-deep-div...
And some citations from the video:
Zhang's presentation on avoiding software-based guardrails: Quantifying and Understanding Memorization...
Purple Llama: https://github.com/meta-llama/PurpleL...
NeMo Guardrails: https://github.com/NVIDIA/NeMo-Guardr...