Structured Output from LLMs: Grammars, Regex, and State Machines

Author: Efficient NLP

Uploaded: 2024-12-05

Views: 7378

Description:

Try Voice Writer - speak your thoughts and let AI handle the grammar: https://voicewriter.io

Structured outputs are essential for applications that integrate LLMs to make decisions in downstream tasks. In this video, I explain how structured output generation works: a practically important topic and an area of active research.

First, we look at the OpenAI API's ability to produce structured outputs from schemas defined with libraries like Pydantic or Zod. For open-source alternatives, I cover the Outlines library, which operates using state machines and regex under the hood.
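To make the "state machines and regex" idea concrete, here is a minimal sketch of regex-constrained decoding. It is not the Outlines implementation; it only follows the general indexing idea described in the Willard and Louf paper listed in the references. The regex, the hand-built DFA, the toy vocabulary and the fake logits are all assumptions made up for illustration.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# Hand-built DFA for the regex  -?[0-9]+  (optional minus sign, then digits).
# States: 0 = start, 1 = after "-", 2 = inside the digits (accepting).
DIGITS = "0123456789"
TRANSITIONS: Dict[int, Dict[str, int]] = {
    0: {"-": 1, **{d: 2 for d in DIGITS}},
    1: {d: 2 for d in DIGITS},
    2: {d: 2 for d in DIGITS},
}
ACCEPTING: Set[int] = {2}


def walk(state: int, token: str) -> Optional[int]:
    """Feed a whole token through the DFA one character at a time.

    Returns the state reached, or None if any character has no transition.
    """
    for ch in token:
        nxt = TRANSITIONS.get(state, {}).get(ch)
        if nxt is None:
            return None
        state = nxt
    return state


def build_index(vocab: Sequence[str]) -> Dict[int, Dict[int, int]]:
    """Precompute, for every DFA state, which token ids are allowed
    and which state each allowed token leads to."""
    index: Dict[int, Dict[int, int]] = {}
    for state in TRANSITIONS:
        allowed: Dict[int, int] = {}
        for tok_id, tok in enumerate(vocab):
            end = walk(state, tok)
            if end is not None:
                allowed[tok_id] = end
        index[state] = allowed
    return index


def constrained_greedy(
    logits_per_step: Sequence[Sequence[float]],
    vocab: Sequence[str],
    index: Dict[int, Dict[int, int]],
) -> Tuple[str, bool]:
    """Greedy decoding where disallowed tokens are masked out at each step."""
    state = 0
    pieces: List[str] = []
    for logits in logits_per_step:
        allowed = index[state]
        if not allowed:
            break
        best = max(allowed, key=lambda i: logits[i])  # argmax over allowed ids only
        pieces.append(vocab[best])
        state = allowed[best]
    return "".join(pieces), state in ACCEPTING


# Toy vocabulary (hypothetical); real tokenizers have tens of thousands of entries.
VOCAB = ["-", "1", "42", "-7", "abc", " ", "3x"]
INDEX = build_index(VOCAB)

# Fake logits standing in for model output; the unconstrained argmax at
# step 1 would be "abc", which the regex forbids.
FAKE_LOGITS = [
    [1.0, 0.2, 0.5, 0.1, 3.0, 0.0, 2.0],
    [5.0, 0.3, 0.9, 4.0, 0.0, 0.0, 0.0],
]
RESULT = constrained_greedy(FAKE_LOGITS, VOCAB, INDEX)
print(RESULT)  # ('-42', True)
```

Because a regex compiles to a finite number of states, the per-state token table can be built once up front, so each decoding step reduces to a lookup plus a masked argmax (or masked sampling).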

However, in many cases, we need to generate outputs according to a context-free grammar (CFG), which introduces the need for pushdown automata. I then cover the mismatch between grammar terminals and LLM tokens, why this mismatch is a problem, and some creative solutions from recent research papers.
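The following sketch illustrates why a stack is needed and how a single LLM token can cover more than one grammar terminal. It is a simplified illustration, not the method of any of the referenced papers: the grammar is just balanced round and square brackets, every terminal is a single character, and the vocabulary is hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Grammar (assumed for illustration): S -> "(" S ")" S | "[" S "]" S | empty
PAIRS = {"(": ")", "[": "]"}
CLOSERS = set(PAIRS.values())

Stack = Tuple[str, ...]  # closers still expected, innermost last


def step(stack: Stack, token: str) -> Optional[Stack]:
    """Advance the pushdown automaton over every character of one token.

    Returns the new stack, or None if the token would make the prefix invalid.
    """
    s = list(stack)
    for ch in token:
        if ch in PAIRS:
            s.append(PAIRS[ch])          # push the closer we now expect
        elif ch in CLOSERS:
            if not s or s.pop() != ch:   # wrong closer, or nothing open
                return None
        else:
            return None                  # character outside the grammar
    return tuple(s)


def allowed_tokens(stack: Stack, vocab: Sequence[str]) -> List[str]:
    """Tokens that keep the current prefix extendable to a valid string."""
    return [tok for tok in vocab if step(stack, tok) is not None]


# Hypothetical vocabulary: some tokens are one terminal, some span two.
VOCAB = ["(", ")", "[", "]", "()", "])", "((", ")]", "x"]

STACK_AFTER_PREFIX = step((), "([")   # prefix "([" leaves ")" and "]" expected
ALLOWED = allowed_tokens(STACK_AFTER_PREFIX, VOCAB)
print(ALLOWED)  # ['(', '[', ']', '()', '])', '((']
```

The token "])" is accepted even though it closes two different brackets, while ")]" is rejected; deciding this requires checking the token against the stack contents rather than a fixed state number. Since the stack can grow without bound, the allowed-token table cannot simply be precomputed per state as in the finite-state case, and real grammars add the further complication of multi-character terminals that a token may only partially cover.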

0:00 - Introduction
1:06 - OpenAI API example
3:02 - Outlines library example
4:07 - Pydantic to regex conversion
4:57 - Finite state machines and regex
5:58 - Regex matching with LLMs
8:41 - Context free grammars
9:40 - Incremental parsing of CFGs
11:22 - Pushdown automata
12:18 - Token-terminal mismatch problem
14:26 - Vocabulary-aligned subgrammars
15:12 - State machine composition
16:06 - Format restriction and LLM performance

OpenAI Structured Outputs API: https://platform.openai.com/docs/guid...

Outlines library: https://github.com/dottxt-ai/outlines

References

Willard, Brandon T., and Rémi Louf. "Efficient guided generation for large language models." arXiv preprint arXiv:2307.09702 (2023). https://arxiv.org/abs/2307.09702

Geng, Saibo, et al. "Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning." EMNLP 2023. https://arxiv.org/abs/2305.13971

Beurer-Kellner, Luca, Marc Fischer, and Martin Vechev. "Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation." ICML 2024. https://arxiv.org/abs/2403.06988

Koo, Terry, Frederick Liu, and Luheng He. "Automata-based constraints for language model decoding." COLM 2024. https://arxiv.org/abs/2407.08103

Tam, Zhi Rui, et al. "Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models." EMNLP 2024. https://arxiv.org/abs/2408.02442
