The Definitive Guide to Local Ollama Context
Author: Kryptstar Mack
Uploaded: 2025-09-16
Views: 156
In this video, I'll show you how to dramatically increase the context window of your local Ollama models! By default, Ollama caps the context at 2,048 tokens (around 1,600 words), but many LLMs support up to 128K tokens (roughly 96,000 words, or ~200 pages!). I'll walk you through the steps to modify Qwen3 8B to use a full 128K context window, demonstrated on an RTX 5090 (though this works with other GPUs too!).
We'll cover:
Exporting the model's Modelfile
Editing the parameters (context length, GPU layers)
Creating a new model from the modified file (a sketch of the full workflow follows this list)
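Here's a minimal sketch of that workflow, assuming Qwen3 8B is already pulled (the qwen3-128k output name is just an example; substitute your own model and name):

# 1. Export the current Modelfile
ollama show qwen3:8b --modelfile > Modelfile

# 2. Edit Modelfile and add these parameter lines:
PARAMETER num_ctx 131072
PARAMETER num_gpu 99

# 3. Build a new model from the edited Modelfile
ollama create qwen3-128k -f Modelfile

num_ctx 131072 is the 128K-token context window; num_gpu sets how many layers are offloaded to the GPU (lower it if you run out of VRAM).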
This allows you to process much larger documents, codebases, or conversations, unlocking the full potential of your local LLMs. I'll also show you how I use AI to manage and summarize my notes!
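Once the new model is built, you can run it and sanity-check the settings (recent Ollama versions list the parameters via ollama show):

ollama run qwen3-128k
ollama show qwen3-128k

The show output should include num_ctx 131072 under the model's parameters.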
#Ollama #LLM #LocalAI #AI #ContextWindow #MachineLearning #AIModel #Qwen3 #Gemma #ContextLength #AIHacks #DIYAI #OpenSourceAI #AIForEveryone #AICommunity #CyberSecurity #AItools