Context Engineering
Context Window Budget
Every model has a finite context window measured in tokens, and treating that space as a budget is essential for effective agent design. Tokens must be allocated across system instructions, conversation history, retrieved context, tool definitions, and the model's own reasoning and output. Exceeding the window causes silent truncation or errors, and even within the limit, spending tokens on irrelevant information degrades performance. Research on the "lost in the middle" problem shows that models attend disproportionately to information at the beginning and end of the context, making strategic placement of critical information as important as total quantity.
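The budgeting idea above can be sketched in a few lines. This is a minimal, illustrative sketch: the allocation numbers are invented, and it uses a rough 4-characters-per-token heuristic rather than a real tokenizer (production code should count tokens with the model's actual tokenizer, e.g. tiktoken, as the resources below recommend).

```python
# Rough token estimator: ~4 characters per token for English text.
# Real systems should use the model's own tokenizer instead.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def fit_history(messages: list[str], budget_tokens: int):
    """Drop the oldest messages until the history fits the budget.

    Keeping the most recent messages reflects the "lost in the middle"
    finding: recent context at the end of the window is attended to
    more reliably than material buried in the middle.
    """
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest first
        cost = estimate_tokens(msg)
        if used + cost > budget_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept)), used       # restore chronological order

# Carve a hypothetical 128k window into fixed allocations
# (the split itself is an assumption, not a recommendation).
WINDOW = 128_000
budget = {
    "system": 2_000,       # system instructions
    "tools": 4_000,        # tool definitions
    "retrieval": 30_000,   # retrieved context
    "output": 8_000,       # reserved for reasoning and output
}
history_budget = WINDOW - sum(budget.values())  # remainder for history
```

In practice the interesting design choice is which bucket absorbs overflow; trimming history from the oldest end, as here, is the simplest policy.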
Resources
Anthropic: Context Window Sizes (docs.anthropic.com). Current context window sizes for all Claude models.
Lost in the Middle (arxiv.org). Research showing how models struggle with information placed in the middle of long contexts.
OpenAI: Managing Tokens (platform.openai.com). Practical advice for counting and managing tokens in production.
Counting Tokens with Tiktoken (cookbook.openai.com). OpenAI Cookbook guide to counting tokens accurately before making API calls.
Anthropic: Prompt Caching (docs.anthropic.com). How prompt caching can reduce repeated context costs by up to 90%.