Wrapping MCPs to beat context pollution | Prague

June 12, 2026 · Prague

Wrapping MCPs for Token Efficiency

Learn how to wrap MCP servers into subagents, drastically cutting token counts for tool descriptions and task output to enable dozens of active MCPs.
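
The idea can be sketched with the Python standard library alone. Everything in this example is a hypothetical illustration rather than the speaker's implementation or any library's API: the `ToolSpec` and `WrappedServer` names, the made-up tool list, the four-characters-per-token estimate, and the stubbed subagent are all assumptions. It shows why the parent agent's context shrinks when it sees one short delegate tool per server instead of every tool schema, and how the subagent's output can be capped before it flows back.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToolSpec:
    """One tool as a server might advertise it (hypothetical shape)."""
    name: str
    description: str
    input_schema: dict


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def describe(tool: ToolSpec) -> str:
    """Serialize a tool roughly as it would appear in a model's context."""
    return json.dumps({
        "name": tool.name,
        "description": tool.description,
        "input_schema": tool.input_schema,
    })


@dataclass
class WrappedServer:
    """Hides all of a server's tools behind a single delegate tool."""
    name: str
    summary: str
    tools: List[ToolSpec]
    # Stands in for a small model that sees the full tool list, performs
    # the task, and returns only a short result (hypothetical callable).
    run_subagent: Callable[[str, List[ToolSpec]], str]
    max_output_chars: int = 400

    def parent_tool(self) -> ToolSpec:
        """The only tool description the parent agent ever sees."""
        return ToolSpec(
            name=f"ask_{self.name}",
            description=self.summary,
            input_schema={
                "type": "object",
                "properties": {"task": {"type": "string"}},
                "required": ["task"],
            },
        )

    def call(self, task: str) -> str:
        """Delegate the task and cap what flows back to the parent."""
        result = self.run_subagent(task, self.tools)
        return result[: self.max_output_chars]


def fake_subagent(task: str, tools: List[ToolSpec]) -> str:
    """Stub: a real subagent would call an LLM with the full tool list."""
    return f"done: {task} (had access to {len(tools)} tools)"


# Twenty made-up tools with deliberately verbose descriptions.
tools = [
    ToolSpec(
        name=f"issue_op_{i}",
        description="Hypothetical issue-tracker operation with a long description. " * 3,
        input_schema={
            "type": "object",
            "properties": {"id": {"type": "integer"}, "body": {"type": "string"}},
        },
    )
    for i in range(20)
]
server = WrappedServer("tracker", "Delegate any issue-tracker task.", tools, fake_subagent)

direct_cost = sum(estimate_tokens(describe(t)) for t in tools)
wrapped_cost = estimate_tokens(describe(server.parent_tool()))
print(f"direct: ~{direct_cost} tokens, wrapped: ~{wrapped_cost} tokens")
print(server.call("close issue 7"))
```

In this sketch the parent pays for one description per server regardless of how many tools the server exposes, which is what would make dozens of active servers affordable; the trade-off is an extra model call inside each wrapper.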

Tech stack
  • PydanticAI
    A Python agent framework for building production-grade, type-safe Generative AI applications with validated, structured outputs.
    PydanticAI is the Python agent framework from the Pydantic team, designed to bring FastAPI's ergonomic, type-safe development experience to Generative AI. It leverages Pydantic’s core data validation to ensure Large Language Model (LLM) outputs conform strictly to defined schemas, eliminating unpredictable text responses. The framework uses 'Agents' as the primary interface, supporting model-agnostic integration (OpenAI, Anthropic, Gemini, etc.) and managing complex components like function tools and dependency injection. This structure ensures reliable, maintainable, and scalable AI workflows for production environments.
  • Redis
    Redis is the ultra-fast, open-source, in-memory data structure store: a powerful NoSQL key/value database.
    This is your go-to for low-latency data operations. Redis operates primarily in memory, delivering sub-millisecond response times for real-time applications (think: session storage, leaderboards, and caching). It functions as more than just a key/value store; it’s a versatile data structure server supporting Strings, Hashes, Lists, Sets, Sorted Sets, and JSON. Leverage its Pub/Sub capabilities for message brokering, or rely on its optional persistence for durability. Deploy it for high-speed caching to offload your primary database, or use it as a primary database for high-throughput microservices.
  • Haiku
    Haiku is a fast, open-source operating system, a community-driven continuation of the BeOS platform, specifically targeting efficient personal computing.
    Haiku, originally OpenBeOS, is a free, open-source operating system that directly succeeds the BeOS architecture; development began in 2001. The system is built for responsiveness, featuring a fully threaded design for maximum efficiency on multi-core CPUs and a custom hybrid kernel derived from NewOS. It utilizes the Be File System (BFS), which supports indexed metadata, treating the file system like a database. The entire project (kernel, drivers, toolkit, and desktop applications) is written by a single team, ensuring a unique level of consistency and a cohesive object-oriented API for accelerated C++ development.
  • MCP
    MCP is an open standard for securely connecting LLM-based AI agents to external tools, data, and enterprise workflows.
    The Model Context Protocol (MCP) functions as a standardized integration layer: think of it as a USB-C port for AI applications. Developed and open-sourced by Anthropic, this protocol allows large language models (LLMs) to access real-time context and execute actions via external tools like GitHub, Jira, or proprietary databases. It uses a simple JSON-RPC interface to define tools, schemas, and endpoints, which enables AI agents to perform complex, state-changing tasks (such as creating a GitHub issue or running a test script) rather than just generating text. MCP is essential for building agentic AI systems that can autonomously pursue goals and operate within defined safety and permission boundaries.
  • LLM
    Large Language Models (LLMs) are deep learning models, built on the Transformer architecture, that process and generate human-quality text and code at scale.
    LLMs are a class of foundation models: massive, pre-trained neural networks (often with billions to trillions of parameters) that leverage the self-attention mechanism of the Transformer architecture (introduced in 2017) to predict the next token in a sequence. Trained on vast datasets (e.g., Common Crawl's 50 billion+ web pages), these models—like GPT-4, Gemini, and Claude—acquire predictive power over syntax and semantics. They function as general-purpose sequence models, enabling critical applications such as complex content generation, language translation, and automated code completion (e.g., GitHub Copilot). Their core value: generalizing across diverse tasks with minimal task-specific fine-tuning.
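
The MCP entry above mentions a JSON-RPC interface for tools. As a rough sketch of what that looks like on the wire, the example below builds JSON-RPC 2.0 messages for the protocol's `tools/list` and `tools/call` methods and reads the text parts out of a result. The tool name `create_issue`, its arguments, and the sample response are hypothetical; no server is contacted, and transport details (stdio or HTTP) are left out.

```python
import json
from itertools import count
from typing import Optional

# Each JSON-RPC request carries a unique id so responses can be matched.
_ids = count(1)


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def extract_text(response: dict) -> str:
    """Join the text parts of a tools/call result, or raise on an error."""
    if "error" in response:
        raise RuntimeError(response["error"]["message"])
    parts = response["result"].get("content", [])
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")


# Ask the server which tools it offers.
list_req = jsonrpc_request("tools/list")

# Invoke one tool by name (hypothetical tool and arguments).
call_req = jsonrpc_request(
    "tools/call",
    {
        "name": "create_issue",
        "arguments": {"title": "Flaky test", "repo": "example/repo"},
    },
)

# A made-up response of the shape a server might send back.
sample_response = {
    "jsonrpc": "2.0",
    "id": call_req["id"],
    "result": {
        "content": [{"type": "text", "text": "Created issue #1"}],
        "isError": False,
    },
}

print(json.dumps(call_req, indent=2))
print(extract_text(sample_response))
```

The response to `tools/list` is where the per-tool names, descriptions, and input schemas come from, which is the payload that grows with every additional server an agent connects to.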