Build an Agentic AI System in Python with the Claude Agent SDK

What Is the Claude Agent SDK?

Last year we built an AI agent from scratch using raw API calls and a while loop. That’s still the right approach if you want to understand how agents work. But if you want to ship one? There’s a faster path now.

The Claude Agent SDK is Anthropic’s official Python library for building autonomous AI agents. It gives you a full-featured agent out of the box: tool calling, file operations, bash commands, web search, session management, and MCP server integration. One function call, and you have an agent that can read files, write code, run shell commands, and browse the web.

The SDK was previously called “Claude Code SDK” and was renamed to Claude Agent SDK in late 2025. If you see old imports like claude_code_sdk in blog posts, that’s the same thing.

Install and Setup

Install from PyPI:

pip install claude-agent-sdk

Requires Python 3.10+. The package bundles the Claude Code CLI automatically, so you don’t need to install anything else.

Set your API key:

export ANTHROPIC_API_KEY=your-api-key-here

That’s it. No config files, no YAML, no twelve-step setup ritual.

Your First Agent: query() in 6 Lines

The simplest way to run an agent is the query() function. It creates a new session, sends your prompt, and streams back messages as the agent works:

import anyio
from claude_agent_sdk import query

async def main():
    async for message in query(prompt="What files are in the current directory?"):
        print(message)

anyio.run(main)

That’s a working agent. It will look at your filesystem, list the files, and respond. No tool definitions. No JSON schemas. No while loop. The SDK handles the entire agentic loop internally: it sends the prompt to Claude, Claude decides which tools to call, the SDK executes them, feeds results back, and iterates until there’s a final answer.

Compare that to writing your own tool-calling loop (which takes ~80 lines minimum). The tradeoff is clear: you lose visibility into the loop, but you gain a production-grade agent in a handful of lines.

Configuring Agent Behavior with ClaudeAgentOptions

The query() function takes an optional ClaudeAgentOptions object that controls what the agent can do, where it runs, and how it behaves:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    options = ClaudeAgentOptions(
        system_prompt="You are a senior Python developer. Write clean, well-tested code.",
        allowed_tools=["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
        cwd="/home/user/my-project",
        permission_mode="acceptEdits",
    )

    async for message in query(
        prompt="Add type hints to all functions in utils.py",
        options=options,
    ):
        print(message)

asyncio.run(main())

Key options worth knowing:

system_prompt: Sets the agent’s persona and instructions
allowed_tools: Whitelist of tools the agent can use. Built-in tools include Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch, and more
cwd: Working directory for file operations and bash commands
permission_mode: Controls how the agent handles permissions. "acceptEdits" auto-approves file changes; "bypassPermissions" skips all permission checks (use in sandboxed environments only)

AI agent with built-in tools for file operations, web search, and bash commands — Your agent comes pre-loaded with tools. You just decide which ones to hand over.

Built-in Tools: What the Agent Can Do Out of the Box

This is where the SDK really shines compared to rolling your own. You don’t define tools. The agent already has them. Here’s what’s available:

Tool	What It Does
`Read`	Read files from disk
`Write`	Create or overwrite files
`Edit`	Make targeted edits to existing files
`Bash`	Run shell commands
`Glob`	Find files by pattern
`Grep`	Search file contents
`WebSearch`	Search the web
`WebFetch`	Fetch content from URLs
`Task`	Spawn sub-agents for parallel work
`NotebookEdit`	Edit Jupyter notebooks

You control the surface area with allowed_tools. Want a read-only research agent? Give it only ["Read", "Glob", "Grep", "WebSearch", "WebFetch"]. Want a full coding agent? Give it everything. The agent figures out which tools to use and in what order.

Processing Responses: What Comes Back

The query() function yields message objects as the agent works. You’ll want to filter for the ones that matter:

from claude_agent_sdk import query, AssistantMessage, ResultMessage, TextBlock

async for message in query(prompt="Explain the main function in app.py"):
    if isinstance(message, AssistantMessage):
        for block in message.content:
            if isinstance(block, TextBlock):
                print(block.text)
    elif isinstance(message, ResultMessage):
        print(f"Done: {message.subtype}")

The stream includes system messages (session init, tool use notifications), assistant messages (the actual text output), and result messages (completion signals). For most use cases, you only care about AssistantMessage with TextBlock content.

Adding Custom Tools with MCP Servers

The SDK’s built-in tools cover file operations and web access. But what if your agent needs to query a database, call your internal API, or interact with a third-party service?

That’s where MCP (Model Context Protocol) comes in. MCP is an open standard that lets you expose custom tools to the agent. If you’ve read our tutorial on building an MCP server, you already know the drill.

You can connect external MCP servers directly in the options:

options = ClaudeAgentOptions(
    mcp_servers={
        "my-api": {
            "command": "python",
            "args": ["my_mcp_server.py"],
        }
    },
    allowed_tools=[
        "Read", "Write", "Bash",
        "mcp__my-api__*",  # all tools from the my-api server
    ],
)

The naming convention is mcp__<server-name>__<tool-name>. Use * to allow all tools from a server, or list specific ones like mcp__my-api__get_user.

You can also define tools inline using the SDK’s built-in MCP server helpers:

from claude_agent_sdk import tool, create_sdk_mcp_server

@tool("get_stock_price", "Get the current price for a stock ticker", {"ticker": str})
async def get_stock_price(args):
    ticker = args["ticker"].upper()
    # Your actual API call here
    price = fetch_price(ticker)
    return {"content": [{"type": "text", "text": f"{ticker}: ${price}"}]}

server = create_sdk_mcp_server(
    name="stocks",
    version="1.0.0",
    tools=[get_stock_price]
)

This is the same MCP protocol used by Claude Desktop, Cursor, and every other MCP-compatible client. Tools you build here work everywhere.

Multi-turn conversation between human and AI agent with context retention — Agents that remember what you said three questions ago. Revolutionary, apparently.

Multi-Turn Conversations with ClaudeSDKClient

query() is fire-and-forget. Each call starts a fresh session. If you need a back-and-forth conversation where the agent remembers context from previous turns, use ClaudeSDKClient:

from claude_agent_sdk import ClaudeSDKClient

async with ClaudeSDKClient() as client:
    # First turn
    await client.query("Read the README.md and summarize the project structure")
    async for msg in client.receive_response():
        print(msg)

    # Second turn - the agent remembers the first answer
    await client.query("Now create a CONTRIBUTING.md based on what you found")
    async for msg in client.receive_response():
        print(msg)

The client manages session state automatically. Each query() call continues the same conversation, so the agent has full context of everything that happened before. This is how you’d build a coding assistant, a research workflow, or any interactive application where follow-up questions matter.

A Real Example: Automated Code Review Agent

Let’s put it together. Here’s a practical agent that reviews a Python project, finds issues, and writes a report:

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AssistantMessage, TextBlock

async def review_project(project_path: str) -> str:
    """Run an automated code review on a Python project."""
    options = ClaudeAgentOptions(
        system_prompt="""You are a senior Python code reviewer. 
Analyze the project for:
- Type safety issues (missing type hints, incorrect types)
- Security vulnerabilities (hardcoded secrets, SQL injection, etc.)
- Performance problems (N+1 queries, unnecessary loops)
- Code style violations (PEP 8, naming conventions)

Write your findings to review.md in the project root.
Be specific: include file names, line numbers, and suggested fixes.""",
        allowed_tools=["Read", "Write", "Glob", "Grep", "Bash"],
        cwd=project_path,
        permission_mode="acceptEdits",
    )

    result_text = []
    async for message in query(
        prompt="Review this Python project. Check every .py file.",
        options=options,
    ):
        if isinstance(message, AssistantMessage):
            for block in message.content:
                if isinstance(block, TextBlock):
                    result_text.append(block.text)

    return "n".join(result_text)

# Run it
report = asyncio.run(review_project("/path/to/your/project"))
print(report)

The agent will Glob for all .py files, Read each one, analyze the code, and Write its findings to review.md. It decides the order. It decides how deep to go. You just tell it what you want.

When to Use the Agent SDK vs. Raw API Calls

Use the Claude Agent SDK when:

You need file system access, bash execution, or web browsing
You want session management and conversation continuity
You’re building automations that touch the real world (code generation, system administration, research pipelines)
You want MCP server integration without writing the plumbing yourself

Use raw Anthropic API calls (the anthropic Python package) when:

You’re building a chatbot with simple text-in, text-out
You need fine-grained control over the tool-calling loop
You want to define custom tools without the MCP protocol
You’re running in environments where the CLI can’t be installed

And use the from-scratch approach when you want to understand how agents actually work under the hood before reaching for an abstraction.

What Happens Under the Hood

The SDK is essentially a Python wrapper around the Claude Code CLI. When you call query(), it:

Spawns a Claude Code process with your options
Sends your prompt via the CLI’s input protocol
The CLI handles the agentic loop: sending messages to the Claude API, executing tool calls, feeding results back, compacting context when it gets too long
Streams structured messages back to your Python code as an async iterator

This means you get the same capabilities as running claude from the terminal, but programmatically. The agent can do anything Claude Code can do: edit files across a codebase, run test suites, search the web, even spawn sub-agents for parallel work.

The tradeoff: since it shells out to the CLI, you need the CLI installed (the pip package handles this), and there’s slightly more overhead than direct API calls. For most use cases, the convenience is worth it.

If you want to see what this kind of architecture looks like in a commercial product, I reverse-engineered Cursor’s entire agent system by intercepting its API calls. Same building blocks: a system prompt, a tool registry, MCP integration, and a model routing layer. The difference between that 10,000-word system prompt and your 6-line query() call is the Claude Agent SDK doing the heavy lifting for you.

Next Steps

The Claude Agent SDK turns what used to be a weekend project into a 10-minute setup. Install the package, write a system prompt, pick your tools, and you have an agent that can read your codebase, run commands, search the web, and make decisions autonomously.

If you want to go deeper, try building a custom MCP server to give your agent access to your own APIs. We covered exactly how to do that in our MCP server tutorial.

The full Python SDK documentation is on Anthropic’s site. The source code is on GitHub.