Anthropic Claude API: Complete Developer Guide

DodaTech Updated Jun 20, 2026 8 min read

The Anthropic Claude API provides access to Claude’s advanced language models with industry-leading safety features, extended thinking capabilities, and powerful tool use patterns. This guide covers everything from setup to production deployment.

Learning Path

OpenAI API Basics → Anthropic Claude API Developer Guide (this guide) → LangChain LLM Applications → DeepSeek API Open-Source LLMs
  
What you’ll learn: how to integrate the Claude API end to end, including the Messages API, system prompts, streaming, tool use, vision, prompt caching, and cost optimization strategies.

Why it matters: Claude excels at nuanced instruction following, long-context reasoning (200K tokens), and safe AI deployment, all of which are critical for production systems.

Real-world use: Doda Browser uses Claude for privacy-preserving content analysis. Durga Antivirus Pro leverages Claude’s extended thinking for complex threat analysis requiring step-by-step reasoning.

Setting Up the Client

Install the Anthropic SDK and initialize the client:

pip install anthropic

import anthropic
import os

client = anthropic.Anthropic(
    api_key=os.environ["ANTHROPIC_API_KEY"]
)

Expected output: No output. The client initializes silently. If ANTHROPIC_API_KEY is not set, os.environ["ANTHROPIC_API_KEY"] raises KeyError before the client is constructed.
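To make a missing key fail with a clearer message, you can check the environment before building the client. This is a minimal sketch: load_api_key is a hypothetical helper, not part of the SDK, and the sk-ant- prefix check relies on the key format described under Common Errors.

```python
import os


def load_api_key(env=None, var_name="ANTHROPIC_API_KEY"):
    """Return the API key from the environment or raise a descriptive error.

    Hypothetical helper for illustration; it is not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set; export it before running.")
    if not key.startswith("sk-ant-"):
        # Assumption: Anthropic API keys begin with "sk-ant-".
        raise RuntimeError(f"{var_name} does not look like an Anthropic key.")
    return key
```

The result can then be passed to the constructor, for example anthropic.Anthropic(api_key=load_api_key()).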

Messages API

Claude uses a Messages API where you alternate user and assistant roles:

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=200,
    system="You are a concise Python expert. Keep answers under three sentences.",
    messages=[
        {"role": "user", "content": "What is a Python decorator?"}
    ]
)
print(response.content[0].text)

Expected output:

A decorator is a function that extends another function's behavior without modifying it directly. You apply it with the @decorator syntax above the target function. Common examples include @staticmethod, @classmethod, and custom decorators for logging or access control.

Key Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID (e.g., claude-3-5-sonnet-20241022) |
| max_tokens | integer | Maximum tokens in the response (required, no default) |
| system | string | System prompt (persistent instructions) |
| temperature | float | Randomness (0.0-1.0, default 1.0) |
| top_p | float | Nucleus sampling (0.0-1.0, optional) |
| top_k | integer | Top-k sampling (default: not set) |
| stop_sequences | string[] | Custom stop sequences |
| metadata | object | Request metadata, such as a user_id for tracking |
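The constraints in the table can be checked locally before a request is sent. The sketch below assumes a hypothetical build_request helper that assembles the keyword arguments for client.messages.create(). It covers only a few of the parameters and is not part of the SDK.

```python
def build_request(model, prompt, max_tokens, system=None, temperature=None,
                  stop_sequences=None):
    """Assemble keyword arguments for client.messages.create().

    Hypothetical helper: validates a few documented constraints locally.
    """
    if not isinstance(max_tokens, int) or max_tokens < 1:
        raise ValueError("max_tokens is required and must be a positive integer")
    params = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    # Optional parameters are only included when the caller sets them,
    # so the API's own defaults apply otherwise.
    if system is not None:
        params["system"] = system
    if temperature is not None:
        if not 0.0 <= temperature <= 1.0:
            raise ValueError("temperature must be between 0.0 and 1.0")
        params["temperature"] = temperature
    if stop_sequences:
        params["stop_sequences"] = list(stop_sequences)
    return params
```

A call would then look like client.messages.create(**build_request("claude-3-5-sonnet-20241022", "What is a Python decorator?", max_tokens=200)).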

System Prompts

Claude’s system prompt is separate from the message array — it provides persistent instructions:

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=300,
    system="You are a security analyst at DodaTech. Analyze code for vulnerabilities. "
           "Format: Risk Level | Affected Lines | Remediation Steps.",
    messages=[
        {"role": "user", "content": "Check: query = f'SELECT * FROM users WHERE id = {user_input}'"}
    ]
)
print(response.content[0].text)

Expected output:

Risk Level: CRITICAL
Affected Lines: Line 1 — f-string interpolation of user_input
Remediation: Use parameterized queries with placeholders (%s) instead of f-strings.

Temperature and Top-P

Control response randomness:

def generate_with_params(prompt, temperature=0.7, top_p=0.9):
    """Generate with different temperature and top_p values."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=150,
        temperature=temperature,
        top_p=top_p,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text

# Low temperature (deterministic, factual)
print("Temp 0.1:", generate_with_params("What is 2+2?", 0.1))

# High temperature (creative, varied)
print("Temp 0.9:", generate_with_params("Write a short slogan for an antivirus app", 0.9))

Expected output:

Temp 0.1: 2+2 equals 4.
Temp 0.9: "Shield your world, silence the threats."

Streaming Responses

Stream Claude’s responses for real-time display:

def stream_response(prompt):
    """Stream Claude response token by token."""
    collected = []
    with client.messages.stream(
        model="claude-3-5-sonnet-20241022",
        max_tokens=300,
        messages=[{"role": "user", "content": prompt}]
    ) as stream:
        for text in stream.text_stream:
            collected.append(text)
            print(text, end="", flush=True)
    print()
    return "".join(collected)

full_response = stream_response("Explain caching in 3 bullet points")
print(f"\nTotal chars received: {len(full_response)}")

Expected output:

• Cache stores frequently accessed data in fast memory
• Reduces latency by avoiding repeated expensive computations
• Common strategies: LRU, TTL, write-through, write-behind

Total chars received: 175

Tool Use (Function Calling)

Define tools using a JSON schema. Claude responds with structured tool calls:

import json

tools = [
    {
        "name": "search_threat_database",
        "description": "Look up a file hash in the threat intelligence database",
        "input_schema": {
            "type": "object",
            "properties": {
                "hash": {"type": "string", "description": "SHA256 hash of the file"}
            },
            "required": ["hash"]
        }
    },
    {
        "name": "get_file_metadata",
        "description": "Get metadata about a file",
        "input_schema": {
            "type": "object",
            "properties": {
                "filepath": {"type": "string"},
                "max_size": {"type": "integer", "description": "Max size in bytes"}
            },
            "required": ["filepath"]
        }
    }
]

def call_with_tools(prompt):
    """Send a prompt with tool definitions."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        tools=tools,
        messages=[{"role": "user", "content": prompt}]
    )
    for block in response.content:
        if block.type == "tool_use":
            print(f"Tool called: {block.name}")
            print(f"Input: {json.dumps(block.input, indent=2)}")
    return response

call_with_tools("Check if file a1b2c3d4 is a known threat and get metadata for /etc/passwd")

Expected output:

Tool called: search_threat_database
Input: {
  "hash": "a1b2c3d4"
}
Tool called: get_file_metadata
Input: {
  "filepath": "/etc/passwd"
}
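The example above stops once Claude has requested the tools. To finish the exchange, each tool_use block needs a matching tool_result sent back in a user message. The sketch below shows one way to build that message. The handlers mapping and its callables are hypothetical stand-ins for real lookups, and blocks are assumed to expose type, id, name, and input attributes as in the loop above.

```python
import json


def build_tool_results(content_blocks, handlers):
    """Run each requested tool and return the follow-up user message.

    `handlers` maps a tool name to a local Python callable (hypothetical
    stand-ins for real lookups). Each tool_result repeats the id of the
    tool_use block it answers.
    """
    results = []
    for block in content_blocks:
        if getattr(block, "type", None) != "tool_use":
            continue  # text blocks need no answer
        handler = handlers.get(block.name)
        if handler is None:
            # Report unknown tools back to the model instead of failing.
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": f"Unknown tool: {block.name}",
                "is_error": True,
            })
            continue
        output = handler(**block.input)
        results.append({
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": json.dumps(output),
        })
    return {"role": "user", "content": results}
```

To continue the conversation, append {"role": "assistant", "content": response.content} and the message returned here to the messages list, then call client.messages.create() again with the same tools.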

Vision Capabilities

Claude can analyze images:

import base64

def analyze_image(image_path, prompt):
    """Send an image to Claude for analysis."""
    with open(image_path, "rb") as f:
        image_data = base64.b64encode(f.read()).decode("utf-8")
    
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=500,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "image", "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }},
                    {"type": "text", "text": prompt}
                ]
            }
        ]
    )
    return response.content[0].text

# result = analyze_image("screenshot.png", "What's in this image?")
# print(result)
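analyze_image hardcodes image/png, so a JPEG or WebP file would be sent with the wrong media type. A small sketch of one fix is to infer the media type from the file extension. The image_block helper and the MEDIA_TYPES mapping are illustrative names, and the list of formats is an assumption to check against the current documentation.

```python
import base64
from pathlib import Path

# Assumption: these are the image formats accepted for image input.
MEDIA_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}


def image_block(image_path, data=None):
    """Build an image content block, inferring media_type from the extension.

    `data` lets callers pass raw bytes directly; otherwise the file is read.
    """
    suffix = Path(image_path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"Unsupported image type: {suffix or 'no extension'}")
    raw = Path(image_path).read_bytes() if data is None else data
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": MEDIA_TYPES[suffix],
            "data": base64.b64encode(raw).decode("utf-8"),
        },
    }
```

The returned dictionary can take the place of the inline image entry in the content list, followed by the text block with the prompt.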

Prompt Caching

Claude can cache system prompts and large context blocks — reducing cost and latency:

def cached_analysis(base_context, new_query):
    """Use prompt caching for repeated analysis with different queries."""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=300,
        system=[
            {
                "type": "text",
                "text": "You are a security document analyzer.",
                "cache_control": {"type": "ephemeral"}
            }
        ],
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": base_context,
                        "cache_control": {"type": "ephemeral"}
                    },
                    {"type": "text", "text": new_query}
                ]
            }
        ]
    )
    return response.content[0].text

# large_document is assumed to be a string holding the document text, loaded elsewhere.
# First call (caches the document)
first = cached_analysis(large_document, "Summarize this document")
print("First call - caching enabled")

# Second call (reuses cache)
second = cached_analysis(large_document, "Find security risks")
print("Second call - cache hit, lower latency and cost")
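The print statements above assume a cache hit without checking for one. The response's usage object reports cache activity in cache_creation_input_tokens and cache_read_input_tokens, so a hit can be confirmed directly. In the sketch below, cache_summary is a hypothetical helper. Short prompts may not be cached at all, because Anthropic documents a minimum cacheable length, in which case both counters stay at zero.

```python
def cache_summary(usage):
    """Summarize cache activity from a response's usage object.

    Reads cache_creation_input_tokens and cache_read_input_tokens,
    treating missing or None values as 0.
    """
    written = getattr(usage, "cache_creation_input_tokens", None) or 0
    read = getattr(usage, "cache_read_input_tokens", None) or 0
    if read:
        status = "hit"      # part of the prompt was served from the cache
    elif written:
        status = "write"    # the prefix was stored for later requests
    else:
        status = "none"     # caching did not apply to this request
    return {"status": status, "cache_write_tokens": written, "cache_read_tokens": read}
```

Because cached_analysis returns only the text, the check would go inside that function, for example print(cache_summary(response.usage)) before the return statement.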

Cost Optimization

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate API cost for Claude models."""
    # Prices are in USD per million tokens.
    pricing = {
        "claude-3-5-sonnet-20241022": {"input": 3.00, "output": 15.00},
        "claude-3-5-haiku-20241022": {"input": 0.80, "output": 4.00},
        "claude-3-opus-20240229": {"input": 15.00, "output": 75.00},
    }
    
    if model not in pricing:
        return "Unknown model"
    
    cost = (input_tokens / 1_000_000 * pricing[model]["input"] +
            output_tokens / 1_000_000 * pricing[model]["output"])
    return f"${cost:.4f}"

print(f"Sonnet 200+50 tokens: {estimate_cost('claude-3-5-sonnet-20241022', 200, 50)}")
print(f"Haiku 200+50 tokens: {estimate_cost('claude-3-5-haiku-20241022', 200, 50)}")
print(f"Opus 1000+200 tokens: {estimate_cost('claude-3-opus-20240229', 1000, 200)}")

Expected output:

Sonnet 200+50 tokens: $0.0014
Haiku 200+50 tokens: $0.0004
Opus 1000+200 tokens: $0.0300
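estimate_cost treats every input token the same, but prompt caching changes the arithmetic. The sketch below extends the estimate with separate counts for cache writes and cache reads. The function name and the multipliers (1.25x for writes, 0.10x for reads) are assumptions for the default 5-minute cache and should be checked against current pricing.

```python
def estimate_cost_with_cache(input_price, output_price, uncached_in, cache_write,
                             cache_read, output_tokens,
                             write_multiplier=1.25, read_multiplier=0.10):
    """Estimate cost in USD when part of the prompt is cached.

    Prices are USD per million tokens. The multipliers are assumptions
    and should be verified against the current price list.
    """
    per_token_in = input_price / 1_000_000
    per_token_out = output_price / 1_000_000
    return (uncached_in * per_token_in
            + cache_write * per_token_in * write_multiplier
            + cache_read * per_token_in * read_multiplier
            + output_tokens * per_token_out)


# A 10,000-token document at Sonnet prices: first call writes the cache,
# second call reads it back.
first = estimate_cost_with_cache(3.00, 15.00, 50, 10_000, 0, 200)
second = estimate_cost_with_cache(3.00, 15.00, 50, 0, 10_000, 200)
print(f"First call: ${first:.4f}, second call: ${second:.4f}")
```

Under these assumptions the first call costs slightly more than an uncached request, and the saving comes from every later call that reuses the cached prefix.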

Common Errors

  1. AuthenticationError (HTTP 401, authentication_error): the API key is missing or incorrect. Keys start with sk-ant-. Set the key via the ANTHROPIC_API_KEY environment variable or the api_key constructor parameter.
  2. Overloaded (HTTP 529, overloaded_error): servers are at capacity. Retry with exponential backoff. The anthropic.AnthropicError base class catches all error types raised by the SDK.
  3. Context length exceeded (HTTP 400, invalid_request_error, raised as anthropic.BadRequestError): the conversation exceeds the 200K-token limit. Summarize the conversation or truncate older messages.
  4. Missing max_tokens (invalid_request_error): the max_tokens parameter is required and has no default. Always specify it.
  5. Tool use format mismatch: every tool_use content block must be answered with a tool_result block whose tool_use_id matches the id of that tool_use block.
  6. Safety refusal: Claude may refuse harmful requests. Review prompts against the usage guidelines. Claude is more cautious than GPT-4 by design.
  7. Model not available (HTTP 404, not_found_error, raised as anthropic.NotFoundError): the requested model is deprecated or unavailable. Pin to a specific version such as claude-3-5-sonnet-20241022.
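The exponential backoff recommended for overload errors can be sketched as a small wrapper. This is one possible implementation under stated assumptions: call_with_backoff and its parameters are illustrative names, and the caller decides which exceptions count as transient. The SDK client also accepts a max_retries option, which may be enough on its own.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff and jitter.

    `is_retryable(exc)` decides whether an exception is transient.
    The final failure, or any non-retryable error, is re-raised.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_attempts - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay,
            # plus random jitter so parallel clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With the SDK, is_retryable could be lambda e: isinstance(e, (anthropic.RateLimitError, anthropic.InternalServerError, anthropic.APIConnectionError)), and fn a lambda wrapping client.messages.create(...).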

Practice Questions

1. What’s the difference between the system parameter and including instructions in user messages? The system parameter sits outside the messages array and applies to every turn of the conversation, while instructions placed in a user message are just part of that turn’s content.

2. How does Claude’s tool use work? You define tools with JSON schemas. Claude responds with tool_use content blocks. You execute the tool and return results via tool_result blocks.

3. What is the maximum context window for Claude 3.5? 200,000 tokens — enough for documents of 150+ pages or extended conversation histories.

4. How do you enable streaming? Use client.messages.stream() instead of client.messages.create(). Iterate over stream.text_stream for tokens or use event handlers.

5. Challenge: Build a multi-tool security analyzer. Write a Python script that takes a file, sends it to Claude with tools for hash lookup, metadata analysis, and pattern matching, and returns a comprehensive security report.

Mini Project: Claude Chat with History

import anthropic
import os
import json

class ClaudeChat:
    """Conversation manager with history tracking."""
    
    def __init__(self, system_prompt="You are a helpful assistant."):
        self.client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
        self.system = system_prompt
        self.history = []
    
    def send(self, message):
        """Send a message and get a response."""
        self.history.append({"role": "user", "content": message})
        
        response = self.client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=500,
            system=self.system,
            messages=self.history
        )
        
        reply = response.content[0].text
        self.history.append({"role": "assistant", "content": reply})
        return reply
    
    def token_estimate(self):
        """Rough token count of conversation history."""
        total = sum(len(m["content"].split()) for m in self.history)
        return total

chat = ClaudeChat("You are a DodaTech coding mentor.")
print(chat.send("What's a Python generator?"))
print(f"Conversation tokens: ~{chat.token_estimate()}")
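ClaudeChat keeps every turn, so a long session will eventually hit the context limit. One simple mitigation, in line with the advice to truncate older messages, is to drop the oldest turns before each request. The sketch below assumes a hypothetical trim_history helper and reuses the same rough word count as token_estimate(), so the budget is an approximation rather than a true token count.

```python
def trim_history(history, max_words):
    """Drop the oldest turns until the rough word count fits max_words.

    Uses the same whitespace word count as token_estimate(). The newest
    message is always kept, and leading assistant messages are dropped so
    the result starts with a user message when one is available.
    """
    def words(message):
        return len(message["content"].split())

    trimmed = list(history)  # copy so the caller's list is not modified
    while len(trimmed) > 1 and sum(words(m) for m in trimmed) > max_words:
        trimmed.pop(0)
    # The Messages API expects the first message to come from the user.
    while len(trimmed) > 1 and trimmed[0]["role"] != "user":
        trimmed.pop(0)
    return trimmed
```

Inside send(), this could be applied as messages=trim_history(self.history, max_words) when building the request. Summarizing the dropped turns instead of discarding them is a reasonable next step.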

FAQ

How does Claude compare to GPT-4 for coding tasks?
Claude 3.5 Sonnet performs comparably to GPT-4 for most coding tasks. Claude excels at following complex instructions. GPT-4 has a broader ecosystem with plugins and function calling maturity.
What is prompt caching in Claude API?
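Prompt caching stores frequently used context (system prompts, documents) on Anthropic’s servers for 5 minutes by default, and each cache hit refreshes that window. Repeated requests reuse the cache, which lowers both latency and cost: cached input tokens are billed at roughly 10% of the normal input price, while writing to the cache costs roughly 25% more than normal input.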
Can I use Claude for free?
Anthropic offers a free tier via claude.ai (the chat interface) with limited usage. The API is pay-as-you-go. New accounts receive $5 in free credits.
What models does Anthropic offer?
Claude 3.5 Sonnet (best balance), Claude 3.5 Haiku (fastest, cheapest), Claude 3 Opus (most capable, expensive). Sonnet is recommended for most production use cases.
How do I handle Claude’s safety refusals?
Review the prompt for content that may trigger safety filters. Rephrase to be more specific about legitimate use cases. For security research, include context about your authorized testing environment.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-20.
