The Intern Who Couldn’t Copy-Paste
Let’s talk about Chad, the intern we hired last summer. His only job was to read customer support emails and copy the key details into our CRM: name, order number, issue. Simple, right? A task a reasonably intelligent gerbil could master.
Chad was not that gerbil. He’d put the order number in the name field. He’d summarize a 300-word complaint as “customer mad.” He’d misspell email addresses, sending our replies into the digital void. Every entry had to be double-checked, defeating the whole purpose of hiring him. Chad wasn’t an intern; he was a chaos generator we paid $18 an hour.
We fired Chad. Well, we “let his internship conclude.” But his ghost haunts every business on earth: the ghost of manual, error-prone, mind-numbing data entry. Today, we’re going to build an AI that does Chad’s job perfectly, instantly, and for pennies. We’re going to build a digital ghostbuster.
Why This Matters
Every business runs on data, and most of that data starts as a chaotic mess. It’s buried in emails, support tickets, PDFs, and call transcripts. The process of getting that chaos into a structured system (like a database, a CRM, or even just a spreadsheet) is a massive, expensive bottleneck.
You either hire an army of Chads, burn out your best employees on boring work, or just let valuable data rot. This isn’t a small problem; it’s a tax on growth. It’s why your reporting is always a mess and why you can never find the information you need.
This automation replaces that entire bottleneck. It’s a tireless robot that reads unstructured text and hands you back a perfectly formatted, computer-readable chunk of data. It’s the difference between panning for gold in a river and having a machine that spits out pure gold bars. It saves time, eliminates human error, and lets you scale processes that were previously impossible.
What This Tool / Workflow Actually Is
We’re using a feature in Anthropic’s Claude 3 API called Tool Use. Now, the name is a bit confusing. You might think it means Claude can *use* tools like “book a flight” or “send an email.” It *can* do that, but that’s a more advanced topic.
For today, we’re using it in a simpler, more powerful way. We’re using it as a structured data extractor. Think of it like this: instead of asking Claude to just chat with you, you’re handing it a very specific form and saying, “Read this messy email and fill out this form. Do not deviate from the form. Do not get creative.”
What it does:
It forces the Large Language Model (LLM) to respond in a predictable, structured JSON format that you define. It takes chaotic input (an email) and produces clean, machine-readable output (JSON).
What it does NOT do:
In this tutorial, it does not *execute* anything. It won’t update your CRM for you or send an email. It simply does the most critical part: structuring the data so that another, simpler script can easily perform the execution step. It’s the brains, not the arms.
Prerequisites
This is way easier than it sounds. I promise.
- An Anthropic API Key. Go to the Anthropic Console, sign up, and create an API key. You’ll need to add a credit card, but for this tutorial, you’ll spend less than a single penny.
- Python. If you don’t have Python on your machine, don’t panic. You can use Google Colab for free, right in your browser. No installation needed.
- A willingness to copy and paste. If you can highlight text and press Ctrl+C, you have all the technical skills required today.
Step-by-Step Tutorial
Let’s build our Chad-replacement robot, step by step.
Step 1: Set up your environment
First, we need to install the Anthropic library. Open your terminal (or a new cell in Google Colab) and run this:
pip install anthropic
That’s it. You’re basically a developer now.
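One optional habit worth picking up right after installing: keep your API key in an environment variable instead of pasting it into a script. The sketch below is a minimal example under that assumption. ANTHROPIC_API_KEY is the variable name the official SDK looks for by default; the helper name load_api_key is made up for this tutorial and is not part of the library.

```python
import os


def load_api_key(env=None):
    """Return the Anthropic API key from the environment, or raise a clear error.

    `load_api_key` is a hypothetical helper for this tutorial, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set. Export it in your shell "
            "or set it in your Colab session first."
        )
    return key


# Usage, once the key is set:
#   import anthropic
#   client = anthropic.Anthropic(api_key=load_api_key())
```

This way the script can be shared or committed without leaking the key.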
Step 2: Define the “Form” (Our Tool)
This is the magic part. We need to describe the data we want to extract. We’ll define a “tool” called submit_customer_issue. This is the “form” we’re giving to Claude. It has a name, a description of its purpose, and a schema defining the fields we want (customer_name, order_id, customer_email, issue_summary).
Notice how we describe each field clearly. This is a prompt for the AI. The clearer you are, the better it works.
customer_issue_tool = {
    "name": "submit_customer_issue",
    "description": "Extracts key information from a customer support email and submits it for processing.",
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_name": {
                "type": "string",
                "description": "The full name of the customer."
            },
            "order_id": {
                "type": "string",
                "description": "The unique identifier for the customer's order, usually alphanumeric."
            },
            "customer_email": {
                "type": "string",
                "description": "The email address of the customer."
            },
            "issue_summary": {
                "type": "string",
                "description": "A brief, one-sentence summary of the customer's problem."
            }
        },
        "required": ["customer_name", "order_id", "customer_email", "issue_summary"]
    }
}
Step 3: Prepare the API Call
Now we write the Python script. We’ll import the library, create a client with our API key, and define the messy email text from our fictional customer, Brenda.
The key things to notice are the tools parameter, where we pass our form definition, and the tool_choice parameter. By setting tool_choice to {"type": "any"}, we’re telling Claude: “You MUST use one of the tools I’ve provided. You are not allowed to just chat.”
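As a rough sketch of how those two parameters fit together, the helper below assembles the keyword arguments for the call without sending anything. The function name build_extraction_request and its force_named_tool flag are invented for illustration. The {"type": "tool", "name": ...} form is the API's way of demanding one specific tool, which can be handy once you pass more than one form.

```python
def build_extraction_request(email_text, tool, force_named_tool=False):
    """Assemble keyword arguments for client.messages.create(...).

    Hypothetical helper: it only builds a dict and makes no network call.
    """
    if force_named_tool:
        # Require this exact tool (useful when several tools are supplied).
        tool_choice = {"type": "tool", "name": tool["name"]}
    else:
        # Require *some* tool from the list; plain chat replies are not allowed.
        tool_choice = {"type": "any"}
    return {
        "model": "claude-3-sonnet-20240229",  # model ID used in this tutorial
        "max_tokens": 1024,
        "tools": [tool],
        "tool_choice": tool_choice,
        "messages": [{"role": "user", "content": email_text}],
    }


# A stripped-down stand-in for the form from Step 2.
demo_tool = {
    "name": "submit_customer_issue",
    "description": "Extracts key information from a customer support email.",
    "input_schema": {"type": "object", "properties": {}},
}

request = build_extraction_request("Hi, my order is late.", demo_tool)
print(request["tool_choice"])

# With a configured client you would then run:
#   message = client.messages.create(**request)
```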
Step 4: Execute and Get the Structured Data
When we run the script, Claude won’t reply with, “Sure, I can help Brenda with that!” Instead, it will reply with a special message that essentially says: “I have analyzed the text. I believe you should use the `submit_customer_issue` tool with the following data.”
That data is our prize. It’s the perfectly structured JSON we wanted. Our script will then extract and print it.
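To make that extraction step concrete, here is a small sketch that pulls the form data out of a response. It uses stand-in objects in place of a real API response so it runs offline; the attribute names (type, name, input) mirror the ones the full script reads, while extract_tool_input and fake_content are made-up names for this example.

```python
from types import SimpleNamespace


def extract_tool_input(content_blocks, tool_name):
    """Return the input dict of the first matching tool_use block, or None."""
    for block in content_blocks:
        if getattr(block, "type", None) == "tool_use" and block.name == tool_name:
            return block.input
    return None


# Stand-in for `message.content`, so this runs without an API call.
fake_content = [
    SimpleNamespace(type="text", text="Filing this issue now."),
    SimpleNamespace(
        type="tool_use",
        name="submit_customer_issue",
        input={"customer_name": "Brenda", "order_id": "A-12345-Z"},
    ),
]

print(extract_tool_input(fake_content, "submit_customer_issue"))
```

Returning None when nothing matches lets the calling code decide what to do with a response that skipped the form.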
Complete Automation Example
Here is the full, copy-paste-ready Python script. Replace 'YOUR_API_KEY' with your actual Anthropic API key.
import anthropic
import json

# 1. Configure the Anthropic client with your API key
client = anthropic.Anthropic(
    api_key="YOUR_API_KEY",
)

# 2. Define the tool (the "form" for the AI to fill out)
customer_issue_tool = {
    "name": "submit_customer_issue",
    "description": "Extracts key information from a customer support email and submits it for processing.",
    "input_schema": {
        "type": "object",
        "properties": {
            "customer_name": {
                "type": "string",
                "description": "The full name of the customer."
            },
            "order_id": {
                "type": "string",
                "description": "The unique identifier for the customer's order, usually alphanumeric."
            },
            "customer_email": {
                "type": "string",
                "description": "The email address of the customer."
            },
            "issue_summary": {
                "type": "string",
                "description": "A brief, one-sentence summary of the customer's problem."
            }
        },
        "required": ["customer_name", "order_id", "customer_email", "issue_summary"]
    }
}

# 3. The messy, unstructured text we want to process
messy_email_text = """
Hi there,
My name is Brenda from accounting, and I'm having an issue with my recent order. The tracking isn't updating for order #A-12345-Z. Can someone please look into this? My email is brenda.p@examplecorp.com.
Thanks,
Brenda
"""

# 4. Make the API call
# Note: model IDs get retired over time. If this one is rejected,
# swap in a current model ID from Anthropic's model list.
print("🤖 Calling Claude...")
message = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=1024,
    tools=[customer_issue_tool],
    tool_choice={"type": "any"},
    messages=[
        {
            "role": "user",
            "content": messy_email_text
        }
    ]
)
print("✅ API call complete!")

# 5. Extract the structured data
structured_data = None
for content_block in message.content:
    if content_block.type == "tool_use":
        tool_name = content_block.name
        tool_input = content_block.input
        if tool_name == "submit_customer_issue":
            structured_data = tool_input
            break

if structured_data:
    print("\n--- Extracted Data ---")
    # `json.dumps` with indent=2 pretty-prints the dictionary
    print(json.dumps(structured_data, indent=2))
else:
    print("\n--- No tool use detected in the response ---")
Expected Output:
When you run this script, you will get this beautiful, clean, structured output:
🤖 Calling Claude...
✅ API call complete!

--- Extracted Data ---
{
  "customer_name": "Brenda",
  "order_id": "A-12345-Z",
  "customer_email": "brenda.p@examplecorp.com",
  "issue_summary": "The tracking for order #A-12345-Z isn't updating."
}
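Look at that. Every detail in the right field, no copy-paste required. The exact wording of the issue_summary can vary a little from run to run, but the structure should stay the same. Goodbye, Chad.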
Real Business Use Cases
This exact pattern can be applied to almost any industry.
- Real Estate Agency:
  - Problem: Agents receive hundreds of emails with unstructured property descriptions from listing services.
  - Solution: Use a tool schema to extract `address`, `square_footage`, `num_bedrooms`, `num_bathrooms`, and `asking_price` to automatically populate a property database.
- Recruiting Firm:
  - Problem: Manually parsing thousands of resumes in different formats (PDF, Word) to find candidates.
  - Solution: Convert resumes to text, then use a tool schema to extract `candidate_name`, `contact_info`, `years_of_experience`, `key_skills`, and `previous_employers` into an applicant tracking system (ATS).
- Law Office:
  - Problem: Paralegals spend hours reading legal filings to identify key entities.
  - Solution: Use a tool schema to extract `plaintiff_name`, `defendant_name`, `case_number`, `filing_date`, and `court_jurisdiction` from legal documents to quickly summarize and categorize new cases.
- E-commerce Store:
  - Problem: Analyzing free-text product reviews to understand trends is time-consuming.
  - Solution: Use a tool schema to process reviews and extract `product_sku`, `rating` (inferred from text if not explicit), `sentiment` (positive/negative/neutral), and `mentioned_features` (e.g., ‘battery life’, ‘screen quality’).
- Financial Analyst:
  - Problem: Reading quarterly earnings reports (long, dense PDFs) to pull out key financial numbers.
  - Solution: Convert the report to text and use a tool schema to extract `total_revenue`, `net_income`, `earnings_per_share`, and `forward_guidance_summary`.
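To show how one of these use cases might translate into a form, here is a sketch of a schema for the e-commerce example. The field names come from the list above; the tool name submit_review_analysis, the 1-to-5 rating scale, and the choice of which fields are required are all assumptions made for illustration.

```python
review_analysis_tool = {
    "name": "submit_review_analysis",  # hypothetical tool name
    "description": "Extracts structured details from a free-text product review.",
    "input_schema": {
        "type": "object",
        "properties": {
            "product_sku": {
                "type": "string",
                "description": "The SKU of the product being reviewed.",
            },
            "rating": {
                "type": "integer",
                # The 1-to-5 scale is an assumption; adjust to your store's scale.
                "description": "Star rating from 1 to 5, inferred from the text if not explicit.",
            },
            "sentiment": {
                "type": "string",
                # `enum` restricts the answer to a fixed set of labels.
                "enum": ["positive", "negative", "neutral"],
                "description": "Overall sentiment of the review.",
            },
            "mentioned_features": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Product features mentioned, e.g. 'battery life'.",
            },
        },
        # `rating` is left optional because many reviews never state one.
        "required": ["product_sku", "sentiment", "mentioned_features"],
    },
}

print(sorted(review_analysis_tool["input_schema"]["properties"]))
```

The enum and array types are what make the output easy to count and chart later, since every review lands on the same small set of labels.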
Common Mistakes & Gotchas
- Vague Descriptions: The `description` fields in your schema are critical. They are part of the prompt. `"name": "order_id"` is okay, but `"description": "The unique alphanumeric identifier for the customer's order"` is much better. Be specific.
- Forgetting `tool_choice`: If you omit the `tool_choice={"type": "any"}` parameter, Claude might decide to just give you a conversational answer instead of using your tool. For extraction, you almost always want to force it.
- Schema Is Too Complex: Don’t try to extract 50 fields at once, especially from a short piece of text. If you need a lot of data, it’s better to use multiple, simpler tools and maybe even chain the calls. Start small and add fields one by one.
- Ignoring Missing Data: By default, if a field is in your `required` list and the AI can’t find it in the text, it might hallucinate or fail. Consider what should happen with optional data. You can remove fields from the `required` array to make them optional.
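For the missing-data gotcha in particular, one possible safeguard is to make shaky fields optional in the schema and then check the result yourself before trusting it. The sketch below assumes order_id was removed from the required list; the helper name find_missing_fields is hypothetical.

```python
def find_missing_fields(data, expected_fields):
    """List expected fields that are absent, None, or blank strings in `data`."""
    missing = []
    for field in expected_fields:
        value = data.get(field)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(field)
    return missing


# Example result for an email that never mentioned an order number.
extracted = {
    "customer_name": "Brenda",
    "customer_email": "brenda.p@examplecorp.com",
    "issue_summary": "Tracking is not updating.",
}
expected = ["customer_name", "order_id", "customer_email", "issue_summary"]

missing = find_missing_fields(extracted, expected)
if missing:
    print("Needs human review, missing:", missing)
```

Routing incomplete records to a person is usually safer than letting the model guess a value it could not find.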
How This Fits Into a Bigger Automation System
What we built today is a foundational component, like a specialized station on an assembly line. Its only job is to turn messy parts into standardized ones. Here’s how you connect it:
- CRM Automation: The next step is to take the output JSON and use another API (like the HubSpot or Salesforce API) to create or update a contact/ticket. This script becomes the bridge between your inbox and your CRM.
- Email Automation: Once you’ve extracted the data, you can pass it to *another* LLM call with a prompt like, “Here is the customer’s issue in a structured format. Draft a polite, empathetic reply confirming we’ve received their request and are looking into it.”
- Databases & Dashboards: Instead of a CRM, you could pipe this JSON data directly into a database (like PostgreSQL) or even a simple Google Sheet. Now you have a real-time dashboard of incoming issues, categorized and ready for analysis.
- Multi-Agent Workflows: This is your “Clerk Agent.” It does the intake. It can then pass its clean output to a “Researcher Agent” that uses the `order_id` to look up details in your database, and then pass everything to a “Communications Agent” that drafts the final reply.
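As a taste of the database option, here is a minimal sketch that stores one extracted record. It swaps in Python’s built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL, so it runs with no setup; the table name customer_issues and the function save_issue are made up for this example.

```python
import sqlite3

FIELDS = ("customer_name", "order_id", "customer_email", "issue_summary")


def save_issue(conn, issue):
    """Insert one extracted issue into the table and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_issues ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "customer_name TEXT, order_id TEXT, customer_email TEXT, issue_summary TEXT)"
    )
    cursor = conn.execute(
        "INSERT INTO customer_issues "
        "(customer_name, order_id, customer_email, issue_summary) "
        "VALUES (?, ?, ?, ?)",
        # Missing keys become NULL rather than raising an error.
        tuple(issue.get(field) for field in FIELDS),
    )
    conn.commit()
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")  # throwaway database for the demo
row_id = save_issue(
    conn,
    {
        "customer_name": "Brenda",
        "order_id": "A-12345-Z",
        "customer_email": "brenda.p@examplecorp.com",
        "issue_summary": "The tracking for order #A-12345-Z isn't updating.",
    },
)
row = conn.execute(
    "SELECT order_id, customer_email FROM customer_issues WHERE id = ?", (row_id,)
).fetchone()
print(row)
```

The parameterized query (the question marks) matters here: the values come from customer emails, so they should never be spliced directly into SQL text.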
What to Learn Next
Congratulations. You just built a system that is genuinely more reliable and efficient than a human intern for a critical business task. You’ve turned unstructured chaos into structured value.
But right now, that clean data is just printing to your screen. It’s not *doing* anything yet. It’s like our factory has produced a perfect engine, but it’s just sitting on the floor, not yet in a car.
In our next lesson in the Academy, we’re going to build the next station on the assembly line. We will take the clean JSON from this script and, using code, automatically insert it as a new row in a Google Sheet. We’re going to build a completely automated, real-time logging system. No more copy-pasting, ever.
You’ve mastered extraction. Next, you master execution. See you in the next class.

