Automate Document Analysis with Claude 3 Opus (A Guide)

Your New Intern Reads 500 Pages a Minute and Never Complains

I once spent a week locked in a room with a bad coffee machine and a 400-page stack of due diligence documents for a potential acquisition. My job was to find any ‘gotchas’—weird clauses, hidden liabilities, anything that could sink the deal. My eyes were burning, my brain was mush, and by day four, I was pretty sure the term ‘indemnification’ was mocking me in my sleep.

We paid a team of lawyers $800 an hour to do the *exact same thing*. The final bill was… significant. Enough to buy a decent car. Or a lifetime supply of better coffee.

What if you had an intern who could read the entire stack in about 90 seconds, highlight every key term, compare clauses against your standard templates, and present it all in a neat, structured summary? An intern that works 24/7, costs pennies per document, and never asks for a day off. That’s not science fiction anymore. That’s Claude 3 Opus, and today, I’m going to show you how to put it to work.

Why This Matters (It’s Not About Being Lazy, It’s About Leverage)

Let’s be blunt. Reading dense, technical, or legal documents is a soul-crushing, high-stakes bottleneck in almost every business.

  • It Costs a Fortune: Every hour a lawyer, consultant, or senior analyst spends reading is an hour you’re paying for. Automating the first-pass review can cut those bills substantially. You bring in the expensive humans for high-level strategy, not word-for-word scanning.
  • It Wastes Your Time: As a founder or executive, you should be closing deals, not deciphering paragraphs of legalese. Getting an AI-generated summary lets you understand the key points in minutes, not days.
  • Humans Make Mistakes: I don’t care how much coffee you’ve had. After reading 50 pages of a dry contract, your attention drifts. You might miss that one sentence about auto-renewal or an unlimited liability clause. An AI doesn’t get tired or bored.
  • You Move Faster: Imagine being able to analyze a potential client’s RFP (Request for Proposal) in 5 minutes to see if it’s a good fit, instead of assigning a sales engineer to study it for a day. That’s a massive competitive advantage.

This isn’t about replacing professionals. It’s about giving them—and you—superpowers. It’s about turning a mountain of unstructured text into clean, actionable data.

What This Workflow Actually Is

Okay, so what are we really doing here? We’re using an API (an ‘Application Programming Interface’) to send a document to an incredibly powerful AI model called Claude 3 Opus, made by a company called Anthropic.

Think of it like this: You have a brilliant, hyper-specialized research assistant. You can’t just throw a book at them and say “tell me stuff.” You have to give them specific instructions.

Our workflow is a digital version of that:

  1. The Document: This is our raw material. A contract, a financial report, a research paper.
  2. The Prompt: This is our set of instructions. It’s the most important part. We don’t just say “summarize this.” We say, “Act like an expert paralegal. Read this contract and extract the exact clauses for Payment Terms, Liability Limits, and Termination. Give me the output in JSON format.”
  3. The AI (Claude 3 Opus): This is our assistant. It’s incredibly good at two things that make it perfect for this job: understanding nuanced language and handling huge amounts of text (its ‘context window’ is 200,000 tokens, which is like a 500-page book).
  4. The Output: The AI sends back the information exactly as we requested it—clean, structured, and ready to be used.

We’re essentially turning a messy PDF into a clean spreadsheet or database entry, automatically.
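To make the context-window point concrete, here is a rough sketch of a pre-flight check you could run before sending a file. The four-characters-per-token ratio is a common rule of thumb for English text, not the model's real tokenizer, and `estimate_tokens` and `fits_in_context` are hypothetical helper names, so treat the result as a ballpark only.

```python
# Rough pre-flight check: will this document fit in the context window?
# ASSUMPTION: ~4 characters per token is a rule of thumb for English text,
# not the model's actual tokenizer. The result is only an estimate.

CONTEXT_WINDOW_TOKENS = 200_000  # figure quoted in this guide for Claude 3 Opus
CHARS_PER_TOKEN = 4              # heuristic, see the assumption above


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_in_context(text: str, reserved_for_output: int = 2048) -> bool:
    """True if the estimated prompt size still leaves room for the reply."""
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    sample = "This Agreement is made between ClientCorp and VendorFlow AI. " * 100
    print(estimate_tokens(sample), fits_in_context(sample))
```

If the check fails, the usual move is to split the document into sections and analyze them one at a time.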

Prerequisites (The Honest, No-Fluff List)

I’m not going to lie and say a hamster could do this. But it’s easier than you think. Here’s what you actually need:

  • An Anthropic API Key: Go to Anthropic’s website, sign up, and get an API key. Yes, it costs money to use. But we’re talking cents or a few dollars per complex document, not hundreds per hour. Put a small amount of credits on your account (e.g., $10) to start. It will last you a long time.
  • Python Installed: If you’ve never used Python, don’t panic. It’s a programming language, but we’re just going to use it as a tool to send our document and prompt to the API. Go to the official Python website and download the latest version.
  • A Text Editor: Something simple like VS Code, Sublime Text, or even Notepad++ will do. This is just where you’ll write or paste the script.
  • Your Document: For this tutorial, we’ll need the document saved as a plain text file (`.txt`). If you have a PDF, you’ll first need to copy and paste its contents into a text file. (More advanced methods can read PDFs directly, but we’re starting simple).

That’s it. You don’t need to be a developer. You just need to be able to follow instructions and copy-paste.
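If you want to sanity-check the "cents or a few dollars" claim for your own documents, the arithmetic is simple. The sketch below assumes Claude 3 Opus list prices of $15 per million input tokens and $75 per million output tokens. Those numbers are my assumption, so confirm them on Anthropic's pricing page before budgeting. `estimate_cost_usd` is a hypothetical helper name.

```python
# Back-of-the-envelope cost estimate for one API call.
# ASSUMPTION: the prices below (USD per million tokens) are assumed list
# prices for Claude 3 Opus. Verify them against the current pricing page.

INPUT_USD_PER_MTOK = 15.00
OUTPUT_USD_PER_MTOK = 75.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD, rounded to four decimal places."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return round(input_cost + output_cost, 4)


if __name__ == "__main__":
    # A short contract versus a very long one (token counts are illustrative).
    print(estimate_cost_usd(2_000, 1_000))
    print(estimate_cost_usd(150_000, 2_048))
```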

Step-by-Step Tutorial: Analyzing a Services Agreement

Let’s do something real. We’re going to analyze a sample Master Services Agreement (MSA) to pull out the most important clauses a business owner would care about.

Step 1: Set Up Your Project

First, create a new folder on your computer. Let’s call it `ai_analyzer`.

Inside this folder, create a new text file named `sample_contract.txt`. Paste the following fake contract text into it. It’s intentionally a bit wordy, like a real contract.

MASTER SERVICES AGREEMENT

This Agreement is made between "ClientCorp" and "VendorFlow AI".

1. Services. VendorFlow AI agrees to provide AI-driven data analysis services as described in individual Statements of Work (SOWs).

2. Payment Terms. ClientCorp shall pay all invoices within thirty (30) days of receipt. Late payments will incur a penalty of 1.5% per month. The annual service fee is set at $50,000 USD, payable in quarterly installments.

3. Confidentiality. Both parties agree to maintain the confidentiality of all proprietary information disclosed during the term of this agreement. This obligation shall survive the termination of this agreement for a period of five (5) years.

4. Term and Termination. This agreement shall commence on the effective date and continue for a period of one (1) year. It will automatically renew for subsequent one-year terms unless either party provides written notice of non-renewal at least sixty (60) days prior to the end of the current term. Either party may terminate this agreement for cause if the other party is in material breach and fails to cure such breach within thirty (30) days of notice.

5. Limitation of Liability. Except for breaches of confidentiality or indemnification obligations, the total liability of VendorFlow AI under this agreement shall not exceed the total fees paid by ClientCorp in the twelve (12) months preceding the event giving rise to the claim.

6. Governing Law. This agreement shall be governed by and construed in accordance with the laws of the State of Delaware, without regard to its conflict of law principles.

Next, open your terminal or command prompt, navigate to your `ai_analyzer` folder, and install the Anthropic Python library:

pip install anthropic

Finally, you need to set your API key. DO NOT paste your key directly into your code. The best way is to set it as an environment variable.

On Mac/Linux:

export ANTHROPIC_API_KEY="your_api_key_here"

On Windows (Command Prompt):

set ANTHROPIC_API_KEY=your_api_key_here

You’ll need to do this every time you open a new terminal window, or learn how to set it permanently.
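A quick way to confirm the variable is actually visible to Python is a tiny check like the one below. This is a sketch, not part of the Anthropic library: `load_api_key` is a hypothetical helper, and stripping stray quotes is a defensive assumption on my part for keys that were set with quotes around them.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    # Stray quotes can end up in the value, e.g. when a Windows `set`
    # command is typed with quotes around the key.
    key = key.strip('"').strip("'")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set in this terminal session.")
    return key


if __name__ == "__main__":
    try:
        key = load_api_key()
        # Only show the last few characters so the key never lands in logs.
        print(f"Key found, ends with ...{key[-4:]}")
    except RuntimeError as err:
        print(err)
```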

Step 2: Create the Python Script

Inside your `ai_analyzer` folder, create a new file called `analyze.py`. This is where our code will live. Paste the following code into the file:

import anthropic
import os
import json

# This automatically looks for the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

# --- 1. READ THE DOCUMENT ---
# We're opening the text file and reading its content into a variable.
def read_document(file_path):
    print(f"Reading document from {file_path}...")
    # Explicit UTF-8 avoids decode errors on curly quotes pasted from PDFs.
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

# --- 2. DEFINE THE ANALYSIS PROMPT ---
# This is the core of our instruction to the AI.
# We tell it its role, give it the context (the contract), and specify the task and output format.
def create_prompt(document_text):
    # We define exactly what we want to extract.
    desired_extractions = [
        "Payment Terms (including deadlines and penalties)",
        "Agreement Term (initial length and renewal conditions)",
        "Termination Conditions (how and why the contract can be ended)",
        "Limitation of Liability (the maximum financial risk)",
        "Governing Law (which state's laws apply)"
    ]

    # The prompt is a clear instruction set for the AI.
    prompt = f"""You are an expert corporate paralegal. Your task is to analyze the following Master Services Agreement and extract specific key information.

Here is the document:

{document_text}


Please extract the following pieces of information and present them in a clean JSON format. The keys of the JSON should be simple, snake_case versions of the requested items. For each item, provide both the exact text from the contract and a brief, one-sentence summary in plain English.

Requested Information:
{', '.join(desired_extractions)}

If a specific piece of information is not found in the document, use a value of null for its fields.

Format your response as a single JSON object. Do not include any other text or explanations before or after the JSON.
"""
    return prompt

# --- 3. CALL THE API AND GET THE ANALYSIS ---
def get_analysis(prompt):
    print("Sending document to Claude 3 Opus for analysis...")
    try:
        response = client.messages.create(
            model="claude-3-opus-20240229",
            max_tokens=2048,
            messages=[
                {
                    "role": "user",
                    "content": prompt
                }
            ]
        )
        # The actual content is in the first content block
        return response.content[0].text
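    except Exception as e:
        # Stop here so the error message is not mistaken for an analysis and saved to disk.
        raise SystemExit(f"An error occurred while calling the API: {e}")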

# --- MAIN EXECUTION ---
if __name__ == "__main__":
    # Path to our contract file
    contract_file_path = "sample_contract.txt"

    # Step 1: Read the document
    contract_text = read_document(contract_file_path)

    # Step 2: Create the prompt
    analysis_prompt = create_prompt(contract_text)

    # Step 3: Get the analysis from Claude
    analysis_result_text = get_analysis(analysis_prompt)

    print("\n--- Analysis Complete ---")
    print(analysis_result_text)

    # Optional: Save the result to a file
    try:
        # Claude might return a string that is a JSON object, so we parse it
        parsed_json = json.loads(analysis_result_text)
        with open("analysis_output.json", 'w') as outfile:
            json.dump(parsed_json, outfile, indent=4)
        print("\nSuccessfully saved structured analysis to analysis_output.json")
    except json.JSONDecodeError:
        print("\nCould not parse the output as JSON. Saving raw text output.")
        with open("analysis_output.txt", 'w') as outfile:
            outfile.write(analysis_result_text)

Step 3: Run the Script and See the Magic

Go back to your terminal, make sure you are in the `ai_analyzer` directory, and run the script:

python analyze.py

You’ll see a couple of status messages, and then… behold! The script will print out a beautifully structured JSON object and save it to a file named `analysis_output.json`.

Your output should look something like this:

{
    "payment_terms": {
        "exact_text": "ClientCorp shall pay all invoices within thirty (30) days of receipt. Late payments will incur a penalty of 1.5% per month. The annual service fee is set at $50,000 USD, payable in quarterly installments.",
        "summary": "Invoices must be paid within 30 days, with a 1.5% monthly penalty for late payments, and the total annual fee is $50,000 paid quarterly."
    },
    "agreement_term": {
        "exact_text": "This agreement shall commence on the effective date and continue for a period of one (1) year. It will automatically renew for subsequent one-year terms unless either party provides written notice of non-renewal at least sixty (60) days prior to the end of the current term.",
        "summary": "The contract has an initial one-year term and auto-renews annually unless a 60-day non-renewal notice is given."
    },
    "termination_conditions": {
        "exact_text": "Either party may terminate this agreement for cause if the other party is in material breach and fails to cure such breach within thirty (30) days of notice.",
        "summary": "The contract can be terminated if one party materially breaches the agreement and doesn't fix the issue within 30 days of being notified."
    },
    "limitation_of_liability": {
        "exact_text": "Except for breaches of confidentiality or indemnification obligations, the total liability of VendorFlow AI under this agreement shall not exceed the total fees paid by ClientCorp in the twelve (12) months preceding the event giving rise to the claim.",
        "summary": "The vendor's maximum liability is capped at the amount of fees paid by the client in the previous 12 months."
    },
    "governing_law": {
        "exact_text": "This agreement shall be governed by and construed in accordance with the laws of the State of Delaware, without regard to its conflict of law principles.",
        "summary": "The agreement is governed by the laws of the State of Delaware."
    }
}

Look at that. You didn’t just get a summary. You got structured data. You now have a machine-readable breakdown of the contract that you can use in a million different ways.
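Before you build on that data, it is worth checking that every quoted clause really appears in the contract. Here is a minimal sketch of such a check. It assumes the model used the same key names as the sample output above (it may choose slightly different ones), and `check_analysis` is a hypothetical helper, not something the Anthropic library provides.

```python
# Sanity check: are the expected keys present, and does each "exact_text"
# quote actually occur in the source document?
# ASSUMPTION: key names match the sample output shown in this guide.

EXPECTED_KEYS = [
    "payment_terms",
    "agreement_term",
    "termination_conditions",
    "limitation_of_liability",
    "governing_law",
]


def normalize(text: str) -> str:
    """Collapse whitespace so line wrapping does not cause false alarms."""
    return " ".join(text.split())


def check_analysis(analysis: dict, source_text: str) -> list[str]:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    source = normalize(source_text)
    for key in EXPECTED_KEYS:
        entry = analysis.get(key)
        if not isinstance(entry, dict):
            problems.append(f"{key}: missing or not an object")
            continue
        quote = entry.get("exact_text")
        if quote is None:
            problems.append(f"{key}: no exact_text returned")
        elif normalize(quote) not in source:
            problems.append(f"{key}: exact_text not found verbatim in the source")
    return problems


if __name__ == "__main__":
    contract = "6. Governing Law. This agreement shall be governed by the laws of the State of Delaware."
    result = {"governing_law": {"exact_text": "laws of the State of Delaware", "summary": "Delaware law applies."}}
    for problem in check_analysis(result, contract):
        print(problem)
```

Anything this flags is a good candidate for a human to read first.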

Real Business Use Cases

This isn’t just a party trick. Here’s how you can use this exact workflow:

  1. Lease Agreement Review for Real Estate: You’re looking at ten different commercial leases. Create a script that extracts rent amount, lease term, renewal options, and maintenance responsibilities from all of them. You can then easily compare them in a spreadsheet to find the best deal.
  2. Vendor Security Compliance: Your company requires all vendors to be SOC 2 compliant. You can feed their security whitepapers or compliance documents to Opus with a prompt like: “Analyze this document and find any mention of SOC 2, ISO 27001, or GDPR compliance. Return ‘true’ if found, ‘false’ if not.”
  3. Financial Report Analysis for Investors: Feed a company’s quarterly 10-Q report to the AI. Ask it to extract ‘Net Revenue’, ‘Net Income’, and any sentences under the ‘Risk Factors’ section that mention ‘competition’ or ‘supply chain’. This is hours of work for a junior analyst, done in a minute.
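
For the lease comparison idea, the last mile is getting several analyses into one spreadsheet. A rough sketch using only the standard library is below. The field names (`rent_amount`, `lease_term`) and file names are made up for illustration, and it assumes each analysis follows the same `{"field": {"summary": ...}}` shape as the sample output earlier.

```python
import csv
import io


def analyses_to_csv(analyses: dict[str, dict], fields: list[str]) -> str:
    """Flatten {document_name: analysis} into CSV text, one row per document."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["document"] + fields)
    for name, analysis in analyses.items():
        row = [name]
        for field in fields:
            entry = analysis.get(field) or {}        # tolerate missing or null fields
            row.append(entry.get("summary") or "")   # blank cell when nothing was found
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical results from two lease documents.
    leases = {
        "lease_a.txt": {"rent_amount": {"summary": "$4,000/month"}, "lease_term": {"summary": "5 years"}},
        "lease_b.txt": {"rent_amount": {"summary": "$3,500/month"}, "lease_term": None},
    }
    print(analyses_to_csv(leases, ["rent_amount", "lease_term"]))
```

Write the returned text to a `.csv` file and it opens directly in Excel or Google Sheets.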

Common Mistakes & Gotchas (How to Avoid Looking Like a Rookie)

  • Vague Prompting: “Summarize this” is a terrible prompt. It gives the AI too much freedom. Be specific. List exactly what you want. The more precise your instructions, the better your output.
  • Forgetting the Format Instruction: If you want JSON, explicitly ask for JSON. If you want a bulleted list, ask for it. Tell the AI *how* to structure its answer.
  • Trusting It Blindly (The ‘Not a Lawyer’ Disclaimer): This is a powerful *assistant*. It is not a substitute for professional legal or financial advice. Use this tool to do the first 90% of the work, flag key areas, and then bring in an expert to verify the critical parts. It saves them time and you money.
  • Ignoring Poor Source Quality: If you feed the model a badly scanned PDF with tons of OCR (Optical Character Recognition) errors, it will struggle. The quality of your input directly determines the quality of your output. Garbage in, garbage out.
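
Even with a clear format instruction, a reply sometimes arrives wrapped in a sentence of preamble or a markdown code fence, which makes a plain `json.loads` fail. One defensive option is sketched below. `extract_json` is a hypothetical helper, and the slice-between-braces approach is a simple heuristic of mine rather than anything the API guarantees.

```python
import json


def extract_json(reply: str) -> dict:
    """Parse the JSON object embedded in a model reply.

    Slices from the first '{' to the last '}', which tolerates extra
    prose or code fences around the object. Raises ValueError otherwise.
    """
    start = reply.find("{")
    end = reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No JSON object found in reply")
    # json.JSONDecodeError is a subclass of ValueError, so callers
    # only need to catch one exception type.
    return json.loads(reply[start:end + 1])


if __name__ == "__main__":
    fence = "`" * 3  # built at runtime to keep this example readable
    messy_reply = f'Here is the analysis:\n{fence}json\n{{"governing_law": null}}\n{fence}'
    print(extract_json(messy_reply))
```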

How This Fits Into a Bigger Automation System

Running a single script is cool, but the real power comes when you connect it to other systems. This script isn’t an endpoint; it’s a gear in a much larger machine.

Imagine this automated pipeline:

  1. Trigger: A new email with the subject “New Vendor Contract” arrives in a specific inbox (you can set this up with tools like Zapier or Make.com).
  2. Action 1: The email attachment (the contract PDF) is automatically saved to a specific folder in Google Drive.
  3. Action 2: The new file in Google Drive triggers a cloud function (like an AWS Lambda or Google Cloud Function) that runs our Python script.
  4. Action 3: Our script calls Claude 3 Opus and gets back the structured JSON analysis.
  5. Action 4: The script then takes that JSON and does three things simultaneously:
    • Posts a summary message to a `#legal-review` Slack channel, tagging the legal team.
    • Creates a new task in Asana or Jira titled “Review Contract: [Vendor Name]” with the extracted clauses in the description.
    • Adds a row to a Google Sheet that tracks all your vendor contracts, automatically populating columns for liability caps, renewal dates, and payment terms.

Now you have a fully automated contract intake and triage system. A process that used to take days of manual work now happens in about two minutes, with zero human intervention until the final review.
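
The fan-out in Action 4 can be sketched as three small functions that turn the JSON into the payloads each tool needs. This deliberately stops short of calling Slack, Asana, or Google Sheets, since those integrations depend on your setup. All function names here are hypothetical, and the key names assume the sample output shown earlier.

```python
import json


def summary_of(analysis: dict, key: str) -> str:
    """Return the plain-English summary for a field, or a placeholder."""
    entry = analysis.get(key) or {}
    return entry.get("summary") or "not found"


def build_slack_message(vendor: str, analysis: dict) -> str:
    """Text for the #legal-review channel."""
    lines = [f"New contract from {vendor} is ready for review:"]
    for key in ("payment_terms", "agreement_term", "limitation_of_liability"):
        label = key.replace("_", " ").title()
        lines.append(f"- {label}: {summary_of(analysis, key)}")
    return "\n".join(lines)


def build_task(vendor: str, analysis: dict) -> dict:
    """Title and description for a review task in a tracker."""
    return {
        "title": f"Review Contract: {vendor}",
        "description": json.dumps(analysis, indent=2),
    }


def build_sheet_row(vendor: str, analysis: dict) -> list[str]:
    """One spreadsheet row: vendor, liability cap, term and renewal, payment terms."""
    return [
        vendor,
        summary_of(analysis, "limitation_of_liability"),
        summary_of(analysis, "agreement_term"),
        summary_of(analysis, "payment_terms"),
    ]


if __name__ == "__main__":
    demo = {"payment_terms": {"summary": "Invoices are due within 30 days."}}
    print(build_slack_message("VendorFlow AI", demo))
    print(build_task("VendorFlow AI", demo)["title"])
    print(build_sheet_row("VendorFlow AI", demo))
```

Keeping the payload builders separate from the network calls makes each piece easy to test before you wire up real credentials.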

What to Learn Next

You’ve just built a powerful tool that can read and understand a single, complex document. You’ve turned unstructured text into structured data. That’s a huge step.

But what if you have two documents? What if a client sends back your standard contract with a bunch of changes, and you need to know *exactly* what they altered without reading all 30 pages side-by-side?

In our next lesson, we’re going to upgrade this script. We’ll build a ‘Document Comparison Engine’. We’ll feed it two versions of a contract and use Claude Opus to instantly identify and summarize the differences, from a changed word in the liability clause to a completely new section they snuck in.

Stay tuned. The factory floor is just getting warmed up.
