Python for CAs
Real, runnable Python code for real CA problems. File automation, Excel processing, PDF extraction, bulk email, GST validation.
Everything in this section is real code. Not pseudocode. Not simplified examples. You can run every script here right now if you have Python installed.
For each example, we show you:
- The problem it solves (the CA scenario)
- The code Claude would generate
- The prompt you would give Claude Code to get it
- How to install any required libraries
Before You Start: Installing Libraries
Python libraries are pre-built toolkits that someone else has already written. You install them once with pip and then they are available to use in any script.
Open the command prompt
Press Windows key + R, type cmd, press Enter.
Install the core CA libraries
Run each of these commands one at a time, pressing Enter after each one and waiting for it to finish before running the next:
pip install openpyxl
pip install pandas
pip install pdfplumber
pip install pypdf
Verify the installs
After all four complete, run pip list and scroll through the output. You should see openpyxl, pandas, pdfplumber, and pypdf listed.
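If scrolling through the pip list output is tedious, the same check can be sketched in Python. This is a minimal sketch: the function name check_libraries is made up for illustration, and it only confirms that Python can find each library, not which version is installed.

```python
from importlib.util import find_spec

# The four libraries installed above (their import names match the pip names).
CORE_LIBRARIES = ["openpyxl", "pandas", "pdfplumber", "pypdf"]


def check_libraries(names=CORE_LIBRARIES):
    """Return a dict mapping each library name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, installed in check_libraries().items():
        status = "OK" if installed else "MISSING - run: pip install " + name
        print(f"{name:12} {status}")
```

Save it as a .py file and run it with python from the same command prompt; any line marked MISSING tells you which install to repeat.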
openpyxl and pandas work with Excel files. pdfplumber extracts text from PDFs where you can select and copy text. pypdf merges and splits PDF files. These four libraries cover the majority of CA automation tasks.
Category 1: File and Folder Automation
Problem: Rename 500 PDF invoices to a standard format
The scenario: Your firm receives invoices from vendors in whatever format they use. Every month you spend hours renaming them to your firm's standard: YYYY-MM_VendorName_InvoiceNum.pdf. One script. Done.
The prompt to give Claude Code: "Write Python that renames all PDF files in a folder. The current filenames are like 'Invoice_2024_HDFC_12345.pdf'. Rename them all to 'YYYY-MM_VendorName_InvoiceNumber.pdf' format, where the year and month come from today's date, the vendor name is the part between the four-digit year and the invoice number ('HDFC' in the example), and the invoice number is the last number before '.pdf'. Show me a preview before making any changes."
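One way the resulting script might look is sketched below, using only the standard library. It assumes every filename follows the Invoice_YEAR_Vendor_Number.pdf shape of the example and takes the vendor name to be the part between the year and the invoice number. The function names and the folder path are hypothetical, and files that do not match the pattern are left alone.

```python
import re
from datetime import date
from pathlib import Path

# Matches names like Invoice_2024_HDFC_12345.pdf
# group 1 = vendor name, group 2 = invoice number
PATTERN = re.compile(r"^Invoice_\d{4}_(.+)_(\d+)\.pdf$", re.IGNORECASE)


def plan_renames(filenames, today=None):
    """Return (old_name, new_name) pairs; names that do not match are skipped."""
    today = today or date.today()
    prefix = today.strftime("%Y-%m")
    plan = []
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            vendor, number = match.groups()
            plan.append((name, f"{prefix}_{vendor}_{number}.pdf"))
    return plan


def rename_invoices(folder, apply=False):
    """Print a preview of every rename; only touch files when apply=True."""
    folder = Path(folder)
    if not folder.is_dir():
        print(f"Folder not found: {folder}")
        return
    names = sorted(p.name for p in folder.glob("*.pdf"))
    for old, new in plan_renames(names):
        print(f"{old}  ->  {new}")
        if apply:
            target = folder / new
            if target.exists():
                print(f"  skipped: {new} already exists")
                continue
            (folder / old).rename(target)


if __name__ == "__main__":
    # Hypothetical folder path; this call only previews the changes.
    rename_invoices(r"C:\Invoices\Inbox")
```

Run it once as written to read the preview, then change the last line to rename_invoices(r"C:\Invoices\Inbox", apply=True) when the new names look right.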
Problem: Organize a Downloads folder into subfolders by file type
The scenario: Your Downloads folder is chaos. PDFs, Excel files, images, Word documents all mixed together. One script to organize them all.
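A minimal sketch of such a script follows, again using only the standard library. The extension-to-folder mapping and the subfolder names are assumptions chosen for illustration, and anything not listed goes into a folder called Other.

```python
import shutil
from pathlib import Path

# Hypothetical mapping of extensions to subfolder names; adjust to taste.
FOLDERS = {
    ".pdf": "PDFs",
    ".xlsx": "Excel", ".xls": "Excel", ".csv": "Excel",
    ".docx": "Word", ".doc": "Word",
    ".jpg": "Images", ".jpeg": "Images", ".png": "Images",
}


def target_folder(filename):
    """Pick the subfolder name for a file based on its extension."""
    return FOLDERS.get(Path(filename).suffix.lower(), "Other")


def organize(folder, apply=False):
    """Return the planned (file name, subfolder) moves; move files only if apply=True."""
    folder = Path(folder)
    moves = []
    for item in sorted(folder.iterdir()):
        if not item.is_file():
            continue  # leave existing subfolders alone
        subfolder = target_folder(item.name)
        moves.append((item.name, subfolder))
        if apply:
            destination = folder / subfolder
            destination.mkdir(exist_ok=True)
            if (destination / item.name).exists():
                continue  # never overwrite a file that is already there
            shutil.move(str(item), str(destination / item.name))
    return moves


if __name__ == "__main__":
    downloads = Path.home() / "Downloads"
    if downloads.is_dir():
        for name, subfolder in organize(downloads):  # preview only
            print(f"{name}  ->  {subfolder}/")
```

As written, the script only prints what it would do. Passing apply=True to organize performs the moves.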
Category 2: Excel and Data Automation
Problem: Combine 50 monthly Excel reports into one master sheet
The scenario: Each month, 12 branch offices send you an Excel file with their P&L data. You spend 2 hours every month copying and pasting them into a master sheet. This script does it in 10 seconds.
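The combining step can be sketched with pandas as below. It assumes each branch file keeps its data on the first sheet with the same column headers; the function names, the Source column, and the Master.xlsx output name are all illustrative choices, not part of any fixed method.

```python
from pathlib import Path

import pandas as pd


def combine_frames(frames):
    """Stack DataFrames into one, tagging each row with its source file name.

    `frames` maps a source name (e.g. the file name) to that file's DataFrame.
    """
    tagged = [df.assign(Source=name) for name, df in frames.items()]
    return pd.concat(tagged, ignore_index=True)


def combine_folder(folder, output="Master.xlsx"):
    """Read the first sheet of every .xlsx file in `folder` and write one master file."""
    folder = Path(folder)
    frames = {
        path.name: pd.read_excel(path)  # pandas uses openpyxl for .xlsx files
        for path in sorted(folder.glob("*.xlsx"))
        # skip a previous master file and Excel's temporary lock files
        if path.name != output and not path.name.startswith("~$")
    }
    if not frames:
        raise FileNotFoundError(f"No .xlsx files found in {folder}")
    master = combine_frames(frames)
    master.to_excel(folder / output, index=False)
    return master
```

Calling combine_folder on the folder that holds the branch files writes the master sheet next to them. If one branch uses a different header, pandas keeps both columns and leaves blanks where data is missing, so a quick look at the master file's columns is a useful check.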
Problem: Find duplicate entries across multiple Excel files
The scenario: You have client data across 3 different Excel files and suspect there are duplicate entries (same PAN, same GST number, or same company name appearing in more than one file). Find them all.
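A minimal pandas sketch of the cross-file check is shown here. The column name "PAN" in the usage note and the function name are assumptions; the same function can be called again with the GST number or company name column. Values are compared after trimming spaces and converting to upper case, which is a design choice rather than a rule.

```python
import pandas as pd


def find_cross_file_duplicates(frames, key):
    """Return rows whose `key` value appears in more than one source file.

    `frames` maps a file name to its DataFrame; `key` is a column name
    such as "PAN" (an assumed header; use whatever your files contain).
    """
    tagged = [df.assign(Source=name) for name, df in frames.items()]
    combined = pd.concat(tagged, ignore_index=True)
    combined = combined[combined[key].notna()].copy()  # ignore blank cells
    # Normalise so " abcde1234f" and "ABCDE1234F" count as the same value.
    combined["_key"] = combined[key].astype(str).str.strip().str.upper()
    files_per_key = combined.groupby("_key")["Source"].transform("nunique")
    result = combined[files_per_key > 1].sort_values(["_key", "Source"])
    return result.drop(columns="_key").reset_index(drop=True)
```

In practice each DataFrame would come from pd.read_excel on one of the three files, for example find_cross_file_duplicates({"one.xlsx": df1, "two.xlsx": df2, "three.xlsx": df3}, "PAN"). A value repeated only within a single file is not reported, since the scenario is about entries that appear in more than one file.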
Category 3: PDF Processing
Problem: Extract invoice totals from 200 PDF invoices
The scenario: You have 200 vendor invoices as PDFs. You need the invoice date, vendor name, and total amount from each one — exported to Excel for reconciliation. Doing this manually takes a full day.
Important note on PDF extraction: PDFs come in two types — text-based (where you can select and copy text) and image-based (scanned documents where text is a picture). pdfplumber works well on text-based PDFs. For scanned/image PDFs, you need OCR (Optical Character Recognition) — a more complex step. Always test on a sample of your actual PDFs first.
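The extraction step might be sketched as follows for text-based PDFs. Everything about the invoice layout here is an assumption: the label wording in the two patterns, the idea that the vendor name is the first line, and the rule that the last "Total" on the page is the invoice total. Real invoices vary, so treat the patterns as a starting point to adjust after looking at the text from a few of your own files.

```python
import re
from pathlib import Path

# Assumed label wording; adjust after inspecting text from your own PDFs.
TOTAL_RE = re.compile(
    r"\bTotal(?:[ \t]+Amount)?[ \t]*:?[ \t]*(?:Rs\.?|INR|₹)?[ \t]*(\d[\d,]*(?:\.\d{1,2})?)",
    re.IGNORECASE,
)
DATE_RE = re.compile(
    r"(?:Invoice[ \t]+)?Date[ \t]*:?[ \t]*(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})",
    re.IGNORECASE,
)


def parse_invoice_text(text):
    """Pull vendor, date and total out of one invoice's text (None where not found)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    date_match = DATE_RE.search(text)
    totals = TOTAL_RE.findall(text)
    return {
        "vendor": lines[0] if lines else None,  # assumption: vendor is the first line
        "date": date_match.group(1) if date_match else None,
        # assumption: the last "Total" on the invoice is the final amount
        "total": float(totals[-1].replace(",", "")) if totals else None,
    }


def extract_folder(folder):
    """Return one dict per PDF in `folder`; has_text=False suggests a scanned PDF."""
    import pdfplumber  # imported here so parse_invoice_text works without it

    rows = []
    for path in sorted(Path(folder).glob("*.pdf")):
        with pdfplumber.open(path) as pdf:
            text = "\n".join(page.extract_text() or "" for page in pdf.pages)
        row = {"file": path.name, "has_text": bool(text.strip())}
        row.update(parse_invoice_text(text))
        rows.append(row)
    return rows
```

The list returned by extract_folder can be written to Excel with pandas, for example pd.DataFrame(rows).to_excel("invoice_totals.xlsx", index=False), where the output name is only an example. Rows with has_text set to False, or with blank fields, are the ones to check by hand.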
Problem: Merge multiple PDFs into one document
The scenario: You need to combine a client's 12 monthly bank statements (12 separate PDFs) into one PDF for a bank loan application. Doing it manually in Adobe Acrobat costs a license fee and takes time.
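A minimal sketch of the merge using pypdf is given below. It assumes the statements are named with a month or sequence number (for example statement_1.pdf to statement_12.pdf, a hypothetical naming scheme) and orders them by that number; the function names and the output filename are illustrative.

```python
import re
from pathlib import Path


def statement_order(filenames):
    """Sort file names by the first number found in each name (e.g. month 1..12).

    Plain alphabetical sorting would put "statement_10.pdf" before
    "statement_2.pdf"; sorting on the number avoids that.
    """
    def key(name):
        match = re.search(r"\d+", name)
        return (int(match.group()) if match else float("inf"), name)

    return sorted(filenames, key=key)


def merge_pdfs(folder, output="Combined_Statements.pdf"):
    """Merge every PDF in `folder` into one file, in statement_order."""
    from pypdf import PdfWriter  # imported here so statement_order needs no extra library

    folder = Path(folder)
    names = statement_order(p.name for p in folder.glob("*.pdf") if p.name != output)
    writer = PdfWriter()
    for name in names:
        writer.append(str(folder / name))  # adds all pages of this file
    with open(folder / output, "wb") as handle:
        writer.write(handle)
    writer.close()
    return names
```

merge_pdfs returns the order it used, so printing that list is a quick way to confirm the months came out in sequence before sending the combined file.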
Category 4: Email Automation
Problem: Send 100 personalized client emails from an Excel list
The scenario: Monthly statement dispatch. You have an Excel file with columns: Client Name, Email, Outstanding Amount, Due Date. You need to send each client a personalized email with their specific figures.
Gmail setup required: To send via Gmail, you need to enable "App Passwords" in your Google Account settings (under Security → 2-Step Verification → App Passwords). Do not use your main Google password in this script — create an App Password specifically for this.
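The dispatch could be sketched with Python's built-in email tools as follows. The column names are the four listed in the scenario; the subject line, body wording, and function names are illustrative assumptions. The sketch defaults to a dry run that prints what would be sent instead of sending anything.

```python
import smtplib
from email.message import EmailMessage


def format_due_date(value):
    """Show dates as DD-MM-YYYY; anything that is not a date is shown as-is."""
    return value.strftime("%d-%m-%Y") if hasattr(value, "strftime") else str(value)


def build_message(row, sender):
    """Create one personalised email from a row with the four columns named above."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = row["Email"]
    message["Subject"] = f"Statement of account - {row['Client Name']}"  # illustrative wording
    message.set_content(
        f"Dear {row['Client Name']},\n\n"
        f"Our records show an outstanding amount of Rs. {row['Outstanding Amount']:,.2f}, "
        f"due on {format_due_date(row['Due Date'])}.\n\n"
        "Please let us know if you have any questions.\n"
    )
    return message


def send_all(rows, sender, app_password, dry_run=True):
    """Build every message; send through Gmail only when dry_run is False."""
    messages = [build_message(row, sender) for row in rows]
    if dry_run:
        for message in messages:
            print(f"Would send to {message['To']}: {message['Subject']}")
        return messages
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender, app_password)  # the App Password, not your main password
        for message in messages:
            server.send_message(message)
    return messages
```

The rows can come from the Excel list via pandas, for example pd.read_excel("clients.xlsx").to_dict("records"), where clients.xlsx is a hypothetical filename. Keeping the App Password in an environment variable rather than typing it into the script avoids saving it in a file that might be shared.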
Category 5: GST and Compliance
Problem: Validate GST numbers from a client list
The scenario: You receive a vendor list with 500 GST numbers from a client. Before processing GSTR-2A reconciliation, you need to validate that all the GST numbers are in the correct format (15-character alphanumeric with specific structure). Find the invalid ones immediately.
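A format check of this kind can be sketched with a regular expression. The layout encoded below is the commonly documented GSTIN structure and is stated as an assumption in the code comments. The sketch checks format only: it does not verify the check character, the state code range, or whether the number is actually registered.

```python
import re

# Layout assumed here (the commonly documented GSTIN structure):
#   2 digits  state code
#   10 chars  PAN of the business (5 letters, 4 digits, 1 letter)
#   1 char    entity number (1-9 or A-Z)
#   "Z"       fixed by default
#   1 char    check character (letter or digit)
GSTIN_RE = re.compile(r"[0-9]{2}[A-Z]{5}[0-9]{4}[A-Z][1-9A-Z]Z[0-9A-Z]")


def is_valid_gstin_format(value):
    """True if `value` has the 15-character GSTIN layout (format only, no checksum)."""
    return bool(GSTIN_RE.fullmatch(str(value).strip().upper()))


def find_invalid_gstins(values):
    """Return (row number, value) for every entry that fails the format check."""
    return [
        (row, value)
        for row, value in enumerate(values, start=1)
        if not is_valid_gstin_format(value)
    ]
```

Passing the GST number column of the vendor list to find_invalid_gstins gives the row numbers to send back to the client for correction.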
Problem: Validate PAN number formats in a list
The scenario: You receive a list of client PAN numbers and need to identify which ones are incorrectly formatted before submitting to the income tax portal.
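The same approach works for PANs. The sketch below checks only the basic layout of five letters, four digits, and one letter, and returns a short reason for each failure; the function names are made up for illustration, and it does not confirm that a PAN has actually been issued.

```python
import re

# Basic PAN layout: 5 letters, 4 digits, 1 letter (10 characters in total).
PAN_RE = re.compile(r"[A-Z]{5}[0-9]{4}[A-Z]")


def check_pan(value):
    """Return None if the format is fine, otherwise a short reason."""
    pan = str(value).strip().upper()
    if len(pan) != 10:
        return f"expected 10 characters, found {len(pan)}"
    if not PAN_RE.fullmatch(pan):
        return "expected 5 letters, 4 digits, then 1 letter"
    return None


def find_invalid_pans(values):
    """Return (row number, value, reason) for each badly formatted PAN."""
    problems = []
    for row, value in enumerate(values, start=1):
        reason = check_pan(value)
        if reason:
            problems.append((row, value, reason))
    return problems
```

Because each problem comes with a reason, the output can go straight into a correction list for the client.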
The Prompt Pattern for Getting Good Code
For every automation task, structure your Claude Code prompt like this:
- What files are being read — exact folder path and filename pattern
- What transformation to apply — specific logic, not vague goals
- What the output should be — exact filename and format
- What a preview looks like — ask to show changes before making them
- What error handling you want — what to do if a file is missing or a field is blank
The more specific you are about column names, file paths, expected formats, and what "done" looks like — the less back-and-forth you need.
Next: Your first automation — run it yourself, from scratch.