5  Automating Reporting Cycles

Chapter 4 connected AI to a database. You typed a question, the AI wrote SQL, and you got an answer. That works for one-off questions, but real work rarely stops at a single query.

A weekly sales report requires pulling data, charting it, writing a narrative, and assembling a deck. A quarterly review adds trend analysis, comparisons to targets, and an executive summary. Each step depends on the one before it. Doing this manually every week is exactly the kind of repetitive, multi-step work that AI can automate.

This chapter shows how agents chain those steps into a reporting pipeline — and how you can use your coding agent to build the whole thing.

5.1 From query to report

In Chapter 4, the Chinook query app answered questions one at a time. You asked “What are the top ten countries by revenue?” and got a table. Useful, but a stakeholder does not want a table — they want a slide deck with a chart, key findings, and a recommendation.

Getting from a raw query to a finished report means executing a sequence of steps: run the SQL, transform the results, generate a visualization, write a summary, and assemble the output document. That sequence is a pipeline, and automating it is what agents do.

5.2 What is an agent?

An agent is an AI system that pursues a goal through a loop of planning, executing, observing, and iterating. The distinction from a chatbot is autonomy. A chatbot handles a single turn: you ask, it answers, and you decide what to do next. An agent handles multiple steps: it breaks the task down, calls tools in sequence, checks the results, and adjusts its approach — all from a single instruction.

The coding agents you have been using since Chapter 1 are agents. When you asked Claude Code or Gemini CLI to “build a Streamlit app that queries the Chinook database,” it planned the file structure, wrote the code, ran it, noticed errors, and fixed them — autonomously. That is the agent loop in action.

5.3 The agent loop

Every agent follows the same four-step loop.

  1. Plan — break the task into steps.
  2. Execute — call tools, run code, write files.
  3. Observe — check the results of each action.
  4. Iterate — adjust the plan based on what was observed.

The loop repeats until the task is complete. A reporting pipeline is a natural fit: query the database (execute), check that the data looks right (observe), generate a chart (execute), verify the chart matches the data (observe), write a summary (execute), and so on.
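The four-step loop can be sketched in a few lines of Python. This is a minimal illustration, not a real agent framework: the tools here are plain callables, the plan is fixed up front, and the "iterate" step is reduced to a retry, where a real agent would let the LLM revise its arguments after each observation.

```python
# Minimal sketch of the plan -> execute -> observe -> iterate loop.
# Tools are plain Python callables; in a real agent the LLM chooses
# the tool and its arguments at each step.

def run_agent(plan, tools, max_attempts=3):
    """Run each planned step, check its output, and retry on failure."""
    results = []
    for step in plan:                                     # Plan: steps decided up front
        for attempt in range(max_attempts):
            output = tools[step["tool"]](**step["args"])  # Execute: call the tool
            if step["check"](output):                     # Observe: validate the result
                results.append(output)
                break
            # Iterate: a real agent would revise the arguments here
        else:
            raise RuntimeError(f"step '{step['tool']}' never passed its check")
    return results
```

A reporting pipeline maps directly onto this shape: each plan entry names a tool (query, chart, summarize, assemble) and a check on its output.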

5.4 How agents use tools

An agent does not just generate text. It decides which tool to call, formats the request, and interprets the result.

Tools can be database queries, file operations, API calls, or code execution. The tool-use protocol is the same regardless: the AI receives a goal, selects the right tool, formats the call, interprets the result, and decides whether the task is complete or another tool call is needed.
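The protocol can be sketched as a dispatch step: the model emits a tool call as structured JSON, and a harness looks up the matching callable, runs it, and feeds the result back into the loop. The registry below is illustrative, not any real framework's API.

```python
import json

# Sketch of tool dispatch: the model emits {"tool": ..., "args": {...}}
# as JSON; the harness runs the named callable and returns the result.

def dispatch(tool_call_json, registry):
    call = json.loads(tool_call_json)
    return registry[call["tool"]](**call["args"])

# Hypothetical registry: real entries would be a SQL runner, a chart
# renderer, an LLM client, and a deck builder.
registry = {
    "sql_query": lambda sql: f"(rows for: {sql})",  # stand-in for a real query
    "add": lambda a, b: a + b,
}
```

The agent loop calls `dispatch` once per step, inspects the return value, and decides whether the task is complete or another tool call is needed.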

For a reporting pipeline, the tools might be:

  • SQL query to pull revenue data from the database.
  • matplotlib to render a bar chart from the query results.
  • LLM call to generate a three-sentence executive summary from the data.
  • python-pptx to assemble a PowerPoint deck with a title slide, the chart, and the findings.

Four tools, one prompt. The agent orchestrates the entire pipeline.

5.5 The reporting pipeline

Here is the workflow, step by step.

Step 1: Query the database. The agent runs a SQL query against the Chinook database — the same kind of query the Chapter 4 app generates. The result is a pandas DataFrame: rows of data ready for analysis.
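A sketch of this step, assuming pandas and the standard-library sqlite3 module. It runs against a tiny in-memory table standing in for Chinook's invoice data (the table name, columns, and figures are illustrative); for the real pipeline, point the connection at chinook.db instead.

```python
import sqlite3
import pandas as pd

# Illustrative stand-in for chinook.db: use sqlite3.connect("chinook.db") in the pipeline.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (country TEXT, total REAL);
    INSERT INTO invoices VALUES ('USA', 523.06), ('Canada', 303.96), ('USA', 100.00);
""")

# Run the report query and land the rows in a DataFrame.
df = pd.read_sql_query(
    "SELECT country, SUM(total) AS revenue "
    "FROM invoices GROUP BY country ORDER BY revenue DESC",
    conn,
)
```

The DataFrame is the handoff point: every later step (chart, summary, deck) reads from it rather than going back to the database.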

Step 2: Render a chart. The agent passes the DataFrame to matplotlib (or Plotly) and generates a visualization — a bar chart, line chart, or whatever fits the data. The chart is saved as a PNG image.
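A matplotlib sketch of the charting step. The data values and file name are illustrative; in the pipeline they would come from the Step 1 DataFrame. The Agg backend renders straight to a file, which is what you want when the code runs headless inside an agent.

```python
import matplotlib
matplotlib.use("Agg")          # headless backend: render straight to a file
import matplotlib.pyplot as plt

# Illustrative data; in the pipeline these come from the query DataFrame.
countries = ["USA", "Canada", "France"]
revenue = [623.06, 303.96, 195.10]

fig, ax = plt.subplots(figsize=(8, 4.5))
ax.bar(countries, revenue)
ax.set_ylabel("Revenue (USD)")
ax.set_title("Revenue by Country")
fig.tight_layout()
fig.savefig("revenue_by_country.png", dpi=150)   # saved PNG feeds the deck step
plt.close(fig)
```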

Step 3: Generate a narrative. The agent sends the data to an LLM and asks for a concise executive summary. The prompt includes the numbers so the summary is grounded in facts, not hallucinated.
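A sketch of the narrative step against OpenRouter's OpenAI-compatible chat endpoint, using only the standard library. The model name is a placeholder, and `summarize` needs an OPENROUTER_API_KEY environment variable to actually run; the key point is `build_summary_prompt`, which embeds the real numbers so the summary is grounded.

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_summary_prompt(rows):
    """Embed the actual numbers so the summary cites facts, not guesses."""
    lines = "\n".join(f"{name}: {value:,.2f}" for name, value in rows)
    return (
        "Write a three-sentence executive summary of this revenue data. "
        "Reference the specific numbers.\n\n" + lines
    )

def summarize(rows, model="openai/gpt-4o-mini"):   # model name is a placeholder
    """POST to OpenRouter's OpenAI-compatible chat endpoint."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": build_summary_prompt(rows)}],
    }).encode()
    req = urllib.request.Request(
        OPENROUTER_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```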

Step 4: Assemble the deck. The agent uses python-pptx to build a three-slide PowerPoint: a title slide with the report name and date, a slide with the chart, and a slide with the key findings. The deck is saved as a .pptx file.

Step 5: Deliver. In a Streamlit app, the chart and summary appear on screen, and a download button lets the user grab the PowerPoint file. The same pattern from Chapter 4 — Streamlit as the web wrapper — but the output is a document instead of a table.

Each step is straightforward. The value of the agent is chaining them together so the user clicks one button and gets a finished report.

5.6 Building it with your coding agent

You do not need to write this app by hand. Your coding agent can build it for you. Here is an example prompt you can give it:

  Build a Streamlit app that connects to chinook.db. Add a dropdown with five pre-defined reports: Revenue by Country, Monthly Revenue Trend, Top 10 Artists, Genre Breakdown, and Customer Spending. When the user selects a report and clicks Generate, run the SQL query, create a chart with matplotlib, call OpenRouter to write a three-sentence executive summary, and assemble a three-slide PowerPoint (title, chart, findings) using python-pptx. Show the chart and summary on screen with a download button for the PowerPoint file. Log every LLM request and response to llm_log.jsonl with timestamps.

This prompt is specific enough that a coding agent can execute it end to end. It names the database, the reports, the tools (matplotlib, python-pptx, OpenRouter), the output format, and the logging requirement.

Your job is not to write the code. Your job is to test the result, refine the prompt, and iterate until the output meets your standards. If the charts are ugly, tell the agent to improve the styling. If the summaries are too generic, tell it to include specific numbers. If a report query is wrong, describe the correct logic. Prompt engineering is the skill; the agent is the labor.

Notice the last line of the prompt: logging. Every LLM request and response is written to a JSONL file with timestamps. This creates an audit trail — you can see exactly what the AI was asked and what it said. Chapter 7 will use these logs for verification and governance.
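The logging itself is a few lines. This sketch appends one JSON object per line, which is all JSONL is; the field names are illustrative, and a real pipeline would also record the model name and token counts.

```python
import json
from datetime import datetime, timezone

def log_llm_call(prompt, response, path="llm_log.jsonl"):
    """Append one request/response record per line: the audit trail."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "response": response,
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_llm_call("Summarize Q3 revenue.", "Revenue rose 12% quarter over quarter...")
```

Because each line is a complete JSON object, the log can be tailed, grepped, or loaded into pandas without parsing the whole file.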

5.7 Sandboxed execution

When AI runs code, you need to control where it runs. On your laptop during development, a coding agent runs in your environment with access to your files. That is fine for prototyping. In production, you want isolation.

Built-in sandboxing

Codex includes OS-level sandboxing by default — it runs generated code in an isolated environment even on your laptop. Claude Code and Gemini CLI rely on permission prompts to control what the agent can do, but do not sandbox execution. For production deployment, all three benefit from the container-based isolation described below.

A container (Docker is the standard) is a disposable, isolated computing environment. Think of it as a virtual computer that exists only for one task and is destroyed afterward. It can only access what you explicitly allow. A bug in the AI-generated code cannot affect your other files, your database, or your network.

The sandbox pattern for production: a user asks for a report through a web interface, the agent plans and orchestrates, code runs inside a sandboxed container, the container queries the database through a read-only connection, and the finished report flows back to the user. The user never touches the database. The agent never escapes the sandbox. This is defense in depth.
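One way this pattern looks in practice, sketched as a Docker invocation. The image name and script are hypothetical; the flags are the point. A real deployment would keep the LLM endpoint reachable rather than cutting the network entirely, and would use read-only database credentials alongside the read-only mount.

```shell
# Disposable (--rm), no network, immutable filesystem except /tmp and /out,
# database mounted read-only. "report-runner" and generate_report.py are
# hypothetical names for the pipeline image and entry point.
docker run --rm \
    --network none \
    --read-only --tmpfs /tmp \
    -v "$PWD/chinook.db:/data/chinook.db:ro" \
    -v "$PWD/out:/out" \
    report-runner python generate_report.py
```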

5.8 From prototype to production

The reporting app on your laptop is a prototype. Moving it to production adds infrastructure, not intelligence.

Production adds SSO authentication so only authorized users can generate reports, scheduled runs so weekly reports are generated automatically, logging so every query and LLM call is recorded for compliance, and read-only database credentials so the agent cannot modify data.

The reporting pipeline you prototyped is the same artifact that powers the production tool. IT wraps it in infrastructure; you provide the domain knowledge and the prompt.

5.9 Exercises

Exercise 1. Open your coding agent in a folder containing the Chinook database and your SKILL.md from Chapter 4. Give it a single compound instruction that requires at least three steps.

For example: “Query the Chinook database for the top ten artists by revenue. Create a bar chart. Write a one-paragraph executive summary. Save the chart and summary to a reports folder.”

Observe the agent loop. How many distinct tool calls does the agent make? Does it check its own work? If the output is not right, refine your prompt and try again.

Exercise 2. Use the example prompt from the “Building it with your coding agent” section to build the reporting app. Give the prompt to your coding agent and let it generate the app.

Test all five reports. For each, verify that the SQL returns correct data, the chart matches the data, the executive summary references actual numbers, and the PowerPoint has three slides.

If anything is wrong, refine your prompt and regenerate. Keep notes on what you changed and why.

Exercise 3. Consider the task: “Prepare the quarterly business review for the CEO.”

List the five to seven steps a human analyst would take to complete this task. For each step, identify what tool the agent would use and what data it would need. Mark which steps need human approval before the agent continues.

Then write a single prompt that describes the full workflow for an agent.

Exercise 4. Adapt the reporting pattern to a dataset from your work. Pick a database, spreadsheet, or API that you query regularly.

Write an agent prompt that generates a report from your data — including the SQL (or data-fetching logic), chart type, summary style, and output format. Give the prompt to your coding agent and test the result.

If you cannot use real work data, use a public dataset: the Northwind database, NYC taxi data, or any CSV from your field.

Exercise 5. Review your team’s recurring tasks and identify five that are multi-step, repetitive, and data-driven — good candidates for agent automation.

For each, describe the trigger (what starts the task), the steps (what happens), the output (what gets delivered), and the frequency (how often it runs). Rate each on automation potential (high, medium, or low) and risk level (high, medium, or low).

Pick the highest-potential, lowest-risk workflow as your pilot candidate and write one paragraph explaining why you would start there.