Writing the AI Playbook: How I Built Security Guidelines That Engineers Actually Follow

By Max Clarke
I wasn't planning to write an AI security policy.
Our company already had data privacy policies and a reasonable "use common sense" approach to AI tooling. That's fine - right up until common sense hasn't caught up with what the tools can actually do.
The moment that changed my thinking was watching someone demo an agentic browser. They showed it navigating GitHub, reading code, figuring things out autonomously. It was impressive. It was also logged into company accounts. And I wasn't the only one who noticed - my manager and one of my team members clocked it at the same time: nobody had thought through what this means.
We talk a lot about prompt injection - the adversarial case where someone deliberately tricks an AI agent. That's a real risk. But there's an equally important one that gets less attention: agent misunderstanding. An AI tool doing something perfectly reasonable but wrong, with access to things it probably shouldn't have. No malice required. Just a well-intentioned agent, too many permissions, and data flowing in directions nobody anticipated.
Why "Use Common Sense" Isn't Enough
The problem with AI tooling isn't that people are careless. It's that the risk patterns are genuinely new, and they aren't intuitive yet.
If you've been working in software for a decade, you have good instincts about credentials, about what's sensitive, about what belongs where. But AI tools introduce a category of risk that those instincts don't cover - especially as we move from simple chat assistants to agents that can take actions across multiple systems.
I started thinking about these risks in layers:
Things you put into prompts
What you let AI tools do
When safe tools combine dangerously
The first layer is obvious. Don't paste your database password into ChatGPT. Most engineers already know this, and a "use common sense" policy covers it fine.
The second layer is where things get interesting. Agentic browsers that are logged into your company accounts can read anything you can read. Desktop AI tools with "Computer Use" permissions can see your screen, move your mouse, and type on your keyboard. These are powerful capabilities with huge blast radii, and the implications aren't obvious unless you stop and think about them.
The third layer is subtler. Even when you're using approved tools, individually safe tools can combine into something dangerous. Two MCPs that are each fine on their own - say, an HR system connector and a Slack connector - create an unintended data path when they're both available to the same agent. Salary data that's properly locked down in your HR system can suddenly flow into a Slack channel because the agent was trying to be helpful.
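That combination risk can be made concrete with a small audit check. Here's a minimal sketch (the tool names and categories are hypothetical, purely for illustration): any agent that can read from a sensitive source and write to an external sink has a data path worth reviewing, even if every individual tool is approved.

```python
# Sketch: flag risky combinations of individually-approved tools.
# Tool names and categories below are hypothetical examples.

SENSITIVE_SOURCES = {"hr_system": "salary data", "crm": "customer PII"}
EXTERNAL_SINKS = {"slack": "chat messages", "email": "outbound mail"}

def risky_combinations(enabled_tools):
    """Return (source, sink) pairs that create an unintended data path
    when both tools are available to the same agent."""
    return [
        (src, sink)
        for src in enabled_tools if src in SENSITIVE_SOURCES
        for sink in enabled_tools if sink in EXTERNAL_SINKS
    ]

# An HR connector and a Slack connector are each fine alone,
# but together they form a path from salary data to a chat channel.
print(risky_combinations(["hr_system", "slack"]))  # [('hr_system', 'slack')]
```

The point of a check like this isn't precision - it's that the risk lives in the pair, so any review process has to look at combinations, not just individual tools.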
This is the stuff that "common sense" doesn't cover, because it's not common knowledge yet.
What the Guidelines Actually Cover
I won't reproduce the full document here, but the structure followed the risk layers above. A few of the key principles:
Approved tools only. This sounds obvious, but the specifics matter. A free plan doesn't make a tool approved - if there's no enterprise agreement, there's no data handling guarantee. And you use approved tools through company accounts, not personal ones.
Treat every prompt as company data. No PII unless the tool is specifically designed for it. No secrets, ever. This applies even when the tool is approved - the issue isn't whether you trust the tool, it's whether the prompt might end up somewhere unexpected.
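The "no secrets, ever" rule can be partially automated. A minimal sketch of a pre-prompt scrubber (the patterns here are illustrative, not exhaustive - a real secret scanner covers far more cases):

```python
import re

# Sketch: scrub obvious secret patterns from text before it goes into
# a prompt. Illustrative patterns only - not a substitute for a real
# secret scanner.
SECRET_PATTERNS = [
    re.compile(r"(?i)(api[_-]?key|token|password)\s*[=:]\s*\S+"),
    re.compile(r"postgres://\S+:\S+@\S+"),  # connection string with creds
]

def redact(text: str) -> str:
    """Replace anything that looks like a secret with a placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

print(redact("password=hunter2 in the config"))  # [REDACTED] in the config
```

Tooling like this catches the obvious cases; the policy exists for everything the regexes miss.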
MCP connections are allowed, but controlled. Only official MCPs from the vendor itself. No third-party MCP servers. And critically: be the human in the loop. Approve each action, especially when the agent has access to multiple MCPs. The combination is where the risk hides.
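"Be the human in the loop" translates into a simple pattern: every tool call passes through an approval gate before it executes. A minimal sketch (names are hypothetical - real MCP clients expose their own approval hooks):

```python
# Sketch: a human-in-the-loop gate around agent tool calls.
# Hypothetical names; real MCP clients have their own approval hooks.

def ask_human(tool: str, action: str, payload: str) -> bool:
    """Prompt the human before the agent touches anything."""
    answer = input(f"Agent wants {tool}.{action} with {payload!r}. Allow? [y/N] ")
    return answer.strip().lower() == "y"

def guarded_call(tool, action, payload, call_fn, ask=ask_human):
    """Run a tool call only if the human approves it."""
    if not ask(tool, action, payload):
        raise PermissionError(f"Human rejected {tool}.{action}")
    return call_fn(payload)
```

The gate matters most exactly where the previous point warned: when the same agent holds several MCPs, the human approval step is the only place the cross-tool data path becomes visible.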
Agentic browsers are strongly discouraged. If you must use one, don't be logged into accounts with access to PII or proprietary data. The data exfiltration surface area is just too large.
No "Computer Use" permissions. Full stop. Giving an agent control of your desktop means it can access anything you can access - your password manager, your email, your local files. The risk profile is extreme.
The guidelines shouldn't stifle innovation either. Engineers need time to explore new AI tools and recommend them for broader use. The guardrails for experimentation are simple - no company data, no company accounts - so people can try things safely and bring the best tools forward through the approval process.
Making It Stick
Writing the guidelines was the easy part. The hard part was getting people to actually internalise them.
I presented the guidelines to engineering, product, and design - a mixed audience with very different levels of AI literacy. Standing up and walking through a policy document is a surefire way to watch a room check their phones. So I tried a different approach.
I leaned into humour and personality. I had a voxel-art version of me (generated by Leonardo.Ai, naturally) as a recurring character in the slides. I used real-world examples rather than abstract threat descriptions. And I built a tool.
I used Claude Code to build an interactive scenario visualiser - a step-through walkthrough that shows exactly how an agent interacts with tools, where data flows, and where things go wrong. You pick a scenario, step through it click by click, and watch the danger unfold in real time.
One of the scenarios from that visualiser: credentials pasted into a prompt get posted to a public channel.
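A minimal sketch of how such a step-through scenario might be modelled (the structure and names are hypothetical, not the actual visualiser's code; the scenario content mirrors the example above):

```python
from dataclasses import dataclass

# Sketch: a step-through scenario as data. Hypothetical structure,
# not the actual visualiser's implementation.

@dataclass
class Step:
    actor: str            # who acts: "user", "agent", or a tool name
    action: str           # what happens at this step
    danger: bool = False  # does data flow somewhere it shouldn't?

CREDENTIALS_IN_PROMPT = [
    Step("user", "pastes a config file, password included, into the prompt"),
    Step("agent", "summarises the config, trying to be helpful"),
    Step("agent", "posts the summary, credentials and all, to a public channel",
         danger=True),
]

def first_danger(scenario):
    """Return the index of the first dangerous step, or None."""
    return next((i for i, s in enumerate(scenario) if s.danger), None)
```

Modelling scenarios as plain data like this is what makes a click-by-click walkthrough cheap to build: the UI just renders one step at a time and highlights the dangerous ones.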
The MCP combination scenario was the one that landed hardest. People could immediately see how two tools they were already using - individually safe, individually approved - could combine into a real problem. That "oh, I hadn't thought of that" reaction was exactly what I was aiming for.
How I Know It Worked
The real signal wasn't the positive reception in the presentation (though that helped). It was what happened afterward in Slack.
People started correcting each other. Someone would mention using a tool in a way that conflicted with the guidelines, and another team member would chime in with a reference to the policy. I saw it in the AI tools guild I run, in team channels, across engineering generally. That kind of peer enforcement is worth more than any top-down mandate. It means the concepts actually landed - people understood why the rules exist, not just what they are.
The guidelines are a living document, though the core principles haven't changed much. The approved tool list has evolved as new tools are identified and evaluated. The framework was intentionally durable: it's organised around access patterns and risk categories rather than specific tool names, so it doesn't need rewriting every time a new AI product launches.
Why It's Worth Doing
If you're an engineering manager thinking "we should probably have AI guidelines but it feels like overhead" - just bite the bullet.
This is such a new space, with such wide-open access patterns, that guidelines are needed here more than in almost any other area of engineering practice. The cost of not having them isn't a spectacular security breach (though it could be). It's a slow accumulation of bad habits: credentials in prompts, unapproved tools with company data, agents with permissions nobody audited.
The good news is that the guidelines don't need to be complicated. They need to be clear, they need real examples, and they need to make the invisible risks visible. If your team walks away understanding why two safe MCPs can be dangerous together, you've done most of the work.
Build the playbook. Present it with real scenarios. Make it a living document. And then watch your team start policing it themselves.
This post is part of a series on building AI-augmented engineering teams. Next up: how I took a native engineering team from zero to agentic.