The model isn't the product: what a CISO should evaluate before buying AI-driven offensive security

For years, buying offensive security was a binary decision. You either hired a team of human pentesters, with everything that means in time and cost, or you settled for an automated scanner that spat out false positives. Black or white. That era is over.

Offensive AI now finds and exploits real vulnerabilities (the kind that earn bounty payouts) at a speed and scale that manual pentesting can't match. It has gone from a lab promise to an available capability. And that forces any security leader to rethink not just who they hire, but how they buy offensive security.

If I were on the other side of the table as a CISO, deciding who I'd trust with my company's attack surface, that reality wouldn't lead me to a simple conclusion. It would lead me to a better question.

Two valid paths, not two sides

Today there are at least two legitimate options, and it's worth not confusing them.

The first is the traditional pentest firm that boosts its pentesters with AI tools and licenses for models like Claude. This isn't empty marketing: a good pentester with a copilot like Claude reasons better about code, responses, and hypotheses, and gains ground on the two constraints that have always limited manual pentesting: time and coverage. They cover more surface in fewer days. It's a genuine improvement on the classic model.

The second option isn't to hand a model to a human and trust their judgment. It's to build an architecture and a value model around the LLM, so that the model isn't the product but one more component, governed by the system that surrounds it. This is Strike's approach, though the distinction that matters isn't whether AI is used but where control lives.

They aren't on opposing sides. They are different paths. And this doesn't invalidate the work of the AI-augmented pentester; in fact, the best systems combine humans, models, and controls. The question isn't "who uses more AI" (both do), but where the operational guarantee lives.

What "equipping a pentester with Claude" looks like today: skills

It's worth grounding what it means, in 2026 practice, to boost a pentester with a tool like Claude. Today it means, above all, skills: modules that package instructions, scripts, and knowledge for a specific task. Reconnaissance, testing for a type of vulnerability, generating a report. It's how the sector is extending models, and it's genuinely good: it encapsulates repeatable experience and puts it one command away.

The key is in how the model decides which skill to use. Claude doesn't load every skill at once; it uses progressive disclosure. At startup it only sees the name and description of each skill (barely a hundred tokens per skill). With that metadata, and nothing more, it judges whether a skill is relevant to the task. Only when it "believes" it applies does it load the body of instructions; and only then, the scripts and files that skill references. It's elegant: it keeps context light and scales to hundreds of capabilities.
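To make the two-stage loading concrete, here is a minimal sketch of progressive disclosure. It is an illustration, not Claude's actual implementation: the `Skill` class, `startup_context`, `trigger`, and the toy `keyword_judge` are all hypothetical names, and `keyword_judge` stands in for what is really a probabilistic model judgment.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Skill:
    name: str
    description: str                 # the only text visible at startup
    load_body: Callable[[], str]     # full instructions, fetched on demand
    body: Optional[str] = None       # stays None until the skill is triggered


def startup_context(skills: list[Skill]) -> str:
    """Build the lightweight index seen first: names and descriptions only."""
    return "\n".join(f"{s.name}: {s.description}" for s in skills)


def trigger(skills: list[Skill], task: str,
            is_relevant: Callable[[str, str], bool]) -> list[Skill]:
    """Load bodies only for skills judged relevant from their description.

    `is_relevant` stands in for the model's judgment (an assumption of this
    sketch); in a real system that call is probabilistic, not a rule.
    """
    chosen = []
    for skill in skills:
        if is_relevant(task, skill.description):
            skill.body = skill.load_body()   # second stage: body enters context
            chosen.append(skill)
    return chosen


def keyword_judge(task: str, description: str) -> bool:
    """Toy judge: relevant if any description word appears in the task."""
    task_words = set(task.lower().split())
    return any(word in task_words for word in description.lower().split())


skills = [
    Skill("recon", "enumerate subdomains and endpoints", lambda: "RECON BODY"),
    Skill("idor", "test object references for idor", lambda: "IDOR BODY"),
]
loaded = trigger(skills, "check the invoices api for idor", keyword_judge)
```

Note that everything downstream depends on `is_relevant`. Swap the toy judge for a model call and the selection step stops being something you can unit test.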

But that's exactly where the crack is. That initial decision ("does this skill apply to what I'm doing?") is a model's judgment about a one-line description, not a deterministic rule. With few skills it works well. With many, the descriptions compete and overlap, and the model may fail to trigger the right skill, trigger the wrong one, or overlook it entirely. And it does so silently: there's no exception, no log saying "I ignored the skill that tested for IDOR in the request body." Worse still, as context accumulates, the fidelity with which the model follows each instruction degrades (so-called context rot): the information is still "there," but it carries less weight.

For a pentest, that crack is precisely the dangerous one. The model skipping the skill that validated a vector, or the control that bounded the scope, doesn't produce an obvious failure: it produces an incomplete test that looks complete. And that's where the underlying question appears.
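One way to turn a silent skip into a loud failure is to declare the required checks in code and refuse to close the run until each one has reported. The sketch below assumes a hypothetical `CoverageLedger`; the check names are invented for illustration.

```python
class CoverageError(Exception):
    """Raised when a required check never ran."""


class CoverageLedger:
    """Tracks which required checks actually executed during a run."""

    def __init__(self, required: set[str]):
        self.required = set(required)
        self.executed: dict[str, str] = {}   # check name -> outcome

    def record(self, check: str, outcome: str) -> None:
        """Register that a check ran, whatever its result."""
        if check not in self.required:
            raise ValueError(f"unknown check: {check}")
        self.executed[check] = outcome

    def missing(self) -> set[str]:
        """Required checks that have not reported yet."""
        return self.required - set(self.executed)

    def close(self) -> dict[str, str]:
        """Finish the run; refuse to produce a report if anything was skipped."""
        skipped = self.missing()
        if skipped:
            raise CoverageError(f"checks never ran: {sorted(skipped)}")
        return dict(self.executed)


# Hypothetical check names for one engagement.
ledger = CoverageLedger({"idor_path", "idor_body", "scope_check"})
ledger.record("idor_path", "not vulnerable")
ledger.record("scope_check", "passed")
# ledger.close() would raise here: "idor_body" never ran.
```

The point of the design is that "not vulnerable" and "never tested" become different states, and only the first one can reach the report.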

The question that really matters: who controls the flow?

When a significant part of operational control is delegated to the model, the prompt, or natural-language instructions, a fragility appears that a CISO should look at carefully. It's not a problem of the pentester's talent or which model is chosen: it's a matter of where the guarantee resides.

How do you ensure the LLM doesn't step outside the authorized scope? That it doesn't exceed the requests-per-second rate negotiated with the client? That it doesn't trigger a destructive action on a resource it shouldn't have touched? If the answer to all of that lives inside natural-language instructions, then, when something goes wrong, the uncomfortable question is: how do you debug a prompt error?

You don't debug it the way you debug code. A language model is non-deterministic by design: the same prompt can behave differently on two consecutive runs. There's no breakpoint, no stack trace, no test that reliably reproduces the failure. In most disciplines that's an inconvenience. In pentesting, where a single out-of-scope request or a traffic spike caused by one badly propagated variable can take down a client's production system or cut off an entire team's connectivity, it's a serious operational risk.

The model isn't the product: the architecture is

This is where moving control outside the model changes the equation. The core idea is deliberately unglamorous: surround the model with a deterministic architecture, and use the LLM only where it adds real value (reasoning, interpreting, prioritizing), not as the pilot that makes every decision.

In practice, that means a staged process where each step runs tools in a fixed, reproducible, testable order, and the model comes in as a perceiver that interprets results, not as a loose agent that decides what to attack and how. The scope isn't "asked of" the model in a paragraph: it's validated at a deterministic gate before a single request fires. The request rate doesn't depend on the LLM "remembering" not to exceed it: the system enforces it. Destructive actions aren't left to the prompt's discretion: they're protected by guards in the code.
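As a rough sketch of what a deterministic gate and a code-level guard could look like, the example below checks every outgoing request against an allowlist and blocks state-changing HTTP methods unless explicitly enabled. The host names, the `ALLOWED_HOSTS` set, the choice of which methods count as destructive, and the function names are assumptions made for illustration, not a description of any specific product.

```python
from urllib.parse import urlsplit


class OutOfScope(Exception):
    """The target host is not in the authorized scope."""


class DestructiveActionBlocked(Exception):
    """A state-changing method was attempted without explicit approval."""


# Hypothetical engagement scope; a real one would come from the signed contract.
ALLOWED_HOSTS = {"app.example.test", "api.example.test"}
# Assumption of this sketch: these methods are treated as destructive.
DESTRUCTIVE_METHODS = {"DELETE", "PUT", "PATCH"}


def check_scope(url: str) -> str:
    """Return the host if it is in scope, otherwise raise before anything is sent."""
    host = urlsplit(url).hostname   # already lowercased; None if the URL has no host
    if host is None or host not in ALLOWED_HOSTS:
        raise OutOfScope(f"refusing request to {host!r}")
    return host


def guard_request(method: str, url: str,
                  allow_destructive: bool = False) -> tuple[str, str]:
    """Gate every request: scope first, then the destructive-method guard."""
    check_scope(url)
    verb = method.upper()
    if verb in DESTRUCTIVE_METHODS and not allow_destructive:
        raise DestructiveActionBlocked(f"{verb} requires explicit approval")
    return verb, url
```

Whatever the model proposes, a request only leaves the system if `guard_request` returns. A lookalike host such as `app.example.test.evil.example` fails the exact-match check no matter how the prompt was worded.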

The practical rule is simple: operational control shouldn't live in the model's head. An agent shouldn't remember the scope; it should consult a signed policy before each request. It shouldn't try not to make too many requests; it should have a rate limiter external to the model. It shouldn't decide whether a finding is reportable; it should pass through a reproducible validator.
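A minimal sketch of the first two rules, under stated assumptions: the policy is a small JSON document authenticated with HMAC-SHA256 (a shared-key stand-in for a real signature scheme), and the rate limit is a token bucket that takes the current time as an argument so it can be tested. `sign_policy`, `load_policy`, `TokenBucket`, and the policy fields are hypothetical.

```python
import hashlib
import hmac
import json


class PolicyTampered(Exception):
    """The policy does not match its signature."""


def sign_policy(policy: dict, key: bytes) -> str:
    """Sign a canonical JSON encoding of the policy with HMAC-SHA256."""
    payload = json.dumps(policy, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def load_policy(policy: dict, signature: str, key: bytes) -> dict:
    """Return the policy only if the signature verifies."""
    if not hmac.compare_digest(sign_policy(policy, key), signature):
        raise PolicyTampered("policy signature mismatch")
    return policy


class TokenBucket:
    """Rate limiter that lives outside the model."""

    def __init__(self, rate_per_sec: float, burst: int, now: float = 0.0):
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = now

    def allow(self, now: float) -> bool:
        """Refill by elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical key and policy fields for one engagement.
KEY = b"engagement-secret"
POLICY = {"hosts": ["api.example.test"], "max_rps": 2}
SIGNATURE = sign_policy(POLICY, KEY)

policy = load_policy(POLICY, SIGNATURE, KEY)
bucket = TokenBucket(rate_per_sec=policy["max_rps"], burst=2)
```

In production the caller would pass `time.monotonic()` as `now`. The agent never sees the limit as an instruction; it just gets `False` back when it asks too fast.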

And when human judgment is irreplaceable, the system doesn't improvise: it hands off control at explicit handoff points, does 95% of the work automatically, and lets the person touch only the 5% where the model underperforms. Precision over apparent autonomy.
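The handoff idea can be sketched as a pipeline that runs stages in a fixed order and pauses only where a stage explicitly flags that it needs a person. `StageResult`, `run_pipeline`, and the `needs_human` flag are hypothetical names for this illustration; how a real stage decides to raise the flag is outside the sketch.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StageResult:
    stage: str
    output: str
    needs_human: bool = False     # set by the stage at an explicit handoff point
    decided_by: str = "system"    # who produced the final output


Stage = Callable[[str], StageResult]


def run_pipeline(stages: list[Stage], target: str,
                 ask_human: Callable[[StageResult], str]) -> list[StageResult]:
    """Run stages in order; hand control to a person only when a stage asks."""
    trail: list[StageResult] = []
    for stage in stages:
        result = stage(target)
        if result.needs_human:
            decision = ask_human(result)   # blocks until the reviewer answers
            result = StageResult(result.stage, decision, decided_by="human")
        trail.append(result)
    return trail
```

The returned trail records, per stage, whether the system or a person made the call, which is what makes the split between automated and human work auditable afterwards.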

That mix of deterministic architecture plus human judgment at the edges is what makes it possible to separate reasoning into auditable stages and confirm each finding with a deterministic validator and a reproducible exploitation proof before reporting it.
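Here is one hedged way to picture a deterministic validator: each finding carries its proof as data, and the validator replays that proof several times before the finding is allowed into the report. The `Proof` and `Finding` shapes, the marker-in-body criterion, and the injected `send` function are assumptions of this sketch; a real validator would use richer evidence.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Proof:
    method: str
    url: str
    expected_marker: str   # text that must appear in the response body


@dataclass(frozen=True)
class Finding:
    title: str
    proof: Proof


# `send(method, url) -> response body`; injected so the validator is testable.
Sender = Callable[[str, str], str]


def validate(finding: Finding, send: Sender, attempts: int = 3) -> bool:
    """Confirm a finding only if every replay of its proof shows the marker."""
    for _ in range(attempts):
        body = send(finding.proof.method, finding.proof.url)
        if finding.proof.expected_marker not in body:
            return False
    return True


def reportable(findings: list[Finding], send: Sender) -> list[Finding]:
    """Keep only findings whose proof reproduces."""
    return [f for f in findings if validate(f, send)]
```

Because the proof is plain data, the client can re-run the same replay on their side, and a finding the model merely "believes in" never gets past `reportable`.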

The architecture in practice: encoding operational know-how

But a solid architecture, on its own, isn't enough either. In offensive security, flow control only has value if it's built on real subject-matter knowledge. And there's an important difference here: Strike isn't trying to build AI offensive security from a blank page, or from a set of isolated prompts. It's building on years of experience doing pentesting in the market, with thousands of hours of testing across very different companies, applications, and contexts: banking, fintech, e-commerce, retail, SaaS, healthcare, energy, telecommunications, and internal platforms.

That track record gives the system something a model alone doesn't have: direction. Behind every architecture decision there's a hacking team with experience in exploitation, validation, triage, reporting, and discussing impact with real clients; and engineering and AI teams analyzing executions, branches, and results to improve the agent continuously.

All of that lives inside a platform that sustains the operation: it orders executions, centralizes evidence, enables traceability, connects the agent's work with human judgment, and helps findings reach the client with context, validation, and control.

That intersection of hacking skills, security engineering, prompt engineering, AI, platform, and operational experience is what turns architecture into product. It's not just about automating offensive tasks; it's about encoding Strike's accumulated know-how into rules, validators, flows, and criteria that the system can repeat with control.

What, as a CISO, I'd have to guarantee to my business

This is what ultimately tips the scale. Errors are going to happen (no honest vendor promises otherwise). The difference is that with a deterministic architecture around the model I know where the agent can fail and where the human fails, and that's exactly what I can pass on as a guarantee to my business:

Scope: it doesn't break, because it's validated in code before each action, not in a paragraph of instructions.

Load: the request rate is enforced by the architecture, not the model's memory.

Reproducibility: a finding comes with a proof I can re-run; an error comes with a point in the flow I can inspect and fix.

Decisions based on each application's architecture: SPAs, APIs, GraphQL, different authentication schemes (each with its own rules) are handled with explicit logic, not with the hope that the model "gets it."
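As an illustration of "explicit logic" for that last point, the plan for a target can be a lookup keyed by application type and authentication scheme rather than something the model infers. All step names and enum members below are hypothetical placeholders, not an actual methodology.

```python
from enum import Enum


class AppKind(Enum):
    SPA = "spa"
    REST_API = "rest_api"
    GRAPHQL = "graphql"


class AuthScheme(Enum):
    SESSION_COOKIE = "session_cookie"
    BEARER_JWT = "bearer_jwt"


# Hypothetical step names; a real playbook would map each to a tested tool run.
APP_STEPS = {
    AppKind.SPA: ("render_and_crawl", "map_client_routes"),
    AppKind.REST_API: ("enumerate_endpoints", "test_object_level_authz"),
    AppKind.GRAPHQL: ("check_introspection", "test_field_level_authz"),
}
AUTH_STEPS = {
    AuthScheme.SESSION_COOKIE: ("check_cookie_flags", "test_csrf"),
    AuthScheme.BEARER_JWT: ("check_token_validation", "test_token_expiry"),
}


def build_plan(kind: AppKind, auth: AuthScheme) -> tuple[str, ...]:
    """Return the fixed, ordered plan for one target profile.

    A profile with no entry raises KeyError instead of falling back to a guess.
    """
    return APP_STEPS[kind] + AUTH_STEPS[auth]
```

Adding support for a new architecture then means adding a reviewed entry to the table, which is a code change that can be diffed and tested.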

An AI-boosted pentester is, without a doubt, better than a pentester without it. The difference isn't in who uses AI, but in where the guarantee lives: whether in a person's judgment trusting a non-deterministic box, or in an architecture that surrounds the model. The latter gives me something different: flow control at every stage of the pentest, traceability when something breaks, and fixed rules that don't depend on the model having a good day.

That's why, if I were the CISO, I'd lean toward a vendor built on this philosophy. Not for using AI, but for an underlying idea: the next generation of offensive security isn't won with better prompts, but with architectures that turn probabilistic reasoning into controlled, auditable, reproducible processes. In offensive security, the model can't be the product. The product is control.


By Aldo Chabur, Sr Pentest Operations Specialist
