A penetration test has a beginning, a middle, and an end. The end is the report. For most clients, it is the only tangible output they receive. Not the methodology. Not the terminal sessions. Not the conversations along the way. The report.
And yet, as an industry, we don’t treat it like a product. We treat it like a formality. A receipt, even.

What the Report Actually Needs to Do
Six months after a test concludes, nobody remembers the debrief call. The tester has moved on to the next engagement. The client’s security team has rotated priorities three times. Staff involved in the test have left and new ones have taken their place. What remains is the PDF sitting on a drive somewhere.
That document needs to work without anyone in the room to explain it. A reader who finishes it without a clear understanding of what to prioritise, how serious the exposure is, or who is responsible for remediation has been left without the one thing the report existed to provide. It produced work without producing clarity.
A report can be comprehensive and still fail. The standard I hold it to is whether it changes what the client does next.
The Noise Problem
There is a persistent belief in this industry that more findings mean more value. Clients often internalise this too. The perception of value becomes directly tied to finding count, and more specifically, to finding severity. Some clients approach a report with an implicit expectation of seeing criticals, highs, and mediums distributed across the page. If the report doesn’t deliver that spread, the firm must be incompetent, the tester must be a grad, the test must have been inferior.
That belief has things exactly backwards. You should be hoping for a clean report. You don’t gauge a doctor’s competence by how many illnesses they diagnose. A clean bill of health, delivered by a thorough clinician, is the goal.
But that analogy only holds if the clinician actually ran the tests. A clean report from a skilled team with comprehensive coverage, evidenced by detailed testing logs, methodology documentation, and clear articulation of what was tested and how, is the best possible outcome. A clean report from a team that lacked the capability to find what was there is a liability dressed up as reassurance. The report looks the same. The risk doesn’t.
This is why the evidence of work matters as much as the findings themselves. Clients should be asking not just “what did you find?” but “show me what you tested and how deeply you tested it.” A report that can answer both questions is one that can be trusted, regardless of finding count.
Reports should also provide praise where it is due. When input handling is robust, when access controls are well-designed, when a team has clearly invested in getting something right, say so. People appreciate when their work is recognised. A finding-only report implies that the tester’s only job was to find fault. The better framing is that the job was to assess, and assessment includes both sides of the ledger.
That said, informational and low-severity findings have their place, but their inclusion needs to be a deliberate decision, and one worth discussing before the test begins. Some clients want the full picture regardless of business impact. Others want a tighter document focused on findings with tangible consequences. Both are valid. The conversation just needs to happen.
There is a trade-off worth being transparent about. If a client opts to exclude lower-severity findings and something surfaces later that falls into that category, the question of whether it was missed or deliberately excluded will come up. That decision and the rationale behind it should be documented in the executive summary. Not buried twelve pages deep in the report. Not as an ‘FYI’ in the delivery email. Stated clearly so everyone is aligned and there are no surprises twelve months down the track. Not only does this provide context to future readers, but it helps protect the firm from being accused of negligence.
Severity Without Context Is Decoration
Many reports use CVSS scores out of the box. CVSS does provide mechanisms for contextual adjustment: temporal metrics account for exploit maturity and remediation availability, while environmental metrics allow scores to be modified based on the target organisation’s specific exposure and requirements. Used fully, it is a reasonably capable tool.
In practice, most reports stop at the base score. The temporal and environmental fields are left undefined, and a number generated from generic exploitability and impact metrics gets treated as a definitive severity rating.
A Critical-rated finding that requires physical network access in an air-gapped facility is not the same conversation as a Critical in an internet-facing API with no authentication. The base score is the same. The risk is not.
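As a rough illustration of what that difference looks like when the environmental metrics are actually filled in, the sketch below scores the same hypothetical finding twice. It assumes the open-source cvss Python package; the vectors and the air-gapped scenario are invented for the example, not taken from a real engagement.

```python
# Minimal sketch: the same base vector scored with and without environmental
# context. Assumes the open-source `cvss` package (pip install cvss).
from cvss import CVSS3

# Hypothetical unauthenticated remote finding: the base score lands in Critical.
base_only = CVSS3("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H")

# Same finding, but the affected system sits in an air-gapped facility, so the
# Modified Attack Vector (MAV) is set to Physical in the environmental metrics.
contextualised = CVSS3("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H/MAV:P")

# scores() returns (base, temporal, environmental); severities() the labels.
print(base_only.scores(), base_only.severities())
print(contextualised.scores(), contextualised.severities())

# The base score is identical in both cases. The environmental score on the
# second vector falls well below it, pulling the finding out of Critical:
# same vulnerability, different conversation.
```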
There is another dimension that rarely gets addressed: chaining. A finding that rates as a Medium in isolation can be a material contributor to a Critical exploit when combined with one or two other low-severity observations. Most reports treat findings as independent units. Attackers don’t. Where a finding has meaningful chaining potential, its rating should reflect that. In practice this can mean carrying two severity determinations: one for the finding in isolation, and one in the context of the chain, with a clear reference to the other findings involved. It adds complexity to the report, but it more accurately represents the actual risk surface.
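One way to carry that dual determination is to record it as structured data alongside the written finding. The sketch below is purely illustrative; the field names and the example chain are hypothetical, not drawn from any real report format.

```python
# A hypothetical finding record carrying both severity determinations:
# the rating in isolation, and the rating in the context of a chain,
# with explicit references to the other findings involved.
from dataclasses import dataclass, field


@dataclass
class Finding:
    identifier: str                          # e.g. "F-07"
    title: str
    severity_isolated: str                   # rating for the finding on its own
    severity_chained: str | None = None      # rating when chained, if applicable
    chained_with: list[str] = field(default_factory=list)  # related finding IDs
    chain_narrative: str = ""                # how the chain works, in plain language


verbose_errors = Finding(
    identifier="F-07",
    title="Verbose error messages disclose internal hostnames",
    severity_isolated="Low",
    severity_chained="High",
    chained_with=["F-03", "F-11"],
    chain_narrative=(
        "Hostnames disclosed here allowed the request forgery in F-03 to be "
        "pointed at the internal admin service described in F-11."
    ),
)
```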
Severity ratings should reflect what an attacker can actually achieve in this environment, how realistic exploitation is given the client’s specific context, and what the business consequence is if it goes wrong. Whether that contextualisation happens through CVSS environmental scoring or a separate narrative determination doesn’t matter. What matters is that it happens. A number without that context is a label, not an assessment.
Who Is Reading This?
A pentest report often lands in front of multiple audiences simultaneously. The executive needs to understand the investment decision. The engineer needs to know what to build or fix. The security team needs enough detail to make a risk acceptance call if remediation isn’t immediately possible.
These are different documents. Most reports are written for none of them specifically, which means they serve all of them poorly.
The least this requires is a well-constructed executive summary that stands alone: one that doesn’t assume technical fluency, doesn’t bury the lead in methodology, and gives a clear-eyed view of organisational risk without requiring a cipher. The technical findings carry the depth. But the executive summary is not an introduction. It’s the document for the people who will decide what happens next.
Tailoring the level of detail per client matters, too. A team of three developers at a startup and an enterprise with a dedicated security centre are not the same audience. The report shouldn’t pretend they are.
This is something my team is actively working on, because the honest reality is that pentest reporting hasn’t meaningfully changed since the beginning of the century. It’s been tweaked at the edges and dressed up with better formatting, but the fundamental structure, the core of how it’s done, is the same as it was in 2005. The audiences have changed. The complexity of the environments has changed. The delivery hasn’t kept up.
Presentation Is Not Cosmetic
Over a career spent reading reports from firms of every size and quality level, one thing stands out more than it should: the majority of them are genuinely difficult to look at. Dense walls of uniform text, no visual breathing room, formatting that treats every element the same regardless of importance, and a monotony that makes even a short report feel like a chore.
Aesthetics and design are clearly an afterthought, if they are considered at all. Elements that should complement each other compete instead, and the result is a document that feels assembled rather than crafted. The contents may be technically sound. The experience of reading them is not.
Presentation is severely underrated in this industry. A well-designed report is not vanity. It is a functional decision. When information is laid out clearly, with a visual hierarchy that guides the eye naturally from one element to the next, the reader absorbs it more easily. When it isn’t, the reader works harder, fatigues faster, and retains less. The findings are the same either way. The outcome isn’t.
A poorly presented report says something about the firm that produced it, regardless of the quality of the underlying work. Attention to the reader’s experience is attention to detail. A report that looks like it was assembled in a hurry probably was. One that is clean, consistent, and considered suggests the same care was applied to the test itself.
The report is often the only thing a client has to judge the quality of work they can’t directly observe. Presentation is part of that judgement, whether the firm intends it to be or not.
Remediation Advice That Engineers Can Actually Use
“Apply input validation” is not remediation advice. It is a gesture in the direction of remediation advice. Templated guidance pulled from a library and dropped into a finding without modification is not much better. It tells the reader what category of fix applies, not what to actually do.
If a finding can’t be understood well enough to be acted on, it will sit there. Not because the engineer is incompetent, but because the report hasn’t given them enough to work with. Where’s the vulnerable parameter? What’s the expected behaviour versus the observed behaviour? What does a correct implementation look like in the context of their stack?
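To illustrate the difference, here is what a hypothetical SQL injection finding can look like when it answers those questions, using Python’s standard sqlite3 module. The function, parameter, and table names are invented for the example; the client’s actual stack would dictate the specifics.

```python
import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER, email TEXT, username TEXT)")


def find_user_vulnerable(username: str):
    # What the report found: the `username` parameter is concatenated straight
    # into the query, so input like  ' OR '1'='1  changes the query's meaning.
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_fixed(username: str):
    # What "apply input validation" actually means here: pass the value as a
    # bound parameter so the driver keeps data separate from the query.
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```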
The further a finding is from how an engineering team actually works, the less likely it is to get fixed. If the remediation advice doesn’t map cleanly to something that can be written up as a ticket with an owner, you’ve created friction between your output and their process. That friction compounds. Tickets don’t get written. Issues don’t get fixed. The risk stays.
Reports should be educational. Not condescending, not padded with background reading nobody asked for, but genuinely useful to someone who wasn’t in the room. A well-written finding should leave the reader knowing more about their own system than they did before they read it.
The Compliance Infection
A lot of reports have been shaped, consciously or not, by compliance requirements. The priority becomes defensibility: ensuring every box is ticked, every standard referenced, every caveat documented. Covering yourself.
The irony is that the compliance frameworks driving most of this testing actually emphasise usefulness. The goal of a finding under ISO 27001 or the Essential Eight isn’t to document that a vulnerability exists. It’s to drive remediation. The framework is asking for outcomes. The report is delivering paperwork.
Reference links help. A finding that points directly to the relevant OWASP entry, vendor documentation, or framework control gives the reader somewhere to go. It respects their time and extends the usefulness of the report beyond what fits in the body of a finding.
The Real Test
The industry doesn’t study report effectiveness with anything like the rigour it applies to testing methodology. There’s significant literature on attack techniques, tooling, and frameworks. Very little on whether the reports those techniques produce actually result in things getting fixed.
That’s a gap worth taking seriously, because the report is the product. It’s what the client commissioned, even if they didn’t frame it that way. Everything else (the methodology, the expertise, the tooling) exists to produce a document that makes their environment more secure.
If it doesn’t do that, it doesn’t matter how good the test was.


