What Makes a Pen Test Fail?
Common Penetration Testing Mistakes to Avoid

22 June 2026

Joe Beauchamp

A penetration test rarely fails in the way people imagine. The testing happens, the findings are legitimate and the report is delivered, but none of that guarantees the organisation is any more secure for it. Many organisations that commission penetration tests regularly still find themselves asking the same question a year later: why does security not seem to be improving?

A test can be technically excellent and still fail to make an organisation more secure. That may seem counterintuitive, but it is one of the most important things a buyer of penetration testing can understand. When it happens, the reasons are often organisational rather than technical – found in the conditions around the engagement, such as the scope it was given, the asset inventory it was built on, the ownership of what it identified and the mindset that commissioned it in the first place.

This article is about those conditions and what happens when they are overlooked. It is written for people who regularly commission penetration tests and have started to suspect that their programme is not delivering the improvements they expected: the same findings keep reappearing, the estate has grown faster than the scope, reports are filed, auditors are satisfied and yet the security posture feels no stronger.

None of that is a testing problem. It is a programme problem – and programme problems are fixable.

Out of scope, not out of reach

A penetration test only examines what it is pointed at. That is not a flaw in testing; it is the definition of it. Every engagement has a boundary and everything beyond that boundary remains untested. Issues arise when that boundary is drawn from an incomplete picture of the estate. Systems missing from the inventory are unlikely to appear in the scope but remain available to attackers.

Most organisations do not document everything they own. Shadow IT, Cloud sprawl, infrastructure inherited through acquisition and ageing legacy systems all expand the real attack surface faster than inventories can be updated. The CIS Critical Security Controls place inventory and control of enterprise assets at Control 1 for a simple reason: an organisation cannot defend or test what it does not know exists. A poor asset inventory is not a scoping oversight that a better brief can fix. It is a structural problem – a tester working from a list can only assess what appears on that list.

The size of the blind spot is easy to underestimate. IBM’s 2025 Cost of a Data Breach report found that the average breach took 241 days to identify and contain, while more than a third involved shadow data stored in unmanaged or unknown locations. An organisation tested in January can be compromised in February through an asset nobody included, then spend most of the year unaware of it. A testing programme built on last quarter’s view of the estate is already working with stale information.

Public breaches illustrate the point more clearly than any statistic. When Equifax was breached in 2017, the entry point was a known, patchable vulnerability in an internet-facing system that existing asset and patch management processes had failed to address. The weakness was not sophisticated. It existed on a system that had effectively fallen through the cracks. More recently, the 2025 attack on Marks & Spencer came through a third-party IT helpdesk rather than the retailer’s own hardened environment. Attackers obtained access through a trusted supplier, bypassing assumptions about where the organisation’s security boundary sat. The lesson in both cases is the same: the real attack surface is almost always larger than the environment being tested. Out-of-scope assets remain firmly in scope for attackers.

This is also where the relationship between tester and client matters most. Good scoping is a conversation, not a form. It requires input from the people who run the systems, not just the security team commissioning the engagement, because they are the ones who know about the forgotten subdomain, the legacy server or the new integration that quietly exposes an internal service.

When that conversation does not happen, two things go wrong. First, important systems are missed. Second, the engagement itself becomes less effective. Testers arrive to find that system owners were never briefed, credentials are unavailable, environments cannot be accessed and nobody is available to answer questions within the testing window. Time spent chasing access is time not spent testing and the coverage paid for quietly shrinks.

Underneath that friction is often a misunderstanding worth addressing directly. When access is slow to materialise, a common response is a version of “you’re the hacker, just hack your way in”. It is an understandable instinct, but it misrepresents what a commissioned penetration test is designed to do.

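A real adversary operates without an agreed scope, rules of engagement, legal constraints or a deadline. A penetration test operates under entirely different conditions. It has a fixed window, an agreed scope, a duty not to disrupt production systems and legal obligations that both parties must respect. Asking testers to break in where access could simply be provided spends that fixed window on the obstacle rather than on the systems the test was commissioned to assess.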

The ownership gap

If scoping is the most underestimated cause of a failed penetration test, remediation ownership is probably the most common. A penetration test produces findings and recommendations, not fixes. The report is the beginning of the work, not the end of it, and a finding that is not assigned to a named owner with a deadline is unlikely to be addressed. Instead, it becomes another item on a growing list of known problems.

Industry data for 2025 puts the average time to remediate a critical application vulnerability at around 74 days, while attackers typically need only a matter of days to exploit a usable weakness. That gap is the window organisations leave open. Worse, large enterprises still leave approximately 45% of discovered vulnerabilities unresolved a full year after they are identified. In many cases, the organisation knows about the problem long before it deals with it.

The reasons are rarely malicious and almost always organisational. For example:

Findings arrive in a shared inbox with no individual accountable for closing them.
Development teams are measured on feature delivery, so security tickets drift down the backlog.
Infrastructure teams enter change-freeze periods and fixes wait for maintenance windows that never quite arrive.

The result is that vulnerabilities remain open long after they have been identified, while reports quietly turn into shelfware.

This is where many organisations misunderstand the role of the testing provider. The NCSC (National Cyber Security Centre) rightly treats risk assessment and remediation as business processes rather than testing activities. Testers can identify weaknesses, explain their impact and recommend corrective action, but they do not own the risk. Decisions about what to fix, in what order and by when belong to the organisation itself.

That accountability needs to exist before testing begins, not be improvised after the report lands.

Effective programmes establish remediation ownership in advance, with clear deadlines and a defined process for verifying closure. Findings with clear accountability tend to get resolved. Findings left to work their way through the organisation usually do not.
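
As a rough illustration of what "clear accountability" can mean in practice, the sketch below checks a findings list for items with no named owner or deadline, and for items past their deadline. It is a minimal example under stated assumptions: the `Finding` record, its field names, the owner names and the sample data are all hypothetical, not a standard schema or any particular tool's format.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Finding:
    # Hypothetical record; field names are illustrative, not a standard schema.
    title: str
    owner: Optional[str] = None     # a named individual, not a shared inbox
    due: Optional[date] = None      # agreed remediation deadline
    verified_closed: bool = False   # set only once a retest confirms the fix


def triage(findings, today):
    """Split open findings into those lacking accountability and those past deadline."""
    open_findings = [f for f in findings if not f.verified_closed]
    unowned = [f.title for f in open_findings if f.owner is None or f.due is None]
    overdue = [
        f.title
        for f in open_findings
        if f.owner is not None and f.due is not None and f.due < today
    ]
    return {"unowned": unowned, "overdue": overdue}


# Invented sample data for illustration only.
findings = [
    Finding("SQL injection in login form", owner="a.patel", due=date(2026, 3, 1)),
    Finding("Outdated TLS configuration"),
    Finding("Default admin credentials", owner="j.smith", due=date(2026, 9, 1)),
    Finding("Exposed debug endpoint", owner="a.patel", due=date(2026, 2, 1), verified_closed=True),
]

report = triage(findings, today=date(2026, 6, 22))
print(report)
```

A finding only leaves the list when `verified_closed` is set, which mirrors the point about verifying closure rather than assuming it.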

Are you testing for security or compliance?

Many penetration tests are commissioned to satisfy a framework requirement. The PCI DSS (Payment Card Industry Data Security Standard), ISO 27001, Cyber Essentials Plus, and SOC 2 all include testing or assessment obligations and, for many organisations, meeting those requirements is mandatory. Compliance is a perfectly legitimate reason to commission a test. The problem begins when passing the assessment becomes the primary objective.

Every framework defines a minimum standard. The PCI DSS, for example, requires regular internal and external penetration testing of the cardholder data environment, but its purpose is to identify and remediate exploitable weaknesses. The requirement exists to improve security, not simply to produce evidence for an auditor.

Problems emerge when organisations allow audit boundaries to dictate where testing begins and ends. A business can meet every requirement of an assessment and still leave a significant part of its environment untouched simply because it falls outside the certified scope. An ISO 27001 certificate or a Cyber Essentials Plus pass demonstrates that a defined standard was achieved at a particular point in time, not that the organisation examined risk beyond the framework’s minimum requirements. In that sense, passing an audit can create a reassuring picture of security while leaving important questions unanswered.

Attackers have no reason to respect the boundaries that auditors create. A weakness outside the certified scope is no less valuable to an adversary than one inside it and access gained through an overlooked system can be just as damaging as access gained through a business-critical application.

Compliance frameworks are designed to help establish a minimum standard of security, not to define the limits of an organisation’s risk management effort. When organisations mistake that minimum for a complete picture of exposure, penetration testing becomes focused on satisfying requirements rather than understanding risk.

Copy. Paste. Test. Repeat.

This pattern describes more penetration testing programmes than many organisations would care to admit. The engagement is commissioned because it is time to commission it, the scope is lifted from the previous statement of work and the resulting report is reviewed just long enough to satisfy immediate requirements before being filed away. Over time, the exercise becomes less about understanding risk and more about repeating a familiar process.

The copied scope deserves particular attention because it is where tick-box testing often does its quietest damage. A common request is simply to repeat last year’s engagement using the previous statement of work as the starting point. On the surface that looks efficient. In practice, it creates drift.

One year reuses the previous scope. The following year reuses that one. Before long, the engagement is being built around a description of the environment that is several years old, testing systems that have since been retired, missing services that have since been introduced and referring to products or versions that no longer exist. When the environment changes continuously and the scope does not, what looks like consistency is an increasingly inaccurate view of the organisation’s real attack surface.
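
Scope drift of this kind can be made visible with a very simple comparison between the reused scope and a current asset inventory. The sketch below is a minimal example, assuming both can be reduced to lists of hostnames. The function name `scope_drift` and every hostname are hypothetical, and a real inventory would need far more than a hostname to identify an asset reliably.

```python
def scope_drift(previous_scope, current_inventory):
    """Compare a reused scope with what the organisation actually runs today."""
    previous = set(previous_scope)
    current = set(current_inventory)
    return {
        # Listed in the old scope but no longer part of the estate.
        "stale": sorted(previous - current),
        # Part of the estate today but absent from the old scope.
        "untested": sorted(current - previous),
        # Entries the old scope still gets right.
        "still_valid": sorted(previous & current),
    }


# Invented hostnames for illustration only.
last_year_scope = ["app.example.com", "vpn.example.com", "legacy-crm.example.com"]
inventory_today = [
    "app.example.com",
    "vpn.example.com",
    "api.example.com",
    "partner-sso.example.com",
]

drift = scope_drift(last_year_scope, inventory_today)
for category, hosts in drift.items():
    print(category, hosts)
```

The comparison is only as good as the inventory it is given, so it complements a scoping conversation rather than replacing one.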

The same issue appears in timing as well as scope. Annual testing is not inherently a problem and for many organisations remains entirely appropriate. The problem arises when testing frequency is disconnected from the rate of change in the environment. In organisations deploying regularly through Cloud platforms and automated delivery pipelines, a yearly assessment may capture only a brief snapshot of a rapidly evolving estate.

The wider industry has started to push back against the weakest forms of tick-box testing. Under PCI DSS v4.0.1, a report that is little more than the output of an automated scanner presented as a penetration test will not satisfy assessment requirements. The Standard now expects documented methodology and evidence of genuine manual testing. That is a welcome development because tick-box testing optimises for the appearance of assurance rather than assurance itself.

The underlying problem is not how often testing happens but whether the programme continues to reflect reality. A test repeated out of habit will usually produce exactly what habit produces: the same assumptions, the same blind spots and the same unanswered questions.

The vulnerability you already knew about

If there is one indicator that a penetration testing programme is underperforming, it is the same vulnerabilities appearing in successive reports. When that happens, the instinct is sometimes to ask why the tester keeps finding the same things. That is the wrong question. The right one is why those things were never fixed.

Recurring findings are not a testing failure. They are evidence that known weaknesses remain unresolved. The test is doing exactly what it is supposed to do by identifying and reporting genuine risks. When the same issue appears year after year, it suggests the organisation already understood the problem but failed to address it.

The industry increasingly refers to this as security debt: vulnerabilities that remain open long after they have been identified. Veracode’s 2026 State of Software Security report found that 82% of organisations carry security debt, while 60% carry critical security debt – high-severity flaws left unresolved for more than a year. The average time taken to remediate vulnerabilities has also increased significantly over the past five years. Known weaknesses do not disappear with age. Left unresolved, they accumulate.

The challenge is compounded by the volume of new vulnerabilities entering the system. Tens of thousands of new CVEs are disclosed every year and no organisation can remediate everything immediately. That makes prioritisation more important, not less. A backlog that is never reduced does not remain static. It grows.

This is why recurring findings are such a useful measure of programme effectiveness. A penetration test will always find something. What matters is whether it keeps finding the same things. When issues stop reappearing, it is usually a sign that the programme around the test is working as intended.
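
One way to turn recurrence into a number a programme can track is to compare successive reports directly. The sketch below is illustrative only: it assumes each finding can be keyed by an (asset, issue) pair that is recorded consistently from one report to the next, which real reports often do not guarantee. The function name and the sample data are hypothetical.

```python
def recurring(previous_report, current_report):
    """Return findings present in both reports and the share of earlier findings that came back."""
    previous = set(previous_report)
    current = set(current_report)
    repeated = sorted(previous & current)
    # Guard against an empty earlier report to avoid dividing by zero.
    rate = len(repeated) / len(previous) if previous else 0.0
    return repeated, rate


# Invented findings, keyed as (asset, issue), for illustration only.
earlier_report = [
    ("app.example.com", "SQL injection"),
    ("vpn.example.com", "Outdated TLS configuration"),
    ("app.example.com", "Missing MFA on admin login"),
]
latest_report = [
    ("app.example.com", "SQL injection"),
    ("api.example.com", "Broken access control"),
]

repeated, rate = recurring(earlier_report, latest_report)
print(repeated)
print(f"{rate:.0%} of earlier findings reappeared")
```

A falling rate over successive engagements suggests remediation is working. New findings are expected in every report and are deliberately not counted against the programme here.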

Security is what happens after the test

All of this is an argument for treating penetration testing as part of a wider security process. The common thread across every failure above is the same: a test succeeds when it feeds that process and fails when it is used simply to close a ticket.

In practice, a programme that works tends to get a few basic things right:

It begins with a genuine pre-engagement scoping discussion, one that draws on the people who run the systems and validates the asset inventory against reality rather than memory.
It defines remediation ownership before testing starts, so that findings have somewhere to go the moment they arrive.
It includes a retest as part of the engagement, not as an optional extra, so that “fixed” can be proven rather than assumed.
It tracks findings to closure, so that the programme can see its own progress over time.

Choosing a provider whose work carries a recognised quality signal helps too: CHECK penetration testing, for example, gives clients a credible assurance of methodology and competence when they are comparing options on more than price.

If any of this sounds like a description of a programme you do not quite have, that is worth sitting with. The test itself is the part most organisations focus on, and it is rarely where the value is won or lost. The scoping, the ownership, and the follow-through are where security posture is built, and they are the parts worth getting right before the next engagement, not after it. If you would like to talk through whether your current approach has those conditions in place, our penetration testing services team is happy to help.

GRC Solutions penetration testing services

Our penetration testing services are delivered by CREST-accredited and CHECK-approved consultants like Joe, providing assurance that your testing is carried out to recognised UK and international standards.

About the author

Joe Beauchamp is a senior security consultant with around ten years in the industry. Primarily an application tester, with broader experience across Cloud configuration reviews, AI, infrastructure and endpoint work, he holds the CREST Certified Tester (Application) and OSCP qualifications.
