Software testing


Test planning & management

The process of planning, estimating, monitoring, and controlling test activities, documented in a (risk‑based) test plan, strategy or policy, to achieve defined quality objectives within the project’s constraints of scope, time, and resources.

Maturity Levels

Level 0: Non-Existent
Description: There is no AI assistance, automation, or data integration of any kind. Corporate guidelines or governance for AI-enabled workflows and tool usage are absent. Test policy, strategy and plan creation, estimation, progress tracking, and reporting are fully manual; no AI evaluates readability, maintainability, or explainability of artefacts.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Test managers/coordinators occasionally ask an LLM for draft strategy text, workload estimates, or risk heat-maps and paste the results into documents. Nothing is version-controlled; results vary widely between individuals and are difficult to reproduce or scale. (A sketch of this kind of one-off assist follows below.)
Technology: Natural-language draft generation with off-the-shelf LLMs and prompt engineering
Example tools:
  • LLM chatbots / answer engines
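
The sketch below shows such a one-off request: a single ad-hoc prompt for a draft test strategy outline that is then pasted into a document by hand. It is a minimal sketch, assuming the OpenAI Python SDK and an API key in the environment; the model name, prompt wording, and project details are illustrative placeholders, not part of the maturity model.

  # One-off assist: ask an LLM for a draft test strategy outline.
  # Assumes the OpenAI Python SDK and OPENAI_API_KEY in the environment;
  # model name and prompt are illustrative placeholders.
  from openai import OpenAI

  client = OpenAI()
  prompt = (
      "Draft a one-page, risk-based test strategy outline for a payment API release. "
      "Cover scope, prioritisation, entry/exit criteria, and a rough effort estimate."
  )
  response = client.chat.completions.create(
      model="gpt-4o-mini",
      messages=[{"role": "user", "content": prompt}],
  )
  print(response.choices[0].message.content)  # pasted into the plan by hand at Level 1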

Level 2: Integrated Assist
Description: AI is embedded in the QA tooling and process, providing suggestions for test-policy clauses, strategy sections, resource/timeline forecasts, and risk heat-maps. Artefacts are version-controlled alongside project deliverables, and the AI flags readability, maintainability, and explainability issues.
Technology: AI Agents / Autonomous Agents + LLMOps (prompt / template management, deployment, guardrails)

Level 3: AI-Human Collaboration
Description: AI agents act as junior test managers, digesting code, requirements, and trends to suggest strategies, scope, and team updates. Every recommendation is traceable, explainable, and subject to human review. (See the orchestration sketch after this list.)
Technology:
  • Agentic frameworks
  • Advanced RAG
  • Orchestration
  • LLMOps (prompt / template management, deployment, guardrails)
  • Deep research
Example tools:
  • Agentic frameworks
    • LangGraph as orchestrator, vector DBs as knowledge, prompts and a model for interaction
  • Deep research
    • Deep research on ChatGPT/Gemini/Claude using a reasoning model, Perplexity Labs, Manus, …
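
To make the agentic-framework entry concrete, the sketch below wires a two-step draft-and-review flow with LangGraph, the orchestrator named in the example tools. It is a minimal sketch under stated assumptions: the langgraph package is installed, its StateGraph API matches this form (the API changes between versions), and the node functions are placeholders where real LLM calls and vector-DB retrieval would sit.

  # Minimal LangGraph sketch: a draft -> review flow for a test strategy.
  # Node bodies are placeholders for real LLM calls and vector-DB retrieval.
  from typing import TypedDict
  from langgraph.graph import StateGraph, START, END

  class PlanState(TypedDict):
      requirements: str
      draft_strategy: str
      review_notes: str

  def draft_strategy(state: PlanState) -> dict:
      # Placeholder: an LLM call that drafts a strategy from the requirements.
      return {"draft_strategy": f"Risk-based strategy for: {state['requirements']}"}

  def review_strategy(state: PlanState) -> dict:
      # Placeholder: a reviewer agent (or human gate) that annotates the draft.
      return {"review_notes": "Verify risk coverage, entry/exit criteria, and effort estimate."}

  graph = StateGraph(PlanState)
  graph.add_node("draft", draft_strategy)
  graph.add_node("review", review_strategy)
  graph.add_edge(START, "draft")
  graph.add_edge("draft", "review")
  graph.add_edge("review", END)

  app = graph.compile()
  result = app.invoke({"requirements": "Payment API v2", "draft_strategy": "", "review_notes": ""})
  print(result["draft_strategy"], "|", result["review_notes"])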

Level 4: Full Autonomy
Description: Autonomous agents/AI systems create and update test policy, strategies, and plans from live data. Projects are managed dynamically, with the AI handling scope, milestones, and KPIs. Human involvement is confined to strategic governance; on demand, the autonomous AI must supply a transparent, traceable explanation of its actions, input data, and decision rationale.
Technology: Autonomous agents, causal-inference models, continual learning, LLMOps pipelines
Example tools: End-to-end QA orchestrators (no tools known to operate at this level)
  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications

Test analysis & design

The process of analysing the test basis and transforming it into test conditions, test cases, and test data using appropriate test design techniques to achieve required coverage and mitigate quality risks.
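
As a concrete example of applying a test design technique to the test basis, the sketch below derives boundary values for a numeric input and pairs each value with its expected outcome. It is an illustrative Python sketch only; the field name and range are invented for the example and do not come from the source.

  # Illustrative boundary value analysis for a numeric field.
  # The field ("loan_amount", valid range 1..50000) is an invented example.
  def boundary_values(minimum: int, maximum: int) -> list[int]:
      # Two-value BVA: each boundary plus its invalid neighbour.
      return [minimum - 1, minimum, maximum, maximum + 1]

  for value in boundary_values(1, 50_000):
      expected = "accepted" if 1 <= value <= 50_000 else "rejected"
      print(f"loan_amount={value} -> expected {expected}")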

Maturity Levels

Level 0: Non-Existent
Description: All analysis and design tasks are fully manual. No AI, automation, or review for quality attributes like readability or explainability.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Test engineers use LLMs ad hoc to draft test cases or choose techniques. Prompts vary by user, with no standards, reuse, or traceability. Results are inconsistent and unscalable.
Technology: Off-the-shelf LLM & prompt engineering


Level 2: Integrated Assist
Description: AI assists with static oracle checks, structured test case generation, and artefact review for quality. Work follows prompt standards and a feedback loop, giving nearly full task support and near end-to-end coverage with minimal manual effort. The AI system can review human-created test artefacts for correctness, completeness, readability, maintainability, and explainability, flagging gaps or duplicates before peer review (a simple duplicate-flagging sketch follows this list).
Technology:
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
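
The duplicate flagging mentioned above can be approximated very simply: compare test case titles and flag near-matches for review. The sketch below uses only Python's standard library as a deliberately simplistic stand-in for what integrated AI review would do (real tooling would compare steps and use semantic embeddings); the titles and threshold are invented for the example.

  # Naive duplicate flagging over test case titles, standard library only.
  # Titles and threshold are invented; real tooling would use semantic embeddings.
  from difflib import SequenceMatcher
  from itertools import combinations

  test_cases = [
      "Login with valid credentials",
      "Login using valid credentials",
      "Reset password with expired token",
      "Login with invalid password",
  ]

  THRESHOLD = 0.85  # similarity ratio above which two cases are flagged
  for first, second in combinations(test_cases, 2):
      ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
      if ratio >= THRESHOLD:
          print(f"Possible duplicate ({ratio:.2f}): '{first}' <-> '{second}'")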

Level 3: AI-Human Collaboration
Description: AI acts as a junior test analyst: analysing multimodal input and past defects to refine test oracles, recommend techniques, and generate test artefacts, including a transparent explanation of its reasoning, while a human overseer guides and refines its output.
Technology:
  • Agentic frameworks
  • Advanced RAG
  • Orchestration
  • LLMOps (prompt / template management, deployment, guardrails)
Example tools:
  • Agentic frameworks
  • Code & UI embeddings

Level 4: Full Autonomy
Description: Autonomous AI designs and maintains test suites, detects oracle issues, and regenerates impacted assets when inputs change. Human involvement is confined to strategic governance, and all actions are explainable.
Technology: Autonomous agents, model-based testing, continual learning pipelines
Example tools: End-to-end test-design orchestrators (no tools known to operate at this level)

*** Although many tools claim to operate at AI Maturity Level 3, these claims are often exaggerated: the tools typically require significant manual effort, lack true context awareness, and rely heavily on marketing buzzwords such as “self-healing tests,” “autonomous agents,” “AI-driven quality,” “zero-touch automation,” “intelligent test orchestration,” and “continuous risk-based optimization.” In practice, most of these tools operate more like Level 2: they assist people but do not genuinely work alongside them, and they still need detailed prompts, human guidance, and corrections to produce good results. That said, some tools are starting to explore real Level 3 features, and early versions show potential. Progress is slow but steady, with better context awareness and more independence pushing things forward.

  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications

Test implementation, automation & test data generation

The activity of finalising testware by developing, maintaining, and automating executable test scripts, harnesses, and representative test data to enable efficient, repeatable, and scalable test execution.

Maturity Levels

Level 0: Non-Existent
Description: All test scripts and data are created and maintained manually; no AI assistance is used.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Engineers prompt an LLM to generate a skeleton script, a SQL dataset, or a simple page object and then refine it manually. There are no corporate guidelines, shared prompt libraries, or optimisation practices, so results vary widely between individuals and are difficult to reproduce or scale. (A sketch of such a hand-refined page object follows this list.)
Technology:
  • Off-the-shelf LLM & prompt engineering
  • Code completers in IDEs
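
To illustrate what such an LLM-drafted, hand-refined page object typically looks like, the sketch below uses Playwright's Python sync API. It is a minimal sketch, assuming Playwright is installed; the URL, selectors, and class name are invented for the example and would need adjusting to a real application.

  # Skeleton page object of the kind an LLM might draft at Level 1 for manual refinement.
  # Assumes Playwright's Python sync API; URL, selectors, and names are invented examples.
  from playwright.sync_api import Page

  class LoginPage:
      URL = "https://example.test/login"  # placeholder URL

      def __init__(self, page: Page) -> None:
          self.page = page

      def open(self) -> None:
          self.page.goto(self.URL)

      def login(self, username: str, password: str) -> None:
          self.page.fill("#username", username)
          self.page.fill("#password", password)
          self.page.click("button[type='submit']")

      def error_message(self) -> str:
          return self.page.inner_text(".error-banner")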

Level 2: Integrated Assist
Description: AI in IDEs/frameworks generates maintainable code and test data and converts test cases into scripts based on human input. Integrated prompt standards and feedback loops ensure consistent, scalable results. (A minimal test-data synthesis sketch follows this list.)
Technology:
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
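
As an example of test-data synthesis at this level, the sketch below generates representative but fictitious customer records with the Faker library. It is a minimal sketch, assuming Faker is installed; the field names and record count are illustrative, and real privacy-masked data generation would be driven by the actual schema and data-protection rules.

  # Minimal synthetic test data generation with Faker; schema and count are illustrative.
  # Real tooling would derive the schema from the system under test.
  from faker import Faker

  fake = Faker()
  Faker.seed(42)  # reproducible data sets for repeatable test runs

  customers = [
      {
          "name": fake.name(),
          "email": fake.email(),
          "iban": fake.iban(),
          "signup_date": fake.date_between(start_date="-2y", end_date="today").isoformat(),
      }
      for _ in range(5)
  ]

  for row in customers:
      print(row)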

Level 3: AI-Human Collaboration
Description: AI agent(s)/system(s) act as an entry-level test automator. The AI system is fully context-aware of the project: it implements tests for new requirements, refactors and optimises the automation suite, synthesises sophisticated test data (synthetic or privacy-masked), and flags redundant scripts, always with human experts supervising and validating its output.
Technology:
  • Agentic frameworks
  • Advanced RAG
  • Orchestration
  • LLMOps (prompt / template management, deployment, guardrails)
  • MCP
Example tools:
  • Agentic frameworks
    • LangGraph as orchestrator, vector DBs as knowledge, prompts and a model for interaction
  • IDEs with AI capabilities
  • CLI code assistants
  • MCPs
  • Off-the-shelf tools***
    • Cypress cy.prompt, coTestPilot for testers, coTestPilot for developers, TestZeus-Hercules, Magic Inspector, Wopee, Katalon, Applitools, UIPath, Testers.ai, …

Level 4: Full Autonomy
Description: Autonomous AI maintains scripts and data, migrates frameworks, manages test infrastructure, generates mocks/stubs, and runs test sets unsupervised. Human involvement is confined to strategic governance; on demand, the autonomous AI must supply a transparent, traceable explanation of its actions, input data, and decision rationale.
Technology: Autonomous agents, self-healing AI, continual learning pipelines
Example tools:
  • End-to-end automation orchestrators (no tools known to operate at this level)

*** Although many tools claim to operate at AI Maturity Level 3, these claims are often exaggerated: the tools typically require significant manual effort, lack true context awareness, and rely heavily on marketing buzzwords such as “self-healing tests,” “autonomous agents,” “AI-driven quality,” “zero-touch automation,” “intelligent test orchestration,” and “continuous risk-based optimization.” In practice, most of these tools operate more like Level 2: they assist people but do not genuinely work alongside them, and they still need detailed prompts, human guidance, and corrections to produce good results. That said, some tools are starting to explore real Level 3 features, and early versions show potential. Progress is slow but steady, with better context awareness and more independence pushing things forward.

  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications

Test execution

The activity of running test suites (groups/folders/sets of test cases/scenarios/scripts), comparing actual and expected outcomes, logging incidents, and collecting metrics in the designated environment.

Maturity Levels

Level 0: Non-Existent
Description: All tests are executed manually or through conventional automation with no AI support or AI-enhanced automation; results are logged by hand.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Testers occasionally use an LLM to auto-generate a command line or interpret a log snippet to speed up manual execution. Results vary widely between individuals and are difficult to reproduce or scale. (A sketch of such a throwaway helper follows this list.)
Technology:
  • Off-the-shelf LLM & prompt engineering
  • MCPs on off-the-shelf LLMs
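
The sketch below shows the kind of throwaway helper an engineer might ask an LLM to draft at this level: read a JUnit XML report and print a pytest command that re-runs only the failed tests. The report path is a placeholder, and matching failed tests by name via a pytest -k expression is a simplification; both are assumptions for the example.

  # Throwaway helper: build a selective re-run command from a JUnit XML report.
  # The report path is a placeholder; matching failed tests via `pytest -k` by
  # test name is a simplification used for illustration.
  import xml.etree.ElementTree as ET

  report = ET.parse("reports/junit.xml")  # placeholder path
  failed_names = sorted(
      {
          case.get("name")
          for case in report.iter("testcase")
          if case.find("failure") is not None or case.find("error") is not None
      }
  )

  if failed_names:
      print('pytest -k "' + " or ".join(failed_names) + '"')
  else:
      print("No failed tests found in the report.")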

Level 2: Integrated Assist
Description: AI is built into execution frameworks or processes to schedule suites, classify failures in real-time dashboards, and execute tests based on high-level natural-language descriptions.
Technology:
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)

Level 3: AI-Human Collaboration
Description: AI agent(s)/system(s) act as an entry-level tester that starts and monitors live runs, predicts remaining duration, suggests selective re-runs, applies self-healing, and surfaces likely root causes for failed steps, while a human overseer guides and refines its output. It can also execute exploratory test flows based on high-level natural-language quality requests, delivering summarised findings for human validation. (A naive selective re-run sketch follows this list.)
Technology:
  • Agentic frameworks
  • Orchestration
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
Example tools:
  • Agentic frameworks
    • LangGraph as orchestrator, vector DBs as knowledge, prompts and a model for interaction
  • IDEs with AI capabilities
  • CLI code assistants
  • MCPs
  • Off-the-shelf tools***
    • coTestPilot for testers, coTestPilot for developers, TestZeus-Hercules, Magic Inspector, Wopee, Katalon, Applitools, UIPath, Testim, testers.ai, …
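
The selective re-run idea can be illustrated with a deliberately naive sketch: map changed source modules to test modules by naming convention and propose only those tests for execution. A real context-aware agent would rely on code embeddings and execution history instead; the file names and the convention below are invented for the example.

  # Naive change-impact test selection by naming convention.
  # File names and the mapping convention are invented; real agents would use
  # code embeddings and execution history.
  from pathlib import Path

  changed_files = ["src/payments/refund.py", "src/accounts/login.py"]  # e.g. from a git diff
  test_dir = Path("tests")

  selected = []
  for changed in changed_files:
      module = Path(changed).stem                # "refund", "login"
      candidate = test_dir / f"test_{module}.py"
      if candidate.exists():
          selected.append(str(candidate))

  if selected:
      print("Suggested re-run: pytest " + " ".join(selected))
  else:
      print("No mapping found; suggest running the full suite.")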

Level 4: Full Autonomy
Description: Autonomous execution agents provision environments, orchestrate parallelisation, self-heal UI/API/… tests, and run canary, chaos, and other unsupervised experiments, continually optimising coverage, cost, and risk without hands-on support. They also execute tests on the items under test and can autonomously detect the need for, and execute, functional and non-functional exploratory test flows from high-level natural-language quality objectives. Human interaction is limited to high-level goal setting and periodic governance reviews, though the system remains available for on-demand unsupervised natural-language test runs. On demand, the autonomous AI must supply a transparent, traceable explanation of its actions, input data, and decision rationale.
Technology: Autonomous agents, reinforcement scheduling, chaos-engineering AI
Example tools:
  • End-to-end execution orchestrators (no tools known to operate at this level)

*** Although many tools claim to operate at AI Maturity Level 3, these claims are often exaggerated: the tools typically require significant manual effort, lack true context awareness, and rely heavily on marketing buzzwords such as “self-healing tests,” “autonomous agents,” “AI-driven quality,” “zero-touch automation,” “intelligent test orchestration,” and “continuous risk-based optimization.” In practice, most of these tools operate more like Level 2: they assist people but do not genuinely work alongside them, and they still need detailed prompts, human guidance, and corrections to produce good results. That said, some tools are starting to explore real Level 3 features, and early versions show potential. Progress is slow but steady, with better context awareness and more independence pushing things forward.

  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications

Evaluating exit criteria & reporting

The activity of comparing actual test results and coverage to predefined exit criteria and producing concise, meaningful reports for stakeholders on product quality and residual risk.

Maturity Levels

Level 0: Non-Existent
Description: Exit criteria are evaluated manually. Reports are crafted by hand with no AI support. No quality analysis assistance is present.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Test analysts prompt an LLM with natural-language summary generation capabilities to summarise results into prose or create a simple chart for a one-time release note. Prompts are improvised, with no standards or reusability.
Technology:
  • Off-the-shelf LLM & prompt engineering
  • MCPs on off-the-shelf LLMs
  • Code completers in IDEs

Level 2: Integrated Assist
Description: Standardised AI tasks are available, and AI is integrated into tools to assist with KPI aggregation, script documentation, case-to-script mapping, and readiness scoring. Outputs are consistent and versioned. (A minimal readiness-scoring sketch follows this list.)
Technology:
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
  • BI tooling, vector DB
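
In its simplest form, the readiness scoring mentioned here reduces to comparing measured results against predefined exit criteria, which is what the sketch below does. The metric names and threshold values are invented examples, not prescribed figures.

  # Minimal exit-criteria check: compare measured results to predefined thresholds.
  # Metric names and threshold values are invented examples.
  exit_criteria = {
      "pass_rate_min": 0.95,             # at least 95 % of executed tests passed
      "requirement_coverage_min": 0.90,  # at least 90 % of requirements covered
      "open_blockers_max": 0,            # no open blocking defects
  }

  measured = {
      "pass_rate": 0.97,
      "requirement_coverage": 0.88,
      "open_blockers": 1,
  }

  failures = []
  if measured["pass_rate"] < exit_criteria["pass_rate_min"]:
      failures.append("pass rate below threshold")
  if measured["requirement_coverage"] < exit_criteria["requirement_coverage_min"]:
      failures.append("requirement coverage below threshold")
  if measured["open_blockers"] > exit_criteria["open_blockers_max"]:
      failures.append("open blocker defects remain")

  print("Release ready" if not failures else "Not ready: " + "; ".join(failures))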

Level 3: AI-Human Collaboration
Description: AI agent(s)/system(s) act as an entry-level functional reviewer, performing context-aware end-to-end technical or functional reviews of an item under test while a human overseer guides and refines its output. It can also generate stakeholder-specific narrative reports and offer interactive Q&A on QA.
Technology:
  • Agentic dashboards
  • Causal analytics
  • Agentic frameworks
  • Orchestration
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
Example tools:
  • Agentic frameworks
    • LangGraph as orchestrator, vector DBs as knowledge, prompts and a model for interaction
  • IDEs with AI capabilities
  • CLI code assistants
  • Off-the-shelf tools***
    • Katalon, Applitools, UIPath, Testim, testers.ai, …

Level 4: Full Autonomy
Description: An autonomous quality-governance agent continuously evaluates exit criteria and reports via live data, launching extra tests when needed. Human involvement is confined to strategic governance; on demand, the autonomous AI must supply a transparent, traceable explanation of its actions, input data, and decision rationale.
Technology: Autonomous agents, MLOps/CD integration, real-time data streams
Example tools:
  • End-to-end quality governance platforms (no tools known to operate at this level)

*** Although many tools claim to operate at AI Maturity Level 3, these claims are often exaggerated: the tools typically require significant manual effort, lack true context awareness, and rely heavily on marketing buzzwords such as “self-healing tests,” “autonomous agents,” “AI-driven quality,” “zero-touch automation,” “intelligent test orchestration,” and “continuous risk-based optimization.” In practice, most of these tools operate more like Level 2: they assist people but do not genuinely work alongside them, and they still need detailed prompts, human guidance, and corrections to produce good results. That said, some tools are starting to explore real Level 3 features, and early versions show potential. Progress is slow but steady, with better context awareness and more independence pushing things forward.

  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications

Test control

The activity, as defined by ISTQB, of comparing actual progress with planned progress, analysing variances, and taking corrective actions to meet test objectives.

AI Maturity Levels

Level 0: Non-Existent
Description: All test control is manual, with no AI or automation. Variances are tracked by hand, with no predictive insights or decision support.
Technology: None
Example tools: None

Level 1: One-Off Assist
Description: Test coordinators sporadically prompt an LLM to estimate remaining effort and surface process gaps or test debt.
Technology:
  • Off-the-shelf LLM & prompt engineering
  • MCPs on off-the-shelf LLMs
  • Code completers in IDEs

Level 2: Integrated Assist
Description: Specifically for this subdomain, AI is embedded in the test process to predict schedule slippage and KPI drift and to recommend minor scope changes or scope re-balancing. Tasks are standardised and outputs are consistent. (A minimal slippage-forecast sketch follows this list.)
Technology:
  • LLM + code embeddings, test-data synthesis libraries
  • MCP
  • AI Agents / Autonomous Agents
  • LLMOps (prompt / template management, deployment, guardrails)
  • Vector DBs, BI tooling
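
Schedule slippage prediction can be illustrated with a simple burn-rate extrapolation: derive a forecast completion date from the observed execution rate and compare it with the planned end date. All figures and dates below are invented; integrated tooling would pull them from the test management system and use richer predictive models.

  # Naive schedule-slippage forecast from the observed test execution burn rate.
  # All figures and dates are invented examples.
  from datetime import date, timedelta

  total_cases = 400
  executed_cases = 220
  start_date = date(2024, 5, 1)
  planned_end = date(2024, 6, 15)
  today = date(2024, 5, 28)

  elapsed_days = (today - start_date).days
  burn_rate = executed_cases / elapsed_days                    # cases executed per day so far
  remaining_days = (total_cases - executed_cases) / burn_rate  # days needed at current pace
  forecast_end = today + timedelta(days=round(remaining_days))

  slippage = (forecast_end - planned_end).days
  print(f"Forecast completion: {forecast_end} ({slippage:+d} days vs. plan)")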

Level 3: AI-Human Collaboration
Description: AI agent(s)/system(s) act as an entry-level test coordinator that performs continuous what-if analysis, correlates business impact, and proposes corrective test actions with explanations. It answers complex test control queries, while humans oversee and validate.
Technology:
  • Causal inference models
  • Simulation
  • Agentic planners
  • LLM + code embeddings, test-data synthesis libraries
  • LLMOps (prompt / template management, deployment, guardrails)
  • MCP
  • Vector DBs, BI tooling
Example tools:
  • Agentic frameworks
    • LangGraph as orchestrator, vector DBs as knowledge, prompts and a model for interaction, …

Level 4: Full Autonomy
Description: An autonomous agent simulates, forecasts, adjusts scope/staffing, and answers analytic queries. Human input is strategic only, with explainable outputs.
Technology: Autonomous agents, reinforcement scheduling, closed-loop governance
Example tools: End-to-end adaptive QA orchestrators (no tools known to operate at this level)
  AI Maturity Level: Indicates the level the technology vendors claim to have reached in deploying AI solutions that actually work in real-world applications