Agentic DevOps in the Autonomous Cloud

Architecting the Future of Application Modernization

InnoTech Austin 2026

Akshay Mittal, Ph.D.

Member of Technical Staff, Software Engineering · PayPal

From manual orchestration to agentic automation: AI agents in modernization workflows—scale, reliability, speed—and a path from traditional CI/CD toward self-healing, cloud-native systems.

Disclaimer: All content shared represents my personal research and professional interests. This presentation is not affiliated with or endorsed by PayPal or any other organization.

[Legend: anything in square brackets is for you only—timing, stage direction, polls meta—do not read aloud.]

[~1 min. Target 25–30 min through Thank You. Skip References live unless asked.]

Hi everyone—thanks for being here at InnoTech.

I'm Akshay Mittal.

I work as a Member of Technical Staff in software engineering at PayPal, and I have a Ph.D.

[Pause.]

Quick disclaimer: this deck is my personal research and opinions—not PayPal speaking, and not an official company position.

Here's the thread for the next half hour.

Agentic DevOps means AI that doesn't only suggest—it can take bounded actions in your pipelines and operations.

The hard part isn't the demo.

It's modernization without losing control.

[Next slide: one quick show of hands.]

For now—let's talk about why this actually hurts in twenty twenty-six.

The $44.5 Billion Problem

Cloud waste & complexity — industry headline + latest Flexera field data

$44.5B

Harness cloud waste study (2025, “FinOps in Focus”) projects ~$44.5B annual enterprise infrastructure cloud waste (~21% of spend)—pair with Flexera compute + platform waste % on chart.

Use as the attention anchor; pair with chart →
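A quick back-of-envelope check on the headline, as a sketch only: if roughly $44.5B is roughly 21% of spend, the implied spend base is about $212B. The function names and the $10M example bill are illustrative assumptions, not figures from the Harness or Flexera reports, and the two reports measure different populations.

```python
def implied_total_spend(waste_usd: float, waste_share: float) -> float:
    """Back out the spend base implied by a waste amount and its share of spend."""
    if not 0 < waste_share <= 1:
        raise ValueError("waste_share must be in (0, 1]")
    return waste_usd / waste_share


def estimated_waste(spend_usd: float, waste_share: float) -> float:
    """Apply a survey-level waste share to one organization's bill (directional only)."""
    return spend_usd * waste_share


# Headline figures from the slide: ~$44.5B at ~21% of spend
headline_base = implied_total_spend(44.5e9, 0.21)
print(f"Implied spend base: ${headline_base / 1e9:.0f}B")  # about $212B

# Hypothetical $10M annual cloud bill at the 29% survey figure
hypothetical_bill = 10_000_000
print(f"Directional waste: ${estimated_waste(hypothetical_bill, 0.29):,.0f}")
```

Treat the second number the way the notes treat the headline: an anchor for conversation, not a line item.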

ESTIMATED COMPUTE + PLATFORM WASTE (SHAPE): bars '20 through '24 unlabeled · '25 27% · '26 29%

Bars 2020–24: schematic downward era (Flexera: five-year decline). 27% Flexera 2025 report · 29% Flexera 2026 (N=753): first rise (new AI workloads + new services). Not to scale year by year.

73% hybrid estates (Flexera State of the Cloud '26) · 230+ cloud-native foundation projects (CNCF Annual Report '25) · ~7 h/wk lost to manual bottlenecks (GitLab Global DevSecOps '25) · ~50% saw significant DC outage (3 yr, Uptime Inst. survey '25)

The deeper crisis: cognitive overload — not lack of automation.

Signals outpace human judgment · Alert noise & on-call load still rising (DORA, vendor SRE/monitoring reports 2024–26)

[~2 min. Let the big number land—silence beats chatter.]

Forty-four point five billion—Harness's wasted cloud infrastructure study; the formal title on the slide is FinOps in Focus.

It's a headline anchor, not your CFO's line item—treat it that way.

[Point at gray bars.]

Twenty twenty through twenty four are intentionally schematic—shape, not pixel-perfect history.

The labels you can defend in a meeting are twenty seven percent wasted cloud compute and platform services spend—Flexera twenty twenty-five.

Twenty nine percent in twenty twenty-six—first uptick in five years—new AI workloads spreading fast, plus new services.

[Pause.]

[Poll—say out loud, not on slide:]

Hands up if your pain lately is less "we can't automate" and more alerts, approvals, or too many overlapping tools.

[Pause.]

Keep them up if root cause took longer than the fix in the last six months.

[Pause.]

The yellow strip is the punchline for engineers and managers alike.

Hybrid is normal.

The Cloud Native Computing Foundation numbers remind you how big the landscape is.

GitLab's survey clocks serious weekly time lost to manual workflow.

Uptime-style data reminds you outages still happen.

So the bottleneck isn't bash—it's bandwidth.

[Next slide: name the bottleneck—cognition.]

The Cognitive Overload Challenge

Cognitive Overload vs AI Processing Capacity

The bottleneck has moved from execution to cognition

(Cognitive load theory; e.g. Sweller; applied to DevOps)

Traditional DevOps Question:

"How do we automate repetitive tasks?"

Modern Challenge:

"How do we scale operational intelligence?"

Making context-aware decisions about deployments, incident response, and resource optimization across 1000+ microservices

Key Question:

How do we scale operational intelligence without linearly scaling our engineering teams?

The solution isn't more humans—it's AI agents that think and act autonomously

Declarative everything (2026 prerequisite)

Agents need machine-readable truth: IaC (Terraform, Pulumi), K8s manifests, policy-as-code, golden paths. You steer by desired state + guardrails, not ad-hoc scripts—otherwise autonomy has nothing solid to reconcile against.
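To make "reconcile against desired state" concrete, here is a minimal sketch of the comparison an agent would start from. The keys and values are hypothetical; real IaC tools and Kubernetes controllers do far more (ordering, ownership, partial updates).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drift:
    key: str
    desired: object  # None means "not declared"
    actual: object   # None means "not present"


def diff_state(desired: dict, actual: dict) -> list[Drift]:
    """Return every key whose actual value differs from the declared value."""
    drifts = []
    for key in sorted(desired.keys() | actual.keys()):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drifts.append(Drift(key, want, have))
    return drifts


# Hypothetical declared state (from Git) vs. observed state (from the cluster)
desired = {"replicas": 3, "image": "api:1.4.2", "cpu_limit": "500m"}
actual = {"replicas": 2, "image": "api:1.4.2", "cpu_limit": "500m", "debug": True}

for d in diff_state(desired, actual):
    print(f"{d.key}: desired={d.desired!r} actual={d.actual!r}")
```

If `desired` only lives in someone's head, there is nothing to pass as the first argument, which is the point of the callout.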

[~1.5 min. Gesture at the diagram—human curve vs complexity curve.]

Your brain didn't get a major version bump in the last decade.

Kubernetes did.

Microservices did.

AI did.

That gap is where pages happen and SLOs slip.

[Optional joke—say only if room feels warm:]

Count your observability tabs in your head—I'll wait.

[Purple callout:]

Declarative everything isn't vendor poetry.

It's Terraform, manifests, policy-as-code, golden paths—machine-readable truth.

If desired state is tribal knowledge, autonomy has nothing to reconcile against.

You won't hire your way out of this curve.

You need systems that carry context like a staff engineer.

[Next slide: what changes when those systems can act—not just chat.]

From Read-Only to Execution-Ready AI

Traditional vs. agentic shift (Gartner, 2025) | Read-only to execution-ready adoption (McKinsey, 2025)

Traditional AIOps

Ingests data →
Detects anomalies →
Creates alerts →
Recommends actions

Human is the executor

Agentic AIOps

Ingests data →
Reasons →
Plans →
Executes →
Learns

Human is the supervisor

Adoption (illustrative): surveys vary widely—treat ~50% production / ~69% human-verified agentic decisions as talking-point ranges, not audited facts (replace with your vendor’s primary survey before a board pack)

[~1.5 min. Tap red box, then green.]

Red is the world many of us still live in: great alerts, same human holding the screwdriver.

Green is the shift: observe, reason, plan, execute with guardrails, learn.

You don't disappear—you move up a layer.

You become the approver, the auditor, the escalation path.

[Call out the blue adoption line.]

Those percentages are illustrative ranges—surveys disagree—say that so the architects in the room don't torch you in Q and A.

[Poll:]

Who's still recommendation-only?

[Pause.]

Who's let an agent touch sandbox or staging with guardrails?

[Pause.]

Both answers are normal—this is a transition, not a scoreboard.

[Next slide: five-layer architecture—keep it tight.]

Technical Architecture: Inside an AI Operations Agent

Technical Architecture: 5-Layer AI Agent

5-layer architecture enables AI to see, think, plan, act, and learn like a senior SRE

(Standards: IEEE P3833 draft—proactive agent HCI framework; IEEE 3119-2025—AI procurement risk, not stack topology; plus vendor patterns)

1. Perception Layer

Signals from logs, metrics, traces, and user sentiment

2. Reasoning Engine

LLM + knowledge graph; step-by-step reasoning; doc search over runbooks

3. Planning System

Multi-step execution plans with verification and rollback

4. Action Framework

API integrations + guardrails for safe execution

5. Learning Loop

Continuous improvement from outcomes and feedback

Production Implementation:

Prometheus + OpenTelemetry | Vector databases (e.g. Pinecone/Weaviate) | Kubernetes operators | semantic search at scale (embedding counts/accuracy—deployment-specific; cite your own evals)
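The five layers can be sketched as one small loop. This is a toy under stated assumptions: the threshold of 1.0, the signal names, and the runbook dictionary standing in for an LLM plus knowledge graph are all hypothetical, and no real Prometheus or Kubernetes API is called.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Plan:
    steps: list[str]
    rollback: list[str]  # every plan carries its own undo path


@dataclass
class Agent:
    runbooks: dict[str, Plan]  # stand-in for knowledge graph + runbook search
    history: list[tuple[str, bool]] = field(default_factory=list)

    def perceive(self, signals: dict[str, float]) -> list[str]:
        # 1. Perception: keep signals above a hypothetical threshold of 1.0
        return sorted(name for name, value in signals.items() if value > 1.0)

    def reason(self, symptoms: list[str]) -> Optional[str]:
        # 2. Reasoning: first symptom with a known runbook (an LLM would rank hypotheses)
        return next((s for s in symptoms if s in self.runbooks), None)

    def plan(self, hypothesis: str) -> Plan:
        # 3. Planning: look up the steps and the rollback together
        return self.runbooks[hypothesis]

    def act(self, plan: Plan, approved: bool) -> bool:
        # 4. Action: guardrail first, nothing runs without approval
        if not approved:
            return False
        for step in plan.steps:
            print("run:", step)
        return True

    def learn(self, hypothesis: str, executed: bool) -> None:
        # 5. Learning: record the outcome for later review
        self.history.append((hypothesis, executed))

    def handle(self, signals: dict[str, float], approved: bool) -> bool:
        hypothesis = self.reason(self.perceive(signals))
        if hypothesis is None:
            return False  # nothing recognized: leave it to a human
        executed = self.act(self.plan(hypothesis), approved)
        self.learn(hypothesis, executed)
        return executed


agent = Agent(runbooks={
    "p99_latency_ratio": Plan(
        steps=["scale api to 5 replicas", "verify p99 is back under SLO"],
        rollback=["scale api to 3 replicas"],
    )
})
agent.handle({"p99_latency_ratio": 2.4, "error_ratio": 0.2}, approved=True)
```

Each method maps to one layer, so any layer can be swapped out without touching the others.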

[~2 min. Use the diagram as spine—do not read every bullet.]

Five layers—not one magic model.

Perceive broadly: logs, metrics, traces, tickets—not just CPU graphs.

Reason with LLM plus graph and runbooks—chase causation, not vibes.

Plan with verification steps and rollback paths.

Act through bounded APIs into cloud and Kubernetes—with rate limits and humans in the loop.

Learn from outcomes so the same outage doesn't graduate with honors every quarter.

[If standards question comes up:]

P3833 on the slide is draft direction for agent frameworks.

IEEE 3119 is about procurement risk when buying AI—not a diagram of your runtime stack.

[Optional hands:]

Who's shipped LLM ops beyond a Slack chatbot?

[Pause.]

[Next slide: latency scenario—keep it concrete.]

Real-World Implementation

Scenario: API Latency Spike Detection

API Latency Spike Detection Process Flow

Multi-signal correlation (logs + metrics + traces)
Root cause hypothesis generation
Automated remediation plan creation
Human-in-the-loop approval
Execution and verification
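The first step in the flow, multi-signal correlation, can be sketched as grouping signals per service and keeping the services where more than one source fired close together. The two-minute window, the two-source minimum, and the service names are assumptions for illustration; production correlation would also use topology and trace IDs.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    source: str       # "logs", "metrics", or "traces"
    service: str
    timestamp: float  # seconds since epoch
    detail: str


def correlate(signals: list[Signal], window_s: float = 120.0,
              min_sources: int = 2) -> dict[str, list[Signal]]:
    """Keep services where at least `min_sources` distinct sources fired
    within `window_s` seconds of that service's earliest signal."""
    by_service: dict[str, list[Signal]] = defaultdict(list)
    for s in signals:
        by_service[s.service].append(s)

    suspects = {}
    for service, items in by_service.items():
        items.sort(key=lambda s: s.timestamp)
        start = items[0].timestamp
        in_window = [s for s in items if s.timestamp - start <= window_s]
        if len({s.source for s in in_window}) >= min_sources:
            suspects[service] = in_window
    return suspects


signals = [
    Signal("metrics", "checkout", 1000.0, "p99 latency 2.4s"),
    Signal("logs", "checkout", 1030.0, "db connection pool exhausted"),
    Signal("traces", "checkout", 1045.0, "slow span: SELECT orders"),
    Signal("metrics", "search", 1010.0, "cpu 85%"),
]
for service, evidence in correlate(signals).items():
    print(service, [s.detail for s in evidence])
```

The single-source `search` signal is dropped, which is the noise reduction the slide is pointing at.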

Safety Metrics

Example targets: 95% accuracy | 3% false positive | 100% human approval for production changes

Traditional Approach

2+ hours

Manual investigation and remediation

Agentic Approach

3 minutes

Automated detection and remediation

Key Insight

Handle routine issues automatically, escalate complex problems to humans

Safety Protocols:

Confidence <80% → Escalate | High-risk → Approve | All → Audit trail
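The safety strip reads naturally as a routing function. A minimal sketch, assuming a remediation arrives with a confidence score and a risk flag; the class and field names are hypothetical, while the 80% threshold and the production approval rule come from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ESCALATE = "escalate to human"
    NEEDS_APPROVAL = "wait for human approval"
    AUTO_EXECUTE = "execute automatically"


@dataclass
class Remediation:
    action: str
    confidence: float  # 0.0 to 1.0
    high_risk: bool
    environment: str   # e.g. "staging" or "production"


audit_log: list[dict] = []  # every decision is recorded, whatever the outcome


def route(r: Remediation, min_confidence: float = 0.80) -> Decision:
    if r.confidence < min_confidence:
        decision = Decision.ESCALATE        # Confidence <80% -> Escalate
    elif r.high_risk or r.environment == "production":
        decision = Decision.NEEDS_APPROVAL  # High-risk or production -> Approve
    else:
        decision = Decision.AUTO_EXECUTE
    audit_log.append({"action": r.action, "confidence": r.confidence,
                      "decision": decision.name})  # All -> Audit trail
    return decision


print(route(Remediation("restart pod", 0.95, False, "staging")).value)
print(route(Remediation("roll back release", 0.92, False, "production")).value)
print(route(Remediation("resize database", 0.60, True, "production")).value)
```

The confidence check comes first on purpose: a low-confidence guess should not even reach an approver's queue as a ready-made plan.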

[About a minute and a half. This is the late-night incident slide—if you see SREs nodding, you're in the right register.]

Everyone in this room has lived some version of this: an API goes slow, the graph looks guilty, and you're still piecing the story together at two in the morning.

Walk it once with your finger on the diagram, not every bullet. You're telling a story: pull in logs, metrics, and traces together; let the system propose what might actually be wrong; draft a fix plan; a human signs off on anything that could hurt production; then you run it and prove it worked.

The time boxes on the right are illustrative. Your shop might never hit three minutes, or you might beat it in staging—say that plainly so nobody treats it like a benchmark audit.

The point underneath is what matters: the boring, well-understood path is what you automate first. The weird edge cases and the low-confidence guesses still go to a human.

The safety strip isn't there to sound virtuous. We're not trying to remove people from the loop; we're trying to remove heroics and untrusted changes from the loop.

If someone laughs at "three minutes," agree with them. That kind of trust comes from shadow mode, read-only runs, and receipts—not from a slide.

[Next slide: cloud provider arms race—keep it a quick map, not a product catalog.]

The Cloud Provider Arms Race

AWS

Amazon Q Developer - Natural language infrastructure; Cloud Control API MCP Server (1,200+ resources)
Agentic AI on Bedrock - Foundation services for custom agents
DevOps Guru & CodeGuru - ML-powered operational insights

Azure

GitHub Copilot for Azure - GA with Agent Mode (June 2025)
Azure SRE Agent - Autonomous incident response (Build 2025)
GitHub Advanced Security - ML vulnerability scanning

Google Cloud

Vertex AI Agent Builder - Unified workbench; Google Cloud ROI-of-AI reporting: 88% of agentic-AI early adopters cite positive ROI (verify wording in current Google Cloud study)
AI-Orchestrated CI/CD - Adaptive pipeline automation
Cloud AI Platform - Enterprise integration services

Also in play — executable for DevOps teams:

GitLab Duo Agent Platform (GA Jan 2026): Planner Agent, Security Analyst Agent; Agentic Flows run asynchronously (e.g. resolve vulnerability + open MR without human waiting). Qovery Agentic DevOps Copilot: Infrastructure-specific agents; 4-phase maturity (Basic → Agentic → Resilience → Memory). (GitLab, Qovery 2025–26)

Selection Strategy:

AWS: >70% AWS footprint

Azure: GitHub/Microsoft ecosystem

GCP: ML/AI intensive workloads

Reality: 85% multi-cloud orchestration required · See also Microsoft’s Agentic DevOps practice framing (principles & strategic direction, 2025–26)

[~2 min. Dense slide—market map, not flashcards.]

AWS: natural language meets enormous control planes.

Azure: agents where many of you already live—GitHub and Azure.

Google: unified builder story for bespoke agents—and yes, check Google's own ROI wording before you quote the eighty eight percent line verbatim.

[Blue box:]

GitLab Duo and the Qovery platform aren't science fiction—they're "your DevOps vendor shipped automation you can actually run."

[If running long—skip purple grid.]

Pilot where spend and pain already live.

Hybrid isn't shame—it's physics.

[Next slide: open source and glue—credibility for the builders.]

Ecosystem Update: New Entrants & Open Source

Recent DevOps AI & standards (2025–26) — sources verified

Kagent (Cloud Native Computing Foundation sandbox)

First open-source agentic AI for K8s. Solo.io → foundation sandbox (May 2025). MCP-style tools: K8s, Prometheus, Istio, Argo. 100+ contributors, 1,000+ stars in 100 days.

CNCF Blog, Solo.io, GlobeNewswire

Docker AI Agent (Gordon)

Beta Feb 2025. Desktop & CLI: Dockerfiles, troubleshooting, vulns, Hardened Images. Search-backed answers from Docker docs.

Docker Blog, Docker Docs

Model Context Protocol (MCP)

Open standard (“USB-C for AI”). AWS Cloud Control, Opsera, GitHub registry. 7,190+ stars, 320 contributors.

modelcontextprotocol.io, Anthropic
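For builders asking what MCP looks like on the wire: messages are JSON-RPC 2.0, and a tool invocation uses the `tools/call` method with a tool name and arguments. The sketch below only builds the request; the tool name `k8s_get_pods` and its arguments are hypothetical, and transport, initialization, and responses are left out.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests need a unique id


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool exposed by a hypothetical Kubernetes MCP server
print(tool_call_request("k8s_get_pods", {"namespace": "checkout"}))
```

Because the envelope is the same for every tool, guardrails such as allow-lists and audit logging can sit in one place in front of the server.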

Opsera Hummingbird AI

Oct 2025. Reasoning agents; “Insights in a Box,” GitHub MCP. Natural-language why/what/impact/ROI. Cisco, Honeywell, Qualys, Sephora.

Opsera Newsroom, PR Newswire

Harness AI (GA Aug 2025): “Everything after code”—testing, security, deployment, optimization; agentic pipelines, natural-language policy. ~80% SDLC failures post-coding. (Harness Blog, 2025)

The Agentic Tool Landscape (2026 Update)

Not just custom code — major platforms have launched agentic layers

GitLab Duo Agent Platform (GA Jan 2026)

Planner Agent — structure, prioritize, break down work. Security Analyst Agent — vulnerability triage, risk assessment, false positive ID. Agentic Flows — one or more agents combined; run asynchronously in the background (e.g. resolve a vulnerability and open an MR without human waiting). Agentic Chat across Web UI and IDEs.

GitLab Press Release, GitLab Docs (Jan 2026)

Qovery Agentic DevOps Copilot

Infrastructure-specific agent. 4 phases of maturity: Basic (intent-to-tool), Agentic (planning), Resilience (self-correction/retry), Memory (cross-session context). Multi-step workflows, root-cause diagnosis, natural-language infra ops. Read-only mode as default; read-write coming. Console, Slack Bot, MCP Server.

Qovery Blog, Qovery Docs (2025)

Google Agent2Agent (A2A) Protocol

Open protocol (announced Apr 2025) so agents from different vendors can talk to each other—e.g. Salesforce and Google Cloud. Capability discovery via “Agent Cards,” task-oriented communication, enterprise auth. 50+ partners (Atlassian, SAP, Salesforce, ServiceNow, Accenture, Deloitte). Enables cross-platform agent collaboration.

Google Developers Blog, a2a.cx, google.github.io/A2A

Agentic Maturity Model

Where you are vs. where you need to go — after Qovery

1. Basic

Intent → Tool

Simple intent-to-tool mapping, hardcoded logic. Predictable; manual tool chaining. Limited flexibility for unexpected requests.

2. Agentic

Planning

Dynamic planning: analyze request, sequence tool invocations autonomously. Solves unanticipated needs; exposes fragility in tool chaining and errors.

3. Resilience

Self-Correction

Retry logic, robust error handling. Retry with corrected approach; validate intermediate states; re-plan if execution fails.

4. Memory

Cross-Session

Context across multiple requests; follow-up questions; continuous learning. No longer each request in isolation.

Pilot tip: Start at Basic or Agentic with read-only mode as default; enable write only after guardrails and human-in-the-loop are in place. (Qovery: read-only default; read-write coming)
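Two rungs of the ladder in code, as a sketch: Basic is a hardcoded intent-to-tool table, and Resilience wraps it with retry and re-planning. The tools, intents, and namespaces are hypothetical and return canned data; this is not Qovery's implementation.

```python
from typing import Callable

READ_ONLY = True  # pilot default: writes stay off until guardrails exist
KNOWN_NAMESPACES = {"checkout"}  # hypothetical cluster contents


class ToolError(Exception):
    """Raised when a tool call fails in a way a retry might fix."""


def list_pods(namespace: str) -> str:
    if namespace not in KNOWN_NAMESPACES:
        raise ToolError(f"namespace {namespace!r} not found")
    return f"pods in {namespace}: api-1, api-2"  # canned data for the sketch


def restart_deployment(namespace: str) -> str:
    return f"restarted deployments in {namespace}"


# Phase 1 (Basic): hardcoded intent -> (tool, is_write) mapping
TOOLS: dict[str, tuple[Callable[[str], str], bool]] = {
    "show pods": (list_pods, False),
    "restart": (restart_deployment, True),
}


def basic_dispatch(intent: str, namespace: str) -> str:
    tool, is_write = TOOLS[intent]  # KeyError for any unanticipated intent
    if is_write and READ_ONLY:
        raise PermissionError(f"{intent!r} is a write action; agent is read-only")
    return tool(namespace)


# Phase 3 (Resilience): on failure, re-plan with the next candidate and retry
def resilient_dispatch(intent: str, candidates: list[str]) -> str:
    errors = []
    for namespace in candidates:
        try:
            return basic_dispatch(intent, namespace)
        except ToolError as exc:
            errors.append(str(exc))
    raise ToolError("all attempts failed: " + "; ".join(errors))


print(resilient_dispatch("show pods", ["default", "checkout"]))
```

Note that the read-only check is not retried: a permission failure is a policy decision, not a transient error.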

[~1.5 min. Ladder story—left to right.]

Basic: safe, predictable, runs out of runway.

Agentic: where demos get spicy and toolchains break loudly.

Resilience: retries, replans, reality.

Memory: you stop re-explaining your architecture every Monday standup.

[Poll—say aloud:]

Think about your real platform team, not a vendor demo. Which phase are you in most days: one, two, three, or four?

[Pause.]

[If mostly ones and twos: normalize it. If many threes: light joke—they're hiring.]

Yellow box once: read-only default, human gates, then write.

[Next slide: security—earn the room before optimism.]

The Security Crisis: Shadow AI & Non-Human Identities

Addressing the #1 objection proactively

The Threat: Shadow AI

Unmanaged AI agents and identities operating outside oversight. In 2026 the risk isn’t only humans pasting secrets—it’s unmanaged agents doing “vibe coding” (rapid, unvetted AI generation) and creating infrastructure backdoors. Non-human identities (NHIs)—service accounts, workload identities, APIs, agents—already represent the majority of identities in many enterprises (e.g. 45:1–92:1 NHI-to-human in some studies). Agents autonomously initiate actions and access data with unmanaged credentials; traditional identity models break down.

DoControl, IBM, Astrix, Hush, Token Security (2025–26)

The Solution: Identity-Based Security

Every agent gets its own non-human identity (NHI) with strictly scoped RBAC (Role-Based Access Control). No shared or hardcoded secrets. Runtime guardrails: continuous monitoring, policy enforcement, anomaly detection. Kill switch: one-click revocation and remediation. Start agents in read-only mode; promote to write only with approval gates and audit trails. OpenID and industry frameworks (2025) outline auth and authorization for agentic systems.

Akeyless, Hush, OpenID Foundation (2025)

Takeaway: Security isn’t a reason to avoid agentic DevOps—it’s a reason to govern it. NHI + RBAC + runtime guardrails + kill switch + read-only default = the baseline for production agents.
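The baseline in the takeaway fits in a few dozen lines. A minimal in-memory sketch, assuming just two scopes ("read" and "write"); the class names are hypothetical, and a real deployment would use the cloud provider's IAM or a secrets platform rather than a Python dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class AgentIdentity:
    agent_id: str
    scopes: set[str] = field(default_factory=lambda: {"read"})  # read-only default
    revoked: bool = False


class IdentityRegistry:
    def __init__(self) -> None:
        self._agents: dict[str, AgentIdentity] = {}
        self.audit: list[tuple[str, str, bool]] = []  # (agent_id, event, allowed)

    def register(self, agent_id: str) -> AgentIdentity:
        identity = AgentIdentity(agent_id)  # one identity per agent, never shared
        self._agents[agent_id] = identity
        return identity

    def promote_to_write(self, agent_id: str, approved_by: str) -> None:
        if not approved_by:
            raise PermissionError("write access needs a named human approver")
        self._agents[agent_id].scopes.add("write")
        self.audit.append((agent_id, f"promoted by {approved_by}", True))

    def kill(self, agent_id: str) -> None:
        self._agents[agent_id].revoked = True  # kill switch: one call revokes all access
        self.audit.append((agent_id, "revoked", True))

    def authorize(self, agent_id: str, action: str) -> bool:
        identity = self._agents.get(agent_id)
        allowed = (
            identity is not None
            and not identity.revoked
            and action in identity.scopes
        )
        self.audit.append((agent_id, action, allowed))
        return allowed


registry = IdentityRegistry()
registry.register("latency-agent")
print(registry.authorize("latency-agent", "write"))  # False: read-only by default
registry.promote_to_write("latency-agent", approved_by="on-call lead")
print(registry.authorize("latency-agent", "write"))  # True after approval
registry.kill("latency-agent")
print(registry.authorize("latency-agent", "read"))   # False once revoked
```

Denied requests are audited too, which is what makes the kill-switch drill in the notes verifiable afterward.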

[~2 min. Tone: pragmatic CISO ally, not cheerleader.]

Shadow AI isn't "people are reckless."

It's workloads and agents showing up without owners, budgets, or reviews.

Machine identities—service accounts, bots, API keys—are boring until they're the blast radius: shared tokens, accounts multiplying out of control, agents acting with mystery credentials.

[Green column—plain English:]

Scoped identities.

Role-based access—who can do what—that fits reality.

Runtime guardrails.

A kill switch that actually works in a drill—not only on a slide.

Read-only first—promote to write with approvals and audit.

If someone yells hallucinations—agree—that's why pilots start read-only.

[Next slide: surveys—pair hype with this discipline.]

Research & Reality: Platform Engineering & Incident AI

2025–26 surveys & vendor reality — adoption, gaps, and frontiers

State of AI in Platform Engineering 2025

88% use AI daily (code gen 75%, docs 71%). 73% say AI is central to org goals; 90% expect it to transform their future. But: 59% report skill gaps; 56% worried about hallucinations; “implementation plateau” between experiments and measurable ROI.

Platform Engineering.org, Weave Intelligence, Vultr (204 respondents)

Incident Management & SRE

79% of teams exploring AI for incident trending (Atlassian 2025, 500+ respondents). 74% cite security as barrier to expanding AI. Research: multi-agent LLM systems (e.g. STRATUS) 1.5× prior SRE agents; IBM LLM-assisted anomaly detection—500+ users, 200K+ API calls/year.

Atlassian, IBM Research, arxiv (2025)

Agentic Quality Engineering (2026)

Vendors like Tricentis now market end-to-end agentic QE—AI interpreting change risk, auto-directing tests, NL → tests, performance agents—so velocity from Copilot-class tools doesn’t outrun verification. (Tricentis press/blog, 2025–26)

Research: authority transfer in CI/CD

Academic framing: moving from “assist” to delegated agency in pipelines—design constraints, not vibes (e.g. arXiv:2605.07062, 2026).

Takeaway

Adoption is mainstream; durable ROI needs QE + governance + golden paths. Platform engineers as “architects of enterprise AI.” Continuous AI (GitHub) and agentic CI handle judgment-heavy tasks rules can’t. Human oversight and security remain non-negotiable.

[~2 min. Two-column trick: left is optimism, right is friction—mixed audiences need both.]

Platform teams live in the left column daily—AI in the loop is normal now.

The right column is the grown-up footnote—security still gates expansion.

Say the quiet part: many shops are past novelty and stuck in an implementation plateau—pretty demos, fuzzy ROI.

[Pink strip—one breath:]

Agentic quality engineering is how tests and risk signals keep up when Copilot-class tools ship diffs faster than humans can read them.

[Indigo strip—for the engineer who wants citations:]

For the citation crowd: there's a twenty twenty-six open archive paper on who's allowed to change the pipeline—authority hand-off in CI/CD—not vibes. The ID is on the slide if they want to look it up.

[If short on time: read only yellow takeaway box.]

[Next slide: numbers—say "directional" three times.]

Quantified Business Impact

Metrics from Real Implementations

Sources: Forrester Wave DevOps Platforms Q2 2025; industry impact benchmarks · See also Forbes / Cortex — “Quality Tax” discussion (AI-accelerated dev)

The “Quality Tax” (counterweight to speed): Industry coverage cites striking figures—for example ~43% of AI-generated code still needing production debugging post-QA/staging in some analyses, alongside telemetry such as +23.5% incidents per PR and ~+30% change failure rate signals in surveyed org workflows (reporting citing Forbes contributors & Cortex, 2025–26). Use as directional risk sizing, not a promise for your KPI sheet.

45%

Reduction in deployment lead times

50%

Decrease in production incidents

80%

Faster incident resolution (MTTR)

30%

Improvement in developer productivity

60%

Reduction in cloud infrastructure costs

3.2x

Faster time to market for new features

Embedding agents into workflows (vs bolt-on): 30–50% faster processes, up to 40% reduction in low-value work (BCG research cited by Dynatrace, 2026). Industry benchmarks: 50% faster processing of operational inquiries; up to 80% toil reduction in resolution workflows where agents are used end-to-end.

CHALLENGE: Calculate potential savings for your team.

What's 80% MTTR improvement or 50% faster inquiry resolution worth to your organization?
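One way to take the challenge literally is a small calculator. Every input below is a hypothetical placeholder to replace with your own numbers; only the 80% reduction and the two-hour baseline echo figures from these slides, and the 40% row is there as a conservative comparison.

```python
def annual_mttr_savings(incidents_per_year: int,
                        baseline_mttr_hours: float,
                        mttr_reduction: float,
                        cost_per_incident_hour: float) -> float:
    """Directional estimate: incident hours avoided times the cost of an incident hour."""
    if not 0 <= mttr_reduction <= 1:
        raise ValueError("mttr_reduction must be between 0 and 1")
    hours_saved = incidents_per_year * baseline_mttr_hours * mttr_reduction
    return hours_saved * cost_per_incident_hour


# Hypothetical team: 120 incidents a year, 2-hour baseline, $5,000 per incident hour
for label, reduction in [("slide headline (80%)", 0.80), ("conservative (40%)", 0.40)]:
    savings = annual_mttr_savings(120, 2.0, reduction, 5_000)
    print(f"{label}: ${savings:,.0f} per year")
```

Showing both rows side by side keeps the number honest: the business case should still hold at the conservative figure.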

Case Study: Thomson Reuters - GitHub Copilot at Enterprise Scale

Challenge

Developer efficiency across 10,000+ engineers

Solution

Structured 7-week Copilot pilot with 100+ senior engineers

Key Innovation

Agent-assisted code review and automated testing

46%

Faster task completion

39%

Improvement in code quality

45%

Faster PR velocity

$2.3M

Annual productivity savings

Implementation Context: 7-week pilot with 100+ engineers → scaled to 2,000+ developer seats | GitHub Analytics & surveys | 68% positive UX

Source: GitHub Universe 2025 / Thomson Reuters AI adoption (GitHub Inc., 2025)

Key Insight:

Engineers who were initially skeptical became the strongest advocates; AI coding assistance is now mandatory for all new projects

[About a minute to ninety seconds. Lead with the name once—Thomson Reuters—then spend your time on how they ran the pilot, not on reading every number off the slide.]

Frame it like this: more than ten thousand engineers is not a flex by itself. It's a coordination problem. The interesting part is that they didn't just flip a switch; they ran a structured seven-week pilot with over a hundred senior engineers and actually measured what happened.

Let the big tiles do the work: faster task completion, better reported code quality, faster pull requests. You can say "mid-forties percent" if you don't want to sound like you're selling mattresses.

The dollar savings line is real headline material, but it's also the easiest line for a skeptic to poke. If you're not ready to defend the GitHub Universe sourcing in Q and A, skip the dollar figure out loud and keep the qualitative story.

The line I would not skip is the human one. A lot of engineers walked in skeptical. They turned into advocates when the program had grown-up measurement—analytics, surveys, a real feedback loop—not when someone said "trust the vibe."

Close that thought in one sentence: once leadership saw evidence, Copilot-style assistance stopped being optional for new work. Evidence first, mandate second.

[Next slide: ninety-day roadmap—tell them it's a scaffold they can steal, not a contract.]

Your 90-Day Roadmap to Agentic Operations

1-30

Assess & Govern

Identify top 3 operational bottlenecks
Establish AI Governance Council
Define runtime controls: agent IDs, audit trails, human override
Select low-risk, high-impact pilot

Team: 1 PM (0.5 FTE), 2 Eng (0.3 FTE), 1 Arch (0.2 FTE) · Budget: $50K · Out: Governance, pilot selected

31-60

Pilot & Learn

Deploy AI coding assistant to pilot team
Agentic AIOps in read-only mode (default); runtime guardrails
Train: prompt engineering, AI supervision

Team: +3 pilot (0.4 FTE), 1 ML (0.3 FTE) · Budget: $150K · Out: AI assistant, 80% accuracy

61-90

Automate & Expand

Enable first execution-ready agent
Analyze results, build business case
Develop long-term roadmap

Team: 8 total (0.3 FTE each) · Budget: $100K · Out: First autonomous action, ROI

Success Metrics: 80% MTTR reduction · 50% deployment acceleration · $2M+ annual savings · 90% team adoption
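The roadmap numbers add up as follows, in a sketch that uses only the three budget lines and the $2M savings target from the slide. The 25% realization scenario is an assumption added for sensitivity, and the slide does not say whether the budgets include staff time, so FTE cost is left out.

```python
# Budget lines from the three phases on the slide
phase_budgets = {
    "Assess & Govern (days 1-30)": 50_000,
    "Pilot & Learn (days 31-60)": 150_000,
    "Automate & Expand (days 61-90)": 100_000,
}


def payback_months(total_cost: float, annual_savings: float) -> float:
    """Months of savings needed to cover the up-front cost."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return total_cost / (annual_savings / 12)


total = sum(phase_budgets.values())
print(f"90-day budget: ${total:,}")

target_savings = 2_000_000  # "$2M+ annual savings" success metric
for label, realized in [("target met in full", 1.0), ("25% of target realized", 0.25)]:
    months = payback_months(total, target_savings * realized)
    print(f"{label}: payback in {months:.1f} months")
```

Even the pessimistic row pays back inside a year, which is a more defensible sentence for a VP than the headline target alone.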

COMMITMENT CHECK: Who will start Phase 1 within 30 days? [Show of hands]

Who wants the detailed implementation checklist?

[~2 min. Ninety days is a stunt headline—defend it as staged risk reduction.]

Thirty days: pick pain, pick pilot, write governance that fits your culture—not a PDF museum.

Sixty days: shadow mode, read-only agents, teach supervision—not prompt theater.

Ninety days: one narrow execution path with receipts you can show a VP.

[Budget and FTE micro-font: skim only—it's for photo-of-slide people.]

[Commitment check—say sincerely, not pep rally:]

Who will start phase one inside thirty days?

[Pause.]

No QR checklist?

Say you'll follow up on LinkedIn—and actually do.

[Next slide: seven takeaways—don't read the wall.]

Key Takeaways

1. Autonomy is the New Automation

The industry is moving from "read-only" recommendations to "execution-ready" AI agents

2. Govern Before You Automate

Governance includes runtime controls (agent IDs, audit trails, human override), not just policy. Trust and accountability are prerequisites.

3. Platform Engineering is the Governance Layer

Agents are the engine; the IDP defines the Golden Paths agents are allowed to walk—making autonomy scalable and governed

4. The Cloud Providers are All-In

AWS, Azure, and GCP are building their futures around agentic AI

5. Start Now

Audit your current operational bottlenecks this week
Establish AI governance council within 30 days
Select pilot use case by month-end

6. Safety + Agentic QE

Autonomous doesn't mean unsupervised. Runtime guardrails, human-in-the-loop, and a kill switch are non-negotiable. Pair speed with agentic quality engineering (risk-aware test direction, NL→tests) so AI-generated change doesn’t outrun verification. Start pilots in read-only mode; enable write only after guardrails and audit trails are in place.

7. The Multi-Agent Future

By 2026, specialized agents collaborate: Code-Gen → Security-Scan → Deploy → Monitor. Ecosystems already combine AWS Kiro, GitHub Copilot, ServiceNow Assist, Azure SRE Agent. Start building now.

Nature Machine Intelligence (2025) · trade press on multi-agent roadmaps · Dynatrace/BCG on embedded agents—cite specific URLs before filing.
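The chain in takeaway 7 can be sketched as a list of narrow stages where any stage can halt the run. The stage logic here is deliberately trivial and hypothetical (a string match standing in for a real scanner); the point is the hand-off shape, not the agents themselves.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Change:
    diff: str
    findings: list[str] = field(default_factory=list)
    trail: list[str] = field(default_factory=list)  # audit trail of stage outcomes


Stage = Callable[[Change], bool]  # return False to halt the chain


def code_gen(change: Change) -> bool:
    return bool(change.diff)  # stand-in: a real agent would produce the diff


def security_scan(change: Change) -> bool:
    if "password=" in change.diff:  # toy rule standing in for a scanner
        change.findings.append("hardcoded secret")
        return False
    return True


def deploy(change: Change) -> bool:
    return True  # stand-in for a gated rollout


def monitor(change: Change) -> bool:
    return True  # stand-in for post-deploy verification


CHAIN: list[tuple[str, Stage]] = [
    ("code-gen", code_gen),
    ("security-scan", security_scan),
    ("deploy", deploy),
    ("monitor", monitor),
]


def run_chain(change: Change, stages: list[tuple[str, Stage]]) -> bool:
    for name, stage in stages:
        ok = stage(change)
        change.trail.append(f"{name}: {'ok' if ok else 'halted'}")
        if not ok:
            return False  # later stages never see a rejected change
    return True


risky = Change(diff="+ db_url = 'postgres://app:password=hunter2@db'")
print(run_chain(risky, CHAIN), risky.trail, risky.findings)
```

Each stage stays small enough to audit on its own, which is the practical argument for teams of narrow agents over one large one.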

[Roughly ninety seconds to two minutes, depending how much of the right column you use. Sound like you're closing a retro, not selling a keynote trailer.]

There are seven bullets up here. I'm not going to read them line by line—that would put us both to sleep.

Instead, three things I want you to carry out the door.

First: the shift is real. We're moving from AI that whispers suggestions to AI that can actually run the play—under supervision, but execution-ready, not just chat.

Second: none of that works without governance first. Runtime controls, identities you can audit, humans who can still say no. Policy decks alone don't cut it anymore.

Third: platform engineering is where that becomes boring and safe. Golden paths aren't marketing—they're how you let agents move fast without turning every repo into the Wild West.

[If you have another minute, skim the right column with your hand—don't narrate every sub-bullet.]

Five is the homework slide: audit bottlenecks, stand up a small governance group, pick one pilot. Say it plain: start this month, not "when things calm down."

Six is the one I want security and quality folks to hear in full. Autonomous does not mean unsupervised. Read-only pilots, real guardrails, a kill switch, and quality engineering that keeps up with how fast the machines generate change. That's the bar for production.

Seven is optional color: we're already seeing chains of specialized agents—generate, scan, deploy, watch. If you only have one sentence left, say we're heading toward teams of narrow agents, not one god model, and leave the vendor names on the slide.

This isn't science fiction on a five-year horizon. It's the conversations people are already having in hallways and Slack threads.

[If the room feels open, ask one real question—then stop, thank them, and hand it back.]

Thank You

The future of DevOps is autonomous.

Your competitive advantage depends on how quickly you start.

Let's Connect

Scan QR code to connect on LinkedIn

Disclaimer: Personal research and professional interests only. Not affiliated with PayPal or any organization.

LinkedIn Profile

Questions? Let's discuss how to implement agentic operations in your organization.

References

Atlassian. (2025). State of AI in incident management / ITSM coverage (verify exact report title on atlassian.com).

CNCF. (2025). Kagent: Bringing Agentic AI to Cloud Native. CNCF Blog.

CNCF. (2025). Annual Report 2025 (project counts, ecosystem).

Datadog. (2025). State of Monitoring / observability research. Datadog Inc.

Docker. (2025). Docker AI Agent (Gordon) beta. Docker Blog.

DORA / Google Cloud. (2025). State of AI-assisted Software Development 2025. dora.dev.

DORA. (2025). Accelerate State of DevOps Report 2025. Google Cloud.

Flexera. (2025). State of the Cloud Report 2025 (IaaS/PaaS waste, survey methodology).

Flexera. (2026). State of the Cloud Report 2026 (N=753; 29% IaaS/PaaS waste; hybrid trends). Flexera.

Forbes Business Council. (2026). Contributor articles on AI-accelerated development risk (“Quality Tax” framing—verify author/title).

Cortex. (2025–26). Engineering metrics & change failure discussions (vendor blog/docs).

Forrester. (2025). The Forrester Wave™: DevOps Platforms, Q2 2025. Forrester Research.

Gartner. (2024–25). Cloud forecasts & hybrid/multi-cloud trend press releases and hype cycles. Gartner Inc.

GitLab. (2025–26). Global DevSecOps Report 2025; GitLab Duo Agent Platform GA (Jan 2026 press/docs).

GitHub. (2025). Thomson Reuters AI adoption / Copilot case materials. GitHub resources.

Google. (2025). Agent2Agent (A2A) Protocol. Google Developers Blog; a2a.cx; google.github.io/A2A.

Harness. (2025). FinOps in Focus / infrastructure waste projection (~$44.5B)—press release & report. Harness.

Harness. (2025). Harness AI SDLC announcements. Harness.io Blog.

IBM Research. Operational AI / SRE analytics (verify specific paper or product brief).

IEEE SA. (2025). IEEE 3119-2025 (AI procurement). IEEE Standards Association.

IEEE SA. (2025). IEEE P3833 (draft PAR—proactive AI agent framework). IEEE Standards Association.

Model Context Protocol. (2025). Specification. modelcontextprotocol.io.

McKinsey & Company. (2025). Enterprise AI automation reports (verify exact title before citing).

Microsoft. (2025–26). Agentic DevOps framing. Microsoft Developer Blogs.

Mittal, A. (GitHub). AI-Augmented DevOps… github.com/akshaymittal143/AI-Augmented-DevOps

OpenAI. Model cards & evals—verify any benchmark numbers against primary OpenAI + benchmark publishers.

Tricentis. (2025–26). Agentic quality engineering announcements. Tricentis.

Barnes, M. E., Ghaleb, T. A., & Hassan, S. (2026). From Assistance to Agency: Rethinking Autonomy and Control in CI/CD Pipelines. arXiv:2605.07062.

Nature Machine Intelligence. (2025). Volume 7—multiple peer-reviewed articles on AI risk, transparency, deployment (use specific DOIs when citing).

OpenID Foundation. Identity and authorization materials relevant to agents (verify whitepaper title).

Opsera. (2025). Hummingbird AI / MCP announcements. Opsera Newsroom.

Qovery. (2025). Agentic DevOps Copilot; maturity phases. Qovery Blog & Docs.

PagerDuty. Annual reporting & operations research (verify year/title).

Platform Engineering.org / Weave Intelligence. State of Platform Engineering / AI surveys (verify edition).

Security / NHI. (2025–26). Shadow AI & non-human identity vendors (Astrix, Akeyless, etc.)—use vendor primary sources.

Solo.io. (2025). Kagent framework & CNCF Sandbox donation materials.

Sweller, J. Cognitive load theory (general education literature—use established CLT sources; do not invent venue-specific citations).

Uptime Institute. (2025). Global Data Center Survey / outage analysis (verify statistic wording against Uptime PDFs).