Enterprise AI Executive

MIT-Microsoft ranks 101 tasks by agent confidence

Plus, Goldman Sachs industrial AI, Deloitte AI operating models, and more.

Lewis Walker
July 01, 2026

Welcome, executives and professionals. In high-stakes or irreversible scenarios, AI agents can make recommendations, but humans should have the final say.

Since the previous edition, we have reviewed hundreds of the latest insights in agentic and generative AI, spanning best practices, case studies, market dynamics, and innovations.

This briefing outlines what is driving material value — and why it’s important.

In today’s briefing:

The 2026 Agent Confidence Index.
Harnessing AI for the real economy.
Rewiring the operating model for AI.
Readiness, realities and roadblocks.
Transformation and technology in the news.
Insights for Executive+ members.
Career opportunities & events.

Read time: 4 minutes.

MARKET & BEST PRACTICE INSIGHT

The 2026 Agent Confidence Index

Image source: MIT

Brief: MIT, in partnership with Microsoft, surveyed 300 global technology experts, ranking 101 tasks across AI, data, and cloud workflows by their confidence in agents to independently perform each task on their behalf.

Breakdown:

Respondents are exceedingly confident about using agentic AI across a significant share of AI, data, and cloud tasks.
Confidence is highest for generating reports and boilerplate code (image above), with the most room to grow where tasks involve multi-step reasoning.
Agent readiness drops mainly because agents lack business context. Complex tasks demand greater reasoning capability and deeper context.
Keeping humans in the loop remains critical to 59% of respondents, especially for the more complex, higher-stakes tasks.

Why it’s important: The ultimate promise of agents is to manage and coordinate entire workflows, pursuing business goals alongside humans. Given the risks of automated decision-making, teams cannot delegate work without confidence that agents can perform it safely and securely.

IN PARTNERSHIP WITH TELEPORT

Your infrastructure is dynamic. Your credentials aren't.

Brief: GPU nodes join and leave clusters in minutes. Training jobs run for hours or weeks, then terminate. Your access controls weren't built for infrastructure this dynamic. Teleport secures every engineer, node, workload, and AI agent with a unified identity layer.

With Teleport, you get:

Zero standing privileges across every cluster, cloud, and GPU node
Cryptographic identities for engineers, nodes, jobs, and AI agents
Short-lived certificates instead of siloed, static credentials
A unified audit trail that cuts audit prep by up to 80%

From training to deployment, see how Teleport secures every identity across your infrastructure.

MARKET & BEST PRACTICE INSIGHT

Harnessing AI for the real economy

Image source: Goldman Sachs

Brief: Goldman Sachs published a 34-page report on seizing the AI innovation supercycle across the digital and real economies, as AI drives an industrial transformation that is faster, broader, and more capital-intensive than any before it.

Breakdown:

Software incumbents must ship AI-native capabilities, disrupt their own economics, and build toward the emerging value hierarchy.
With the digital infrastructure buildout underway, AI is now moving into the real economy, with physical AI as the next frontier of enterprise value.
Partnerships, sovereign capital, and structured credit now define AI's expansion, tapping capital pools that past technology transitions never needed.
Winning the AI economy takes strategic vision and decisive capital strategy: flexible, bespoke solutions across the capital stack.

Why it’s important: AI is reshaping every sector, from software and cybersecurity to robotics, defense, and manufacturing. The companies that pair technological vision with capital strategy will define the next era of growth, as AI-native startups scale faster than any company in history.

MARKET & BEST PRACTICE INSIGHT

Rewiring the operating model for AI

Image source: Deloitte

Brief: Deloitte surveyed more than 660 technology executives worldwide; 81% say they can deploy and govern AI at scale today, yet nearly 75% expect that their operating model will need to change in the next 12-18 months to sustain progress.

Breakdown:

Leaders won’t be able to scale AI with operating models from an earlier era (technology largely a support function, hierarchical decision-making, etc.).
The market is moving toward a new AI operating model, defined less by control and more by continuous coordination across the firm.
Five shifts will shape the future operating model, including integrated tech leadership, redesigned human-agent work, and dynamic funding.
Reporting relationships (image above) offer one signal of the integration work needed for AI to scale across the enterprise.

Why it’s important: Operating model redesign might be the defining leadership challenge of AI's next phase. The real test isn't where functions sit on an org chart, but how leadership, work, capital, risk, and ecosystem relationships get coordinated toward core outcomes.

AI-NATIVE PROFESSIONAL

How to analyze campaign performance with Claude

Brief: In this guide, you'll learn how to analyze campaign performance data across channels to identify your best and worst performers, then get specific, actionable budget reallocation recommendations for next quarter's spend.

Step-by-step:

Tell Claude to analyze your data and build Excel dashboards and Word documents, with recommendations, not just historical summaries.
Ask Claude to identify patterns and opportunities in the data: what's working, what's not, and where to reallocate resources.
Claude analyzes your marketing data and turns it into a strategic review showing exactly where and how your campaigns can improve.
Ask Claude to reformat past analyses so all dashboards stay consistent, preserving data and insights while restructuring everything to match.

Best practice: Use Research to find industry benchmarks and compare performance across all your connected tools and online sources.

For the full guide, including prompts, upgrade to Executive+ or The Boardroom.

MARKET & BEST PRACTICE INSIGHT

Readiness, realities and roadblocks

Image source: Everest Group

Brief: Everest Group published a 27-page report, supported by NTT DATA and drawing on 112 large enterprises, assessing agentic AI readiness, realities, and roadblocks, and what it will take to move forward today.

Breakdown:

The report examines adoption timelines and scale, mapping the maturity spectrum: Explorers, Experimenters, and Scalers.
It also outlines operating models covering budget and execution ownership, plus funding models and sourcing strategies.
As sourcing approaches mature, enterprises' expectations of ecosystem partners are becoming more clearly defined (image above).
The report also covers evolving roles for agentic AI, building talent, best practices for scaling, and the next frontier.

Why it’s important: Looking ahead, the next 12-18 months will define whether enterprises remain in controlled experimentation or successfully transition into scaled, governed AI autonomy, where human-agent collaboration becomes the new operating model for the enterprise.

Deloitte published a 33-page report finding legal AI adoption surged from 24% to 61% in two years, plus a brief on AI accountability.

McKinsey examined how AI data center investment could nearly triple by 2030, and detailed shifting semiconductor investment flows.

Belfer Center published a 19-page paper arguing firms fixate on cutting labor with AI, missing gains from growth-focused strategies.

BCG published a 15-page brief finding most CPG and retail firms remain stuck piloting AI, and detailed its role in underwriting.

AWS detailed how Stripe built a production agent system on Bedrock, cutting compliance review time by 26% while maintaining oversight.

Capgemini published a 24-page report on how agentic software engineering is reshaping developer roles from coding toward governing.

Anthropic launched Sonnet 5, its most agentic mid-tier model yet, and announced that its 18-day export ban on Mythos 5 has been lifted.

AWS committed $1 billion to a new Forward Deployed Engineering org, embedding engineers in customer teams to speed agentic builds.

Google added two new media models to its API: a low-cost Nano Banana 2 Lite for bulk jobs and Gemini Omni Flash for video work.

OpenAI found a 'compute multiplier' that more than halves inference costs, per The Information, following its Jalapeño chip's debut.

Google reportedly capped Meta's Gemini usage as soaring AI compute demand outpaced capacity, delaying some internal Meta projects.

OpenAI reset usage limits for all Codex users after fixing a fraud detection bug that let accounts burn through quotas too fast.

Join senior leaders from Google, JPMorgan, and PwC getting ahead.

Access Executive AI Index: The top 291 AI playbooks, by industry and function, with direct links to each. Updated weekly.
Get the extended version of Enterprise AI Executive, twice weekly.
Unlock the full AI-native professional guides in each edition.

CAREER OPPORTUNITIES

Samsung - Head of AI Strategy

Mazda - Enterprise AI Senior Director

Johnson & Johnson - AI Director

EVENTS

Oxford Economics - The AI Spending Wave - July 16, 2026

ACM AI Leadership Summit - August 30 - September 2, 2026

The AI Enterprise Conference - September 1, 2026

Reach enterprise AI decision-makers:

66% of readers are C-level executives or VP and Director-level leaders.
63.2% of the audience is based in the U.S., EU, UK, ANZ, and Singapore.
Read by leaders at Microsoft, Deloitte, the Fortune 500, and more.

Guaranteed impression and custom sponsorship packages available, with post-send performance reporting.

Conceived as a practical communication for executives Lewis Walker has worked with, this briefing has become a trusted resource for thousands of senior decision-makers shaping the future of enterprise AI.

We welcome your feedback.

Lewis, Ashley, Mark