Even Google and Replit struggle to deploy AI agents reliably — here's why

2025 was supposed to be the year of the AI agent, right? 

Not quite, acknowledged Google Cloud and Replit — two big players in the AI agent space and partners in the "vibe coding" movement — at a recent VB Impact Series event.

Even as they build out agentic tools themselves, leaders from the two companies say the capabilities aren’t quite there yet. 

This constrained reality comes down to struggles with legacy workflows, fragmented data, and immature governance models. Enterprises also misunderstand a basic point: Agents aren't like other technologies, and adopting them requires a wholesale rethink and reworking of workflows and processes.

When enterprises are building agents to automate work, “most of them are toy examples,” Amjad Masad, CEO and founder of Replit, said during the event. “They get excited, but when they start rolling it out, it's not really working very well.”

Building agents based on Replit’s own mistakes

Reliability and integration, rather than intelligence itself, are the two primary barriers to AI agent success, Masad noted. Agents frequently fail when they run for extended periods, accumulate errors, or lack access to clean, well-structured data.

The problem with enterprise data is that it's messy — structured, unstructured, and stored all over the place — and crawling it is a challenge. On top of that, much of what people actually do is unwritten and difficult to encode in agents, Masad said.

“The idea that companies are just going to turn on agents and agents will replace workers or do workflow automations automatically, it's just not the case today,” he said. “The tooling is not there.” 

A step beyond agents are computer-use tools, which can take over a user's workspace for basic tasks like web browsing. But despite the accelerated hype, these are still in their infancy and can be buggy, unreliable, and even dangerous.

“The problem is computer use models are really bad right now,” Masad said. “They're expensive, they're slow, they're making progress, but they're only about a year old.” 

Replit is learning from its own blunder earlier this year, when its AI coder wiped a company's entire code base in a test run. Masad conceded: “The tools were not mature enough,” noting that the company has since isolated development from production. 

Techniques such as testing-in-the-loop, verifiable execution, and development isolation are essential, he noted, even as they can be highly resource-intensive. Replit incorporated in-the-loop capabilities into version 3 of its agent, and Masad said that its next-gen agent can work autonomously for 200 minutes; some have run it for 20 hours. 

Still, he acknowledged that users have expressed frustration around lag times. When they put in a "hefty prompt," they may have to wait 20 minutes or longer. Instead, users say they want to be involved in more of a creative loop, where they can enter numerous prompts, work on multiple tasks at once, and adjust the design as the agent works.

“The way to solve that is parallelism, to create multiple agent loops and have them work on these independent features while allowing you to do the creative work at the same time,” he said. 
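The parallelism Masad describes can be sketched with standard Python concurrency. This is a hypothetical illustration, not Replit's code: several independent agent loops fan out concurrently while the caller remains free for other work.

```python
import asyncio

async def agent_loop(feature: str, steps: int) -> str:
    """One autonomous agent loop working on an independent feature."""
    for _ in range(steps):
        await asyncio.sleep(0)  # stand-in for a model call or tool execution
    return f"{feature}: done after {steps} steps"

async def main():
    # Fan out several agent loops so independent features progress in
    # parallel, rather than serializing one long-running agent run.
    features = {"auth": 3, "billing": 5, "search": 2}
    return await asyncio.gather(
        *(agent_loop(name, steps) for name, steps in features.items())
    )

results = asyncio.run(main())
print(results)
```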

Agents require a cultural shift

Beyond the technical perspective, there’s a cultural hurdle: Agents operate probabilistically, but traditional enterprises are structured around deterministic processes, noted Mike Clark, director of product development at Google Cloud. This creates a cultural and operational mismatch as LLMs steam in with all-new tools, orchestration frameworks and processes. 

“We don't know how to think about agents,” Clark said. “We don't know how to solve for what agents can do.”

The companies doing it right are driven by bottom-up processes, he noted: no-code and low-code software and tool creation in the trenches, funneling up to larger agents. So far, the successful deployments are narrow, carefully scoped, and heavily supervised.

“If I look at 2025 and this promise of it being the year of agents, it was the year a lot of folks spent building prototypes,” Clark said. “Now we’re in the middle of this huge scale phase.”

How do you secure a perimeter-less world?

Another struggle is AI agent security, which also requires a rethink of traditional processes, Clark noted.  

Security perimeters have been drawn around everything — but that doesn’t work when agents need to be able to access many different resources to make the best decisions, said Clark. 

“It's really changing our security models, changing our base level,” he said. “What does least privilege mean in a perimeter-less, defenseless world?”
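One common answer to Clark's question is to replace the network perimeter with per-request checks against short-lived, narrowly scoped credentials. The sketch below is a hypothetical illustration of that idea (the token shape and scope names are invented, not Google Cloud's API):

```python
import time
from dataclasses import dataclass

@dataclass
class AgentToken:
    """Short-lived, narrowly scoped credential issued for one agent task."""
    agent_id: str
    scopes: frozenset   # e.g. {"crm:read", "tickets:write"}
    expires_at: float   # Unix timestamp after which the token is dead

def allowed(token: AgentToken, action: str) -> bool:
    # Least privilege without a perimeter: every access is checked against
    # the token's explicit scopes and lifetime, not its network location.
    return time.time() < token.expires_at and action in token.scopes

# An agent granted read-only CRM access for five minutes.
token = AgentToken("support-agent-1", frozenset({"crm:read"}), time.time() + 300)
print(allowed(token, "crm:read"))    # within scope and lifetime
print(allowed(token, "crm:delete"))  # denied: scope was never granted
```

The design choice is that denial is the default: an agent can do only what its token explicitly names, which narrows the blast radius when an agent misbehaves.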

Ultimately, there must be a governance rethink on the part of the whole industry, and enterprises must align on a threat model around agents. 

Clark pointed out the disparity: “If you look at some of your governance processes, you'll be very surprised that the origin of those processes was somebody on an IBM electric typewriter typing in triplicate and handing that to three people. That is not the world we live in today.” 
