AI agents will succeed because one tool is better than ten


Three years after Pandora’s box of LLMs was thrown open, companies are no longer talking about chatbots. Instead, every podcast pitch I get mentions AI agents. It’s the new hotness, sure, but the people building this tech also see an opportunity to advance two fundamental computer science concerns: abstraction and automation.

Part of what agents address is something we’ve talked about here on the blog for a while: context switching. Developers love a good flow state, but the vast pile of tools, notifications, red circles, and flashing apps distracts us and pulls us away from the work at hand. Whether the AI industry is a bubble set to burst is as yet unknown, but pop or not, I think AI agents as a way of thinking about AI (and knowledge/technology work in general) will last.

Early in the AI boom, Isaac Lyman wrote a very sharp take that’s guided a lot of my thinking since: AI isn’t the app, it’s the UI. AI agents take this one step further: they are a natural language interface to use every piece of software in your workflow. They can chat, use tools, and write new code all from one interface. And that single interface—that one entry point to your entire productive life—will be the reason agents last.

User interfaces have evolved from the pure text terminals of old MS-DOS and Unix-based systems to highly optimized graphical interfaces tailored to specific software and functionality. Large language models (LLMs) bring us back to that pure text interface, except this time you don’t need to know arcane and esoteric power words to make the most of the environment (shoutout to all prisoners of Vim). You can just speak or type in your native language and get responses, no secret knowledge needed (though prompt engineers may disagree).

AI agents take that language-as-interface idea and add tool use. There have been cross-application APIs and workarounds for a while now, but that access is being standardized by the current push to expose software through APIs and Model Context Protocol (MCP) servers. Now the competitive space in AI isn’t the foundation model; it’s the agentic orchestrator. Natural language plus tool use means there may be a single entry point into your entire suite of tools, technologies, and SaaS products.
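To make that concrete, here is a minimal sketch of the orchestrator pattern: the model emits a structured tool call, and a registry dispatches it to the right piece of software. The tool names and dispatch logic are hypothetical stand-ins, not any real protocol's API — real systems negotiate this over something like MCP.

```python
# Hypothetical stand-ins for two apps in a developer's workflow.
def search_tickets(query: str) -> list[str]:
    """Pretend ticketing-system API."""
    return [t for t in ["fix login bug", "update docs"] if query in t]

def deploy(service: str) -> str:
    """Pretend CI/CD trigger."""
    return f"deploy started for {service}"

# The registry maps tool names to callables; the model produces a
# structured call, and the orchestrator routes it from one entry point.
TOOLS = {"search_tickets": search_tickets, "deploy": deploy}

def dispatch(tool_call: dict) -> object:
    """Route one structured tool call emitted by the model."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["arguments"])

print(dispatch({"name": "search_tickets", "arguments": {"query": "docs"}}))
```

The point isn't the two toy functions; it's that every tool, whatever its native UI, collapses into the same call shape behind one conversational surface.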

In modern software enterprises, developers have access to a ton of tools, but they don’t necessarily know how to use them effectively. Consider all the infrastructure, CI/CD, testing, open-source dependency, security, observability, and data management tools in a modern stack. Developers can waste up to four hours of their work week toggling between these tools. If you’re building a generative AI feature, that’s another pile of tools that are complex and new. Maryam Ashoori, Head of Product for watsonx.ai at IBM, ran a survey and found that developers use five to 15 tools to build GenAI systems alone. “The majority of them said they can’t afford to spend more than two hours on learning a new tool,” she concluded.

Maintaining expertise in all those tools (much less knowing that they exist) can take up a lot of brain space. When you come at a problem—like how to build a software system or feature—you may not understand the breadth of the tools and technologies that this feature touches. “If you work in a large company, there may be dozens if not hundreds of systems,” said Christophe Coenraets, SVP of Developer Relations at Salesforce. “An agent gives you that conversational interface where you can simply say what you want to do. The agent will figure out how to do it right.”

That’s not to say you can skip all your infosec and DevOps reviews or vibe code your way past architectural documents. You’ll still need to plan out a system and have all the stakeholders review it, but an agentic first pass can identify all the little pieces that might get nudged along the way. You can focus on the system design and let your little robot interns read the docs for all your OSS dependencies.

You can see how this would make it easy to stay put in one single window. Many devs live in their terminal or IDE anyway, so providing an automated way to use natural language to access software outside that terminal will embed those devs deeper into their favorite terminal or IDE (“I’m not a prisoner, I live in Vim by choice!”).

The terminal, being a text interface that allows tool use, could be the one interface for developers of the future. “The terminal is already a place where there’s a concept of a long running task,” said Zach Lloyd, founder and CEO of Warp, an agentic terminal application. “It already allows for multitasking. A lot of the primitives for an agentic future are there in the terminal, which would be a crazy full-cycle thing if that’s where people end up. I don’t know if that is what will happen, but there is a lot going for it and a lot of value that our users are getting from these agentic features in the terminal right now.”

Of course, not every application (not even every developer application) can run using pure natural language. There’s a pretty good chance that we’ll still need user interfaces in the magical agentic future so we can fiddle with knobs and point at charts. Some folks point to Star Trek as the pioneer in imagining this system: mostly voice interfaces with graphics for specialized tasks. But all that can exist as a dialog box instantiated from your favorite single application, just like how config or advanced settings exist now.

Things get a little weirder when we let an agent build more features, interfaces, and agents with generated code. Imagine a custom UI that adapts to your use cases and needs, where any feature you need could be added in real time with a prompt. Google has a research demo for this, and while the current implementation is mildly janky, the possibilities are vast and uncharted. “This is the last technology period because everything else will be developed by AI already,” said Illia Polosukhin, co-author of the original “Attention Is All You Need” Transformer paper and co-founder of NEAR.

Okay, let’s tap the brakes. There’s a whole lot here that’s a bit starry-eyed and techno-optimist. For this brave new world of single interfaces running a battalion of rad little agents doing all the dumb work you hate to come to fruition, somebody is going to have to do a ton of work to build them. All the agents that our future selves will have spinning gold will need a bunch of programming, testing, and infrastructure.

Platform engineering teams have become a more prominent part of engineering orgs in the past decade or so. They grew out of DevOps teams managing code in production. As code in production increasingly meant microservices running in a cloud-native environment, folks started treating DevOps as a product, building out features that made it easier to run increasingly distributed code. Eventually, this ballooned into developer experience domains, allowing developers to just write business logic without concerning themselves with infrastructure, interconnectivity, and failure prevention and management.

“We provide the infrastructure itself, the ways to provision the infrastructure, the ways to interact with the infrastructure, and also a lot of things about how you actually develop your code in the moment,” said Caitlin Weaver, senior engineering manager at CLEAR. “So not just where your code goes, but what’s inside it and the processes around working with it. There’s a lot of abstraction that we can safely provide and a lot of detail that we can safely hide to reduce the level of complexity for developers.”

For production software, this means shared dependencies, orchestration of infrastructure and traffic, metrics and observability, security, deployment, and more. Agents running in-house—whether on owned infrastructure or as paid SaaS products—will need much of this, but they also bring additional concerns. You may want to run a suite of models for cost efficiency and evaluation purposes—that needs routing infrastructure. You’ll need code to implement guardrails and governance of prompts and responses. You’ll need a framework that routes tool calls to the right place with the right structured calls and the appropriate auth. You might even need to spin up a few MCP servers.
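Two of those concerns—routing across a suite of models and guardrailing prompts—can be sketched in a few lines. The model tier names and the blocked-term policy below are invented for illustration; a production platform would back these with real endpoints and a real governance policy.

```python
# Hypothetical model tiers a platform team might expose.
MODEL_ROUTES = {
    "cheap": "small-model-v1",   # summaries, classification
    "smart": "large-model-v1",   # planning, code generation
}

# Toy governance policy: terms that must never reach a model call.
BLOCKED_TERMS = {"password", "api_key"}

def route_model(task_kind: str) -> str:
    """Pick a model tier; default to the cheap tier for cost control."""
    return MODEL_ROUTES.get(task_kind, MODEL_ROUTES["cheap"])

def guardrail(prompt: str) -> bool:
    """Return False for prompts that would leak credentials."""
    return not any(term in prompt.lower() for term in BLOCKED_TERMS)

print(route_model("smart"))
print(guardrail("summarize last week's deploys"))
```

The value is in the default: every agent built on the platform gets cost routing and governance without each team rewriting them.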

This agent infrastructure serves the same purpose as production infrastructure: devs only need to think about how to implement the agent logic. Most of the platform engineering work serves to make a system DRYer: everybody has to perform these actions in production, but you don’t want everyone repeating work. “We don’t want the developers to reinvent the wheel every time they’re building an agent,” said Marco Palladino, CTO of Kong. “There are lots of crosscutting requirements that every agent needs to have. The platform teams—now the ball is in their court. Come up with a platform that can help all of these developers build agents that are, by default, secure, observable, governable, and so on.”

A lot of what people will be doing with agents involves data—often proprietary and sensitive to the business. Let’s face it: plenty of apps are just fancy CRUD interfaces, so a lot of the common agentic concerns will be around data access and management. Agents excel (pun intended) at data processing—say, analyze the last three months of traffic logs—so your agentic system will need to connect to data sources safely and funnel that to applications that can use it. That takes a fair bit of planning to get right. “How do I connect the right data?” asked Jeff Hollan, director of product at Snowflake. “How do I clean the data? How do I get the data presentable? All of those tasks that data scientists, and data engineers, and data analysts are doing, can we help them do in an hour what maybe would’ve taken them a day?”
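A common pattern for the safe-connection problem is an allowlist between the agent and the warehouse: the agent's data tool can only read sources the platform team has approved. The source names and rows below are made up for illustration.

```python
# Sources the platform team has approved for agent access.
ALLOWED_SOURCES = {"traffic_logs", "deploy_history"}

# Stand-in for a real warehouse connection.
FAKE_WAREHOUSE = {
    "traffic_logs": [{"day": 1, "requests": 120}, {"day": 2, "requests": 90}],
}

def read_source(name: str) -> list[dict]:
    """Fetch rows for the agent, refusing anything off the allowlist."""
    if name not in ALLOWED_SOURCES:
        raise PermissionError(f"agent may not read {name!r}")
    return FAKE_WAREHOUSE.get(name, [])

# The "analyze the traffic logs" task funnels through the guarded tool.
total = sum(row["requests"] for row in read_source("traffic_logs"))
print(total)
```

The same shape extends to column-level masking or row filters; the key design choice is that access policy lives in the platform, not in each agent's prompt.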

But one of the simpler things that a platform engineering team can do is make visible the capabilities and connections within a system and engineering org. When companies scale, it’s pretty easy to lose track of what’s available, even in terms of the software built by the engineering org. How many seats do you have for any given application? What applications do you have seats for? What MCP servers do you have running? Sometimes just having that registry of tools can inspire people to figure out how to connect them.
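That registry doesn't need to be fancy to be useful. A sketch, with invented entries standing in for a real org's inventory:

```python
# A minimal capability catalog: what's running, what kind, who owns it.
REGISTRY = [
    {"name": "ci-server", "kind": "mcp", "owner": "platform"},
    {"name": "ticketing", "kind": "saas", "owner": "it"},
    {"name": "log-search", "kind": "mcp", "owner": "platform"},
]

def list_by_kind(kind: str) -> list[str]:
    """Answer 'what MCP servers do we have running?' from the catalog."""
    return sorted(e["name"] for e in REGISTRY if e["kind"] == kind)

print(list_by_kind("mcp"))
```

Even a flat list like this answers the "what do we have seats for?" question, and it doubles as the menu an orchestrator can draw on when wiring tools together.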

For consumers, major AI providers are building these capabilities into their products. Any LLM that had a chatbot front end six months ago now has agentic workflows available as plugins. You can build your own workflow there, though you may end up paying for it. If you already have an ecosystem of tools flourishing on your networks, you’re going to want to get your experts to build a platform for those tools to connect to agents.

Like all time-saving automations, the single agentic interface will take work to implement. Ah, but what a world that could be. Plenty of organizations are building the glue and systems that will help you connect to their tools, but you’ll need a platform if you intend to connect to everything that your organization runs internally.

If nothing else, maybe I could close a few of the hundred or so tabs I have open. A boy can dream.


