[Agents] · PAP-YJ58Q2 · April 19, 2026

How are AI agents used? Evidence from 177,000 MCP tools


M. Stein

4 min read · Agents · Tool Use · Safety · Open Source

Core Insight

AI agent tool use shifted sharply toward action: the share of 'action' tools in usage soared from 27% to 65% in just 16 months.

By the Numbers

177,436

total AI tools evaluated

65%

current usage of action tools

27%

initial usage of action tools

67%

tools used in software development

90%

MCP server downloads for software development

In Plain English

This paper analyzes 177,436 AI tools exposed via MCP servers between November 2024 and February 2026. It finds that action tools now account for 65% of usage, up from 27%, and highlights their growing role in software development and financial tasks.

Knowledge Prerequisites

git blame for knowledge

To fully understand "How are AI agents used? Evidence from 177,000 MCP tools", trace this dependency chain first. Papers in our library are linked — click to read them.

DIRECT PREREQ · IN LIBRARY
Toolformer: Language Models Can Teach Themselves to Use Tools

Understanding how language models can autonomously teach themselves to optimize tool usage is essential to grasp the foundation of AI agents and MCP tools.

autonomy in AI · tool usage in AI · self-improvement
DIRECT PREREQ · IN LIBRARY
Training Compute-Optimal Large Language Models

This paper teaches the principles of training large language models in a resource-efficient manner, necessary for understanding the scalability of AI agents as used in MCP tools.

compute optimization · training efficiency · model scalability
DIRECT PREREQ · IN LIBRARY
AgentBench: Evaluating LLMs as Agents

Provides insight into the evaluation of AI agents, crucial for comprehending the benchmarking of such models in numerous applications like MCP tools.

evaluation benchmarks · agent performance · task-specific evaluation
DIRECT PREREQ · IN LIBRARY
High-Resolution Image Synthesis with Latent Diffusion Models

Understanding image synthesis offers a perspective on how AI tools interact with visual data, which is often crucial in the analysis of AI agents' applications in MCP tools.

image synthesis · latent diffusion models · visual data interaction
DIRECT PREREQ · IN LIBRARY
Emergent Abilities of Large Language Models

This paper explores the unexpected abilities of large language models, foundational for understanding the capabilities leveraged by AI agents in MCP tools.

emergent abilities · unexpected capabilities · large language model phenomena

YOU ARE HERE

How are AI agents used? Evidence from 177,000 MCP tools

The Idea Graph

11 nodes · 11 edges
2,222 words · 12 min read · 11 sections · 11 concepts

Table of Contents

01

The World Before: AI Tools and Their Limitations

262 words

Before the significant advancements in AI agent tools, the landscape of automation and intelligent systems was largely limited to basic automation scripts and rudimentary decision-making algorithms. These tools were often constrained by their inability to adapt to new data or environments without human intervention. The reliance on manual updates and oversight meant that many processes were inefficient and error-prone, particularly in dynamic fields like software development and finance.

Imagine if you were a software developer in 2023. Most of your tools could assist with syntax checks or basic code suggestions, but they lacked the capability to understand complex project contexts or execute code changes autonomously. This limitation not only slowed down the development process but also increased the likelihood of bugs and security vulnerabilities. Similarly, in finance, automated systems could process transactions based on pre-defined rules but struggled with real-time decision-making and fraud detection.

The specific failure here was the lack of autonomy in AI systems. These tools required constant human oversight, which prevented organizations from fully realizing the benefits of digital automation. Prior attempts to address these issues often involved incremental improvements to existing algorithms or the introduction of more data for training, but these solutions failed to deliver the level of adaptability and decision-making power needed for true autonomy.

This paper addresses these limitations by exploring the evolution and usage of AI agent tools, focusing on how they have transformed from simple automation scripts to sophisticated entities capable of autonomous action. The study leverages data from MCP servers to provide an in-depth analysis of these tools' capabilities and usage trends.

02

The Specific Failure: Limited Autonomy in AI Tools

229 words

At the heart of the challenges faced by traditional AI tools was their limited autonomy. These tools were often designed to perform specific tasks but lacked the ability to adapt to changing conditions or make decisions without human input. This limitation was particularly pronounced in high-stakes environments, where the ability to act swiftly and accurately is crucial.

Consider a scenario in online banking, where an AI system is responsible for detecting fraudulent transactions. With limited autonomy, such a system might flag transactions based solely on preset parameters, missing more nuanced cases that require contextual understanding. This inability to adapt could lead to both false positives, inconveniencing customers, and false negatives, allowing fraudulent activities to go unnoticed.

The failure mode here was clear: without the ability to process new information and adjust their actions accordingly, AI tools were essentially static entities in a world that demanded dynamism. Prior solutions attempted to enhance decision-making capabilities by adding more rules or expanding data sets, but these approaches often resulted in increased complexity without significantly improving outcomes.

This study identifies the rise of action tools as a pivotal development in overcoming these limitations. By focusing on tools that can modify external environments autonomously, the researchers highlight a shift towards more intelligent and adaptable systems. This shift is crucial for industries that rely on rapid and accurate decision-making, such as software development and finance.

03

The Key Insight: Autonomous Action Tools

202 words

The core insight driving this research is the transformative potential of autonomous action tools. These tools represent a significant departure from traditional AI systems by enabling direct interaction with external environments. Imagine if an AI tool could not only suggest code improvements but also implement them, test the results, and revert changes if necessary. This level of autonomy would drastically reduce the time and effort required for software development, while also minimizing the risk of human error.

In the context of financial services, action tools could execute transactions, manage portfolios, and perform real-time risk assessments without constant human oversight. This capability is particularly valuable in volatile markets, where timely decisions can have substantial financial implications.

The analogy here is that of a skilled assistant who not only understands the task at hand but also takes proactive steps to complete it efficiently. By empowering AI tools with the ability to act autonomously, organizations can streamline operations, reduce costs, and improve outcomes across various domains.

This insight is central to the paper's analysis, as it underscores the growing reliance on AI tools that go beyond passive assistance. The researchers argue that embracing these capabilities is essential for staying competitive in an increasingly automated world.

04

Architecture Overview: Understanding MCP Servers

190 words

Model Context Protocol (MCP) servers form the backbone of the AI tool ecosystem discussed in this paper. These servers provide the infrastructure necessary for deploying and scaling AI agent tools, ensuring that they can operate efficiently across different environments.

At a high level, MCP servers function as centralized repositories where AI tools are stored, executed, and monitored. They offer computational resources that enable these tools to process large volumes of data and perform complex tasks in real time. This capability is crucial for supporting the increasing demand for AI solutions in industries like software development and finance.

Imagine MCP servers as bustling marketplaces where AI tools are constantly exchanged, tested, and improved. They facilitate collaboration among developers by providing a platform for sharing and updating AI solutions. This collaborative environment accelerates innovation and allows organizations to quickly adopt new tools that meet their specific needs.

The architecture of MCP servers is designed to be scalable and flexible, accommodating a wide range of AI tool types, including perception tools, reasoning tools, and action tools. By leveraging these servers, developers can focus on enhancing tool capabilities without worrying about the underlying infrastructure.
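To make the tool-declaration shape concrete, here is a minimal sketch. The tool names and schemas below are invented for illustration; only the declaration shape (a name, a description, and a JSON-Schema `inputSchema`) mirrors the MCP specification, and a real server would expose these over JSON-RPC methods such as `tools/list` and `tools/call` rather than plain functions.

```python
# Hypothetical tool registry sketching how an MCP server declares tools.
# One example per category discussed in this article: perception,
# reasoning, and action.
TOOLS = {
    "read_file": {  # a "perception" tool: gathers data, no side effects
        "description": "Return the contents of a file",
        "inputSchema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    "summarize": {  # a "reasoning" tool: turns data into insight
        "description": "Summarize a block of text",
        "inputSchema": {
            "type": "object",
            "properties": {"text": {"type": "string"}},
            "required": ["text"],
        },
    },
    "write_file": {  # an "action" tool: modifies the external environment
        "description": "Write text to a file",
        "inputSchema": {
            "type": "object",
            "properties": {"path": {"type": "string"},
                           "text": {"type": "string"}},
            "required": ["path", "text"],
        },
    },
}

def list_tools():
    """Shape of a tools/list response: one entry per declared tool."""
    return [{"name": name, **spec} for name, spec in TOOLS.items()]
```

A client would inspect this listing to decide which tool to invoke; the action/perception split the paper measures falls directly out of declarations like these.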

05

Deep Dive: Perception and Reasoning Tools

202 words

Perception and reasoning tools are integral components of the AI agent ecosystem, providing the foundation for data-driven decision-making processes. These tools work in tandem to gather, analyze, and interpret information, enabling action tools to execute tasks effectively.

Perception tools are responsible for data acquisition and understanding. They collect information from various sources, such as sensors, databases, or user inputs, and convert it into a format that can be processed by reasoning tools. For example, in a self-driving car, perception tools might capture images of the surrounding environment and identify traffic signals, pedestrians, and other vehicles.

Reasoning tools, on the other hand, take the data provided by perception tools and analyze it to generate insights or predictions. These tools employ algorithms that mimic human-like thinking, allowing them to make informed decisions based on the data they receive. In the self-driving car scenario, reasoning tools would use the perceived data to determine the best route, adjust speed, or navigate complex intersections.

The interplay between perception and reasoning tools is essential for creating intelligent systems that can adapt to new information and operate autonomously. By understanding the specific roles and capabilities of these tools, developers can design more effective AI solutions that address real-world challenges.
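One naive way to make the perception/reasoning/action taxonomy operational is to bucket a tool by the verbs in its description. This is not the paper's classification method — the keyword lists below are invented for illustration — but it shows the kind of signal such a taxonomy relies on:

```python
# Illustrative keyword-based classifier for the three tool categories.
# The keyword lists are assumptions, not taken from the paper.
CATEGORY_KEYWORDS = {
    "perception": ("read", "fetch", "search", "list", "capture"),
    "reasoning": ("summarize", "analyze", "plan", "rank", "classify"),
    "action": ("write", "create", "delete", "send", "execute", "deploy"),
}

def categorize(description: str) -> str:
    """Return the first category whose keywords appear in the
    description, else 'unknown'. Naive substring matching on purpose."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in text for word in keywords):
            return category
    return "unknown"
```

A real pipeline over 177,436 tools would need something far more robust than substring matching (descriptions are free-form and multilingual), but the output categories are the same ones the paper's 27%-to-65% trend is stated over.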

06

Training & Data: Leveraging O*NET Mapping

208 words

O*NET mapping is a crucial methodology used in this study to categorize tasks and their consequentiality across different domains. It provides a structured framework for evaluating the impact and applicability of AI tools, ensuring that they are deployed in areas where they can deliver the most value.

The O*NET framework categorizes tasks based on their complexity, requirements, and potential impact. By applying this framework, researchers can systematically assess the suitability of AI tools for various industries, from software development to finance. This categorization helps identify areas where AI can enhance efficiency, reduce costs, and improve outcomes.

Imagine a scenario where an organization wants to implement AI solutions to streamline its operations. By leveraging O*NET mapping, the organization can identify specific tasks that are best suited for automation, such as data entry, transaction processing, or customer support. This targeted approach ensures that AI tools are applied where they can make the most significant difference, maximizing their impact and return on investment.

The use of O*NET mapping also facilitates the comparison of AI tool performance across different domains, allowing researchers to identify trends and opportunities for further development. This comprehensive understanding of task consequentiality is essential for driving innovation and ensuring that AI solutions are aligned with industry needs.
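The task-screening idea described above can be sketched with a few invented task entries. The tasks and consequentiality flags below are hypothetical stand-ins, not entries from the actual O*NET database:

```python
# Hypothetical task entries in the spirit of O*NET mapping: each task
# carries a domain and a rough consequentiality flag. All values here
# are invented for illustration.
TASKS = [
    {"task": "Enter transaction records", "domain": "finance",  "consequential": False},
    {"task": "Approve wire transfers",    "domain": "finance",  "consequential": True},
    {"task": "Run unit tests",            "domain": "software", "consequential": False},
    {"task": "Deploy code to production", "domain": "software", "consequential": True},
]

def automation_candidates(tasks, include_consequential=False):
    """Select tasks suited to automation; by default, screen out
    high-consequence tasks that still warrant human oversight."""
    return [t["task"] for t in tasks
            if include_consequential or not t["consequential"]]
```

The `include_consequential` switch mirrors the policy question the paper raises: as action tools spread into consequential tasks, the decision to automate them becomes an oversight decision, not just a technical one.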

07

Key Results: The Rise of Action Tools

216 words

One of the most significant findings of this study is the dramatic rise in the use of action tools, which increased from 27% to 65% over the sampled period. This surge indicates a growing reliance on AI solutions capable of executing tasks autonomously, reflecting a shift towards more dynamic and adaptable systems.

The data collected from MCP servers reveals that action tools are increasingly being used in high-stakes environments, such as software development and finance, where quick and accurate decision-making is crucial. In software development, action tools can automate code generation, testing, and deployment, reducing the time and effort required to bring new features to market. Similarly, in finance, these tools can manage portfolios, execute transactions, and perform real-time risk assessments, enhancing both efficiency and security.

This trend towards autonomous action tools highlights the potential for AI to transform industries by streamlining operations and enabling more agile responses to changing conditions. However, it also underscores the need for robust oversight and regulation to ensure that these tools operate safely and ethically.

The rise of action tools is a testament to the advancements in AI technology and the growing demand for solutions that can operate independently. As organizations continue to embrace these capabilities, they must be mindful of the associated risks and take steps to mitigate them.

08

Ablation Studies: Understanding Tool Dependencies

181 words

Ablation studies in this research focus on understanding the dependencies and interactions between different types of AI tools. By systematically removing components and observing the effects on overall performance, researchers can identify which elements are most critical to the success of an AI system.

For example, a study might explore the impact of removing perception tools from a self-driving car system. Without these tools, the car would lack the ability to gather essential data about its environment, severely hindering its ability to navigate safely. Similarly, removing reasoning tools would prevent the system from making informed decisions, leading to suboptimal outcomes.

These studies reveal the importance of maintaining a balanced integration of perception, reasoning, and action tools, as each plays a vital role in enabling autonomous operation. Understanding these dependencies allows developers to optimize AI systems, ensuring that they function effectively under various conditions.

Ablation studies also provide insights into potential areas for improvement, as they highlight the limitations and weaknesses of current AI architectures. By identifying which components are most influential, researchers can prioritize efforts to enhance their capabilities and resilience.

09

What This Changed: Industry Impact and Future Directions

182 words

The findings of this study have profound implications for industries that rely on digital automation, particularly software development and finance. By demonstrating the effectiveness of autonomous action tools, this research highlights the potential for AI to revolutionize these sectors by enhancing efficiency, reducing costs, and improving outcomes.

In software development, the integration of AI tools such as GitHub Copilot and GPT-based functionalities has already begun to transform the way developers work. These tools automate routine tasks, provide intelligent code suggestions, and facilitate collaboration, enabling teams to deliver high-quality software more quickly and efficiently.

Similarly, in the fintech industry, AI tools are being leveraged to streamline financial transactions, enhance fraud detection, and optimize investment strategies. The ability to perform real-time risk assessments and make informed decisions autonomously is particularly valuable in this fast-paced environment, where timely actions can have significant financial implications.

This study also signals new opportunities for companies to expand their action-oriented capabilities, fostering innovation and driving growth. However, it also emphasizes the need for responsible AI development, ensuring that these advancements are implemented ethically and in compliance with regulatory standards.

10

Limitations & Open Questions: Navigating the Challenges

188 words

Despite the promising advancements in AI agent tools, several limitations and open questions remain. One of the primary challenges is ensuring that these tools operate within legal and ethical boundaries, particularly in high-stakes environments like finance, where mistakes can have severe consequences.

The increasing autonomy of AI tools raises concerns about accountability and transparency. As these systems become more capable of making decisions independently, it is crucial to establish mechanisms for monitoring their actions and ensuring they align with human values and societal norms.

Another challenge is the potential for bias in AI systems, which can arise from training data that reflects existing societal inequalities. Addressing these biases is essential to prevent AI tools from perpetuating or exacerbating discrimination and unfair practices.

Open questions also remain regarding the scalability and adaptability of AI tools across different domains. While the study demonstrates significant progress in software development and finance, further research is needed to explore the applicability of these tools in other industries and contexts.

By acknowledging these limitations and questions, the research encourages ongoing dialogue and collaboration among stakeholders to navigate the complexities of AI development and deployment.

11

Why You Should Care: Implications for AI Product Development

162 words

For product managers and developers working in AI, the insights from this study are invaluable. The rise of autonomous action tools represents a significant opportunity to innovate and enhance product offerings, but it also requires careful consideration of the associated risks and responsibilities.

Imagine if your team could leverage AI tools to automate complex tasks, reduce development cycles, and improve product quality. By embracing these capabilities, you can stay competitive in an increasingly automated world and deliver greater value to your customers.

However, this potential comes with the responsibility to ensure that AI solutions are developed and deployed ethically. This means prioritizing transparency, accountability, and fairness in AI systems, and working closely with regulators to address legal and societal concerns.

Ultimately, the findings of this research underscore the transformative power of AI and its potential to drive innovation across industries. By understanding and addressing the challenges, product teams can harness this potential to create smarter, more efficient, and more responsible AI solutions.

Experience It

Live Experiment

Agentic Tool Use

See Tool Use in Action

Toolformer teaches a language model to pause mid-sentence, invoke external APIs like a calculator or search engine, inject the real result back, and continue — producing correct, verifiable answers.

The baseline guesses using statistical patterns — it sounds confident but may be wrong. Toolformer routes the question to a calculator and uses the verified output. Smaller model, better answer.
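The loop the demo illustrates can be sketched in a few lines: scan the model's draft for an API-call marker, execute the call, and splice the verified result back into the text. The `[Calculator(...)]` marker mirrors the syntax used in the Toolformer paper; everything else here (the hand-written draft, replacing the marker in place rather than appending the result) is a simplification for illustration:

```python
import re

# Match a Toolformer-style calculator call, admitting only arithmetic
# characters so the expression is safe to evaluate.
CALL = re.compile(r"\[Calculator\(([0-9+\-*/. ()]+)\)\]")

def run_calculator(text: str) -> str:
    """Replace each Calculator call in the text with its computed result."""
    def evaluate(match: re.Match) -> str:
        expr = match.group(1)
        # eval is acceptable here because the regex restricts the input
        # to digits, arithmetic operators, dots, spaces, and parentheses.
        return str(eval(expr))
    return CALL.sub(evaluate, text)

draft = "That works out to [Calculator(177436 / 16)] tools per month."
print(run_calculator(draft))  # the marker is replaced by 11089.75
```

The point of the mechanism is exactly what the demo text says: the number in the final sentence comes from a verified computation, not from the model's statistical guess.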


How grounded is this content?

Metrics are computed from available source text only — abstract, summary, and impact fields ingested into this system. Full paper PDF is not ingested; numerical claims that originate from within the paper body will not appear in these scores.

Source Richness: 88%

7 of 8 content fields populated. More fields = better-grounded generation.

Source Depth: ~280 words

Total source text analyzed by the model. Includes extended deep-dive summary — high confidence.

Number Grounding: 5 / 5

Key statistics whose numeric values appear verbatim in ingested source text. Unverified stats may originate from the full paper body.

Quote Traceability: 3 / 3

Key passages whose significant vocabulary (≥4-char words) overlap ≥35% with source text. Measures lexical traceability, not semantic accuracy.

Methodology: Number grounding uses regex digit extraction against source text. Quote traceability uses token set intersection on content words stripped of stop-words. Neither metric validates semantic correctness or factual accuracy against the original paper. For full verification, cross-reference with the original paper via the arXiv link above.