For years, artificial intelligence has been a strategic priority for major tech companies, and Microsoft is no exception. The company has invested heavily in research, infrastructure, and specialised teams to advance AI far beyond the era of simple text-based models.
Today’s systems are expected to understand context, interact with interfaces, automate tasks, and even execute multi-step workflows.
OpenAI has captured global attention with powerful models such as GPT-4o, pushing the boundaries of what AI can do. But Microsoft is determined not to fall behind. As reported by Windows Central, the company has unveiled a new AI agent designed to surpass GPT-4o in certain areas—especially when it comes to running directly on local hardware rather than relying on the cloud.
Microsoft Introduces “Fara-7B”: A Local AI Agent for Complex PC Tasks
OpenAI’s GPT models have become the benchmark in the AI world, setting a high bar for performance and versatility. Microsoft, however, is steadily carving its own path. In addition to its close collaboration with OpenAI, the company has developed internal AI systems and recently formed a dedicated research group called MAI Superintelligence, whose mission is to push the next frontier of advanced AI.
The latest result of that work is Fara-7B—a new AI agent built specifically for executing complex computer tasks on behalf of the user, and most importantly, doing so entirely on-device.
According to early information shared by VentureBeat, Fara-7B is a vision- and action-driven model with over 7 billion parameters, designed to run efficiently on smaller systems. By operating locally, it offers:
- Stronger privacy, since no data has to be sent to the cloud
- Lower latency, enabling faster response times
- Safer automation, especially for workflows involving sensitive or confidential information
This positions Fara-7B as an agent that can control interfaces, perform actions, read screens, and navigate applications without sending user data off the device.
A Local AI Agent With Advanced Interaction Capabilities
Though still in the experimental stage, Fara-7B demonstrates abilities typically seen only in large cloud-based models:
1. Runs directly on a user’s PC
The model is optimised for consumer-level hardware, making it a practical solution for businesses and individuals who need local AI power for sensitive tasks.
2. Interacts with user interfaces like a human
Fara-7B can understand webpages and applications through:
- Screenshots
- Pixel-level visual data
- Predicted coordinate mapping
This means it can determine where to click, scroll, or type, mimicking human interaction without needing access to a page’s underlying code.
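To make the screenshot-to-action idea concrete, the loop below is a minimal sketch of how such an agent might be driven. It is not Microsoft's implementation: the `Action` type, `run_agent`, and `scripted_policy` are hypothetical names invented for illustration, and the stand-in policy replays a fixed plan instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str                  # "click", "scroll", "type", or "done"
    x: Optional[int] = None    # predicted pixel coordinates for a click
    y: Optional[int] = None
    text: Optional[str] = None


# Hypothetical policy signature: (task, screenshot bytes, history) -> next Action
Policy = Callable[[str, bytes, List[Action]], Action]


def run_agent(task: str, policy: Policy,
              take_screenshot: Callable[[], bytes],
              execute: Callable[[Action], None],
              max_steps: int = 10) -> List[Action]:
    """Observe the screen, ask the policy for one action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()          # raw pixels only, no HTML or DOM
        action = policy(task, screenshot, history)
        history.append(action)
        if action.kind == "done":
            break
        execute(action)
    return history


def scripted_policy(task: str, screenshot: bytes, history: List[Action]) -> Action:
    """Stand-in for the model: replays a fixed plan and ignores the pixels."""
    plan = [Action("click", x=640, y=360), Action("type", text=task), Action("done")]
    return plan[min(len(history), len(plan) - 1)]


executed: List[Action] = []
trace = run_agent("search for flights", scripted_policy,
                  take_screenshot=lambda: b"", execute=executed.append)
```

In a real setup, `take_screenshot` and `execute` would be backed by a screen-capture and input-automation layer, and the policy would be the vision model itself. The point of the sketch is only that the loop never touches the page's code.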
3. Works even with complex or non-standard webpages
Unlike many web automation agents that rely on structured HTML or accessibility APIs, Fara-7B extracts meaning from raw visual input. This allows it to work with:
- Heavily scripted sites
- Poorly structured code
- Proprietary interfaces
Its independence from underlying code makes it far more adaptable than traditional automation tools.

Outperforming Cloud Models Like GPT-4o
One of the most striking results is that Fara-7B has outperformed GPT-4o in specialised web agent benchmark tests. In a recent evaluation:
- Fara-7B scored 73.5%
- GPT-4o scored 65.1%
This gap of 8.4 percentage points suggests that while GPT-4o dominates many language and reasoning benchmarks, Microsoft’s new model is highly optimised for real-world action tasks, where vision, UI understanding, and step-by-step execution matter more than linguistic depth.
Microsoft also noted that Fara-7B is trained to identify “critical actions” to reduce the risk of agents making dangerous or irreversible changes during automation. This safety layer is essential for scenarios like handling financial data, accessing internal tools, or performing system-level operations.
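One way to picture a “critical action” check is as a gate that pauses the run before anything irreversible. The sketch below is purely illustrative and assumes a simple label lookup. Fara-7B is described as being trained to recognise such moments itself, and the action names, `CRITICAL_KINDS`, and `run_with_guard` here are all hypothetical.

```python
from typing import Callable, Iterable, List

# Hypothetical labels for steps that should not run without the user's consent.
CRITICAL_KINDS = {"submit_payment", "delete_file", "send_email"}


def run_with_guard(actions: Iterable[str],
                   confirm: Callable[[str], bool]) -> List[str]:
    """Perform action names in order, asking for confirmation at critical ones."""
    performed: List[str] = []
    for action in actions:
        if action in CRITICAL_KINDS and not confirm(action):
            break  # stop the run rather than guess at the user's intent
        performed.append(action)
    return performed


plan = ["open_page", "fill_form", "submit_payment", "close_tab"]
declined = run_with_guard(plan, confirm=lambda a: False)  # user says no
approved = run_with_guard(plan, confirm=lambda a: True)   # user says yes
```

When the user declines, the run halts after filling the form and nothing irreversible happens. When the user approves, the whole plan goes through.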
