
Global SOTA on Dual Benchmarks! MiningLamp Technology's Specialized GUI Model Mano Unveils New Era of Intelligent GUI Operation

Business

2025-10-06 18:39 Last Updated At:18:55

BEIJING, Oct. 6, 2025 /PRNewswire/ -- In 2025, "Agent" is undoubtedly a buzzword in the AI community. It is widely believed that truly useful Agents must learn to use mobile phones and computers, interacting with GUIs (Graphical User Interfaces) just like humans do.

Recently, MiningLamp Technology—the leading Chinese enterprise in enterprise-level large models and data intelligence—announced that its specialized GUI Large Model Mano achieved record-breaking SOTA (State of the Art) performance on two industry-recognized benchmark tests: Mind2Web and OSWorld. Through two core innovations—online reinforcement learning and automated training data acquisition—Mano establishes a scalable and self-evolving paradigm for GUI agent development.

Ranking list link: https://os-world.github.io/
Technical report link: https://www.mininglamp.com/news/6394/

Key Breakthroughs:

The technical report reveals that in the Foundation E2E GUI & Specialized Model evaluation on the OSWorld-Verified leaderboard, Mano boosted the success rate to 41.6 ± 0.7%, surpassing models such as Qwen, GUI-Owl, and OpenCUA.

Technical Innovations:

Highlight One: First Proposal of "Online Reinforcement Learning"

Since the emergence of DeepSeek, GRPO has become the gold standard in reinforcement learning. Currently, most model training is still confined to the realm of offline reinforcement learning, relying on pre-collected datasets. However, in the field of GUI-based interactive agents, every operation is closely related to the real system interaction environment.

Therefore, Mano is the first to apply an "online reinforcement learning" training paradigm in the field of GUI interaction, paired with an "explorer" for automated training data acquisition. This enables the agent to continuously learn from up-to-date information, maintaining a dynamic balance between trying new actions to gain insight and executing the best-known action based on existing knowledge.

To continuously enhance adaptability and flexibility in real-world interaction scenarios, MiningLamp Technology established a simulation environment pool encompassing Browser Use Agent (BUA) and Computer Use Agent (CUA) environments, enabling models to gather diverse environmental data through real-world interactions. This approach addresses the sparse distribution of offline trajectories, ultimately demonstrating greater robustness across a wide range of Web GUI scenarios.

Meanwhile, MiningLamp Technology employs an innovative approach: online sampling plus offline filtering. Trajectories are first collected online, then noisy data is filtered out offline. This method dynamically adjusts the distribution of task difficulty, preventing the learning inefficiency caused by failed trajectories.
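The two-phase loop can be sketched as follows. This is a minimal illustration of the idea, not MiningLamp's actual training code: `ToyEnv`, the random policy, and the return threshold are invented stand-ins for a real GUI environment and learned policy.

```python
import random

class ToyEnv:
    """Stand-in GUI environment: reach state 3 within the step budget."""
    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state += action  # action is 0 (wait) or 1 (advance)
        reward = 1.0 if self.state >= 3 else 0.0
        done = self.state >= 3
        return self.state, reward, done

def random_policy(obs):
    return random.choice([0, 1])

def collect_trajectory(env, policy, max_steps=10):
    """Online phase: roll out the current policy in a live environment."""
    obs = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, done = env.step(action)
        trajectory.append((obs, action, reward))
        if done:
            break
    return trajectory

def filter_trajectories(trajectories, min_return=1.0):
    """Offline phase: keep only rollouts whose total reward clears the bar."""
    return [t for t in trajectories if sum(r for _, _, r in t) >= min_return]

random.seed(0)
env = ToyEnv()
batch = [collect_trajectory(env, random_policy) for _ in range(50)]
clean = filter_trajectories(batch)
```

The filtering threshold is where a real system would dynamically adjust task difficulty; here it is a fixed cutoff for clarity.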

Ablation study results show that adding online reinforcement learning significantly improved the model's average score on the OSWorld-Verified dataset, surpassing the offline reinforcement learning model by 7.9 points and reaching 41.6.

Highlight Two: Intelligent Exploration, Capturing Real-World Environmental Trajectories

Although large models can understand broad instructions, they often struggle to break complex, goal-driven, multi-step tasks down into specific execution steps. As a result, developers must build specialized models and agents for interactive tasks, a process that requires massive volumes of high-quality interactive trajectory data. Historically, such data has required manual construction or annotation, which is both costly and time-consuming. To address this challenge, MiningLamp Technology designed an automated method for collecting training data, fundamentally boosting both the efficiency and accuracy of data acquisition. This is Mano's second major innovation.

MiningLamp Technology has built a scalable virtual environment cluster designed to simulate a variety of interactive scenarios. For each target application, the large model automatically generates a target list, prioritizes these targets, filters out functions with extremely low usage frequency, and provides clear contextual guidance for subsequent exploration.
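The target-list step described above can be sketched as a simple rank-and-filter pass. The candidate targets, usage frequencies, and cutoff below are invented for illustration; the release does not specify how the large model estimates usage.

```python
def build_target_list(candidates, min_usage=0.05):
    """Drop rarely used functions, then rank the rest by estimated usage.

    candidates: list of (name, estimated_usage_frequency) pairs.
    Returns target names ordered from most- to least-used.
    """
    frequent = [(name, freq) for name, freq in candidates if freq >= min_usage]
    return [name for name, _ in sorted(frequent, key=lambda x: -x[1])]

targets = build_target_list([
    ("search", 0.40), ("open_settings", 0.10),
    ("export_legacy_format", 0.01), ("compose", 0.30),
])
# → ["search", "compose", "open_settings"]
```

High-priority targets then serve as the contextual guidance for the exploration phase.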

In terms of element extraction, MiningLamp Technology has customized a Chrome plugin called "Mano-C" specifically for web environments, comprehensively extracting interactive elements and capturing their spatial coordinates and semantic attributes. For desktop environments, the technical team employs a combined approach using A11y Tree parsing and OmniParseV2 collaborative filtering, ensuring broader coverage of interactive elements.
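One plausible way to combine two element sources, in the spirit of the A11y Tree + OmniParseV2 pairing described above, is to take the union while suppressing detections that overlap an already-known element. The element format and the IoU-based overlap rule below are assumptions, not the actual pipeline.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def merge_elements(a11y_elems, parser_elems, thresh=0.5):
    """Union of both sources; parser detections that overlap an
    accessibility-tree element are treated as duplicates and skipped."""
    merged = list(a11y_elems)
    for p in parser_elems:
        if all(iou(p["box"], a["box"]) < thresh for a in a11y_elems):
            merged.append(p)
    return merged

a11y = [{"box": (0, 0, 10, 10), "label": "OK"}]
parser = [{"box": (1, 1, 9, 9), "label": "button"},   # duplicate of "OK"
          {"box": (50, 50, 60, 60), "label": "icon"}]  # new element
merged = merge_elements(a11y, parser)
```

Here the parser's "button" box overlaps the accessibility element and is dropped, while the "icon" it alone detected is kept, broadening coverage.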

For data annotation, MiningLamp Technology uses large models to generate semantic labels, functional descriptions, and interaction categories for each extracted element, forming structured, semantically aligned data that provides effective supervision for subsequent training.

To make data collection more intelligent, the technical team designed a prompt-based exploration module that intelligently selects interactive elements and introduces explicit constraints to prevent path loops and redundant branches. Exploration uses a depth-first search (DFS) strategy: at each step the system captures screenshots, saves annotated interaction data, and checks whether the maximum exploration depth has been reached. Once exploration completes, a trajectory evaluation mechanism selects high-quality interaction sequences. The entire process runs as a continuous loop.
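The core of that exploration loop is a DFS with two guards: a visited set to prevent path loops and a depth cap. The sketch below uses a toy dictionary of UI states in place of a real application graph; the state names are invented.

```python
def explore(graph, start, max_depth=3):
    """Depth-first traversal collecting (state, depth) visit records.

    A visited set blocks path loops; max_depth bounds recursion,
    mirroring the maximum-exploration-depth check in the text.
    """
    visited, records = set(), []

    def dfs(state, depth):
        if depth > max_depth or state in visited:  # loop / depth guards
            return
        visited.add(state)
        records.append((state, depth))  # a real system would snapshot here
        for nxt in graph.get(state, []):
            dfs(nxt, depth + 1)

    dfs(start, 0)
    return records

# Toy UI graph: "settings" links back to "home", forming a loop.
ui = {"home": ["search", "settings"], "settings": ["home", "about"]}
trace = explore(ui, "home")
```

Despite the settings → home back-edge, "home" is visited exactly once, which is the loop-prevention property the explicit constraints are meant to enforce.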

Mano's state-of-the-art (SOTA) performance is attributed to MiningLamp Technology's years of accumulation in large models. In 2024, MiningLamp Technology's hypergraph multimodal large language model (HMLLM) and Video-SME dataset made significant breakthroughs in non-standard modality data processing (e.g., EEG, eye-tracking), recognized with an ACM MM 2024 Best Paper Nomination. In 2025, MiningLamp Technology launched DeepMiner, a trustworthy intelligent agent for business data analysis. As DeepMiner's automated execution engine, Mano has enabled the agent to truly learn to "see" and "click," achieving precise operations in complex software and browser environments. Looking ahead, MiningLamp Technology will further optimize Mano's capabilities for application and edge-side deployment, accelerating the pace of enterprise intelligent transformation.

** The press release content is from PR Newswire. Bastille Post is not involved in its creation. **


SAN MATEO, Calif., Dec. 13, 2025 /PRNewswire/ -- AI infrastructure company EverMind has recently released EverMemOS, an open-source Memory Operating System designed to address one of artificial intelligence's most profound challenges: equipping machines with scalable, long-term memory.

The Memory Bottleneck

For years, large language models (LLMs) have been constrained by fixed context windows, a limitation that causes "forgetfulness" in long-term tasks. This results in broken context, factual inconsistencies, and an inability to deliver deep personalization or maintain knowledge coherence. The issue extends beyond technical hurdles; it represents an evolutionary bottleneck for AI. An entity without memory cannot exhibit behavioral consistency or initiative, let alone achieve self-evolution. Personalization, consistency, and proactivity, which are considered the hallmarks of intelligence, all depend on a robust memory system.

There is a consensus that memory is becoming the core competitive edge and defining boundary of future AI. Yet existing solutions, such as Retrieval-Augmented Generation (RAG) and fragmented memory systems, remain limited in scope, failing to support both 1-on-1 companion use cases and complex multi-agent enterprise collaboration. Few meet the standard of precision, speed, usability, and adaptability required for widespread adoption. Equipping large models with a high-performance, pluggable memory module remains a core unmet demand across AI applications.

Discoverative Intelligence

"Discoverative Intelligence" is a concept proposed in late 2025 by entrepreneur and philanthropist Chen Tianqiao. Unlike generative AI, which mimics human output by processing existing data, Discoverative Intelligence describes an advanced AI form that actively asks questions, forms testable hypotheses, and discovers new scientific principles. It prioritizes understanding causality and underlying principles over statistical patterns, a shift Chen argues is essential to achieving Artificial General Intelligence (AGI).

Chen contrasted two dominant AI development paths: the "Scaling Path," which relies on expanding parameters, data, and compute power to extrapolate within a search space, and the "Structural Path," which focuses on the "cognitive anatomy" of intelligence and how systems operate over time.

Discoverative Intelligence falls into the latter category, built on a brain-inspired model called Structured Temporal Intelligence (STI) that requires five core capabilities in a closed loop: neural dynamics (sustained, self-organizing activity to keep systems "alive"), long-term memory (storing and selectively forgetting experiences to build knowledge), causal reasoning (inferring "why" events occur), world modeling (an internal simulation of reality for prediction), and metacognition & intrinsic motivation (curiosity-driven exploration, not just external rewards).

Among these capabilities, long-term memory serves as the vital link between time and intelligence, highlighting its indispensable role in the path toward achieving true AGI.

EverMind's Answer

EverMemOS is EverMind's answer to this need: an open-source Memory Operating System designed as foundational technology for Discoverative Intelligence. Inspired by the hierarchical organization of the human memory system, EverMemOS features a four-layer architecture analogous to key brain regions: an Agentic Layer (task planning, mirroring the prefrontal cortex), a Memory Layer (long-term storage, like cortical networks), an Index Layer (associative retrieval, drawing from the hippocampus), and an API/MCP Interface Layer (external integration, serving as AI's "sensory interface").
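The four-layer stacking can be illustrated with a minimal sketch in which each class wraps the layer below it. The class and method names mirror the layers described above but are invented for illustration; they do not reflect EverMemOS's real API.

```python
class MemoryLayer:
    """Long-term storage (the 'cortical networks' layer)."""
    def __init__(self):
        self.store = {}

    def write(self, key, value):
        self.store[key] = value

    def read(self, key):
        return self.store.get(key)

class IndexLayer:
    """Associative retrieval over the memory layer (the 'hippocampus')."""
    def __init__(self, memory):
        self.memory = memory

    def search(self, term):
        # Toy association: substring match on keys.
        return [k for k in self.memory.store if term in k]

class AgenticLayer:
    """Task planning on top of retrieval (the 'prefrontal cortex')."""
    def __init__(self, index):
        self.index = index

    def recall(self, term):
        return [self.index.memory.read(k) for k in self.index.search(term)]

class APILayer:
    """External integration surface (the AI's 'sensory interface')."""
    def __init__(self, agent):
        self.agent = agent

    def query(self, term):
        return self.agent.recall(term)

mem = MemoryLayer()
mem.write("meeting:2025-10-06", "Discussed benchmark results")
stack = APILayer(AgenticLayer(IndexLayer(mem)))
```

Each layer only talks to the one directly beneath it, which is the point of the hierarchy: retrieval strategy, planning, and external access can each evolve independently of raw storage.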

The system delivers breakthroughs in both scenario coverage and technical performance. It is the first memory system capable of supporting both 1-on-1 conversation use cases and complex multi-agent enterprise collaboration. On technical benchmarks, EverMemOS achieved 92.3% accuracy on LoCoMo (a long-context memory evaluation) and 82% on LongMemEval-S (a suite for assessing long-term memory retention), significantly surpassing prior state-of-the-art results and setting a new industry standard.

The open-source version of EverMemOS is now available on GitHub, with a cloud service version to be launched late this year. The dual-track model, combining open collaboration with managed cloud services, aims to drive industry-wide evolution in long-term memory technology, inviting developers, enterprises, and researchers to contribute to and benefit from the system.

About EverMind

EverMind is redefining the future of AI by solving one of its most fundamental limitations: long-term memory. Its flagship platform, EverMemOS, introduces a breakthrough architecture for scalable and customizable memory systems, enabling AI to operate with extended context, maintain behavioral consistency, and improve through continuous interaction.

To learn more about EverMind and EverMemOS, please visit:

Website: https://evermind.ai/
GitHub: https://github.com/EverMind-AI/EverMemOS
X: https://x.com/EverMindAI
Reddit: https://www.reddit.com/r/EverMindAI/ 


AI Infrastructure Company EverMind Released EverMemOS, Responding to Profound Challenges in AI
