Dnotitia Unveils STAR-KV, Achieving Up to 20x KV Cache Compression, Selected as an ICML 2026 Spotlight Paper

2026-07-02 07:30 (last updated 07:45)

  • Introduces a low-rank-based approach to KV cache compression, one of the key bottlenecks in long-context AI
  • Speeds up attention computation by up to 6.9x and overall generation throughput by up to 3.1x, moving beyond memory savings to faster inference
  • Selected as a Spotlight paper at ICML 2026, representing about 2.2% of reviewed submissions and about 8.4% of accepted papers
  • Following the attention around Google's TurboQuant at ICLR 2026, STAR-KV presents another approach to advancing KV cache compression
  • Paper available on arXiv; source code released on GitHub

SEOUL, South Korea, July 2, 2026 /PRNewswire/ -- Dnotitia Inc. (Dnotitia), a company specializing in long-term memory AI and semiconductor-based AI infrastructure technologies, has released the paper and source code for "STAR-KV: Low-Rank KV Cache Compression via Soft Thresholding for Adaptive Rank Control." The technology was developed through a joint research effort involving UC San Diego's VVIP Lab and Dnotitia researchers, and the paper was selected as a Spotlight paper at ICML 2026 (International Conference on Machine Learning 2026), one of the world's leading conferences in machine learning.

In the experiments reported in the paper, low-rank compression alone reduced the KV cache by up to 75%. Combined with the mixed-precision quantization method proposed in the paper, STAR-KV compressed the full KV cache by up to 20x. The technology also improves computation speed through custom GPU kernels, increasing attention computation speed by up to 6.9x and overall generation throughput by up to 3.1x. STAR-KV also showed higher accuracy than major existing KV cache compression methods.
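The figures above can be related with simple arithmetic. The sketch below converts the 75% reduction into a compression factor and estimates what the quantization stage would have to contribute to reach 20x. Two things here are assumptions, not statements from the release: that the low-rank and quantization factors multiply, and that the uncompressed cache uses 16-bit values. Both headline numbers are "up to" figures, so they may not come from the same configuration.

```python
def reduction_to_factor(reduction: float) -> float:
    """Convert a fractional size reduction (0.75 = 75% smaller) to a compression factor."""
    return 1.0 / (1.0 - reduction)

low_rank_factor = reduction_to_factor(0.75)  # 75% smaller corresponds to 4x
total_factor = 20.0                          # headline figure from the release

# Assumption: the two stages compose multiplicatively.
implied_quant_factor = total_factor / low_rank_factor

# Assumption: the uncompressed cache stores 16-bit values.
BASELINE_BITS = 16
implied_avg_bits = BASELINE_BITS / implied_quant_factor

print(f"low-rank factor: {low_rank_factor:.1f}x")
print(f"implied quantization factor: {implied_quant_factor:.1f}x")
print(f"implied average bits per stored value: {implied_avg_bits:.1f}")
```

Under these assumptions, the quantization stage would need to average roughly 3 bits per stored value, which is consistent with a mixed-precision scheme but is not a number the release reports.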

KV cache compression has become a key technical challenge in AI infrastructure. As research into reducing the memory bottleneck of long-context AI gains momentum, including the attention around Google's TurboQuant at ICLR 2026, STAR-KV presents a new approach that combines low-rank compression with quantization and GPU execution optimization.
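The release does not describe the algorithm itself. As a generic illustration only, and not the paper's method, the sketch below shows how soft thresholding applied to a matrix's singular values can select a rank adaptively: values at or below a threshold `tau` shrink to zero, so the number of survivors depends on the data rather than being fixed in advance. The singular-value spectrum, `tau`, and the 1024 x 128 shape are made-up example inputs.

```python
from typing import List

def soft_threshold(singular_values: List[float], tau: float) -> List[float]:
    """Shrink each singular value toward zero by tau, clipping at zero."""
    return [max(s - tau, 0.0) for s in singular_values]

def effective_rank(singular_values: List[float]) -> int:
    """Count the singular values that survive thresholding."""
    return sum(1 for s in singular_values if s > 0.0)

def low_rank_storage_ratio(rows: int, cols: int, rank: int) -> float:
    """Size of two rank-r factors (rows x r and r x cols) relative to the full matrix."""
    return rank * (rows + cols) / (rows * cols)

# Hypothetical, exponentially decaying spectrum for a 1024 x 128 matrix.
ROWS, COLS = 1024, 128
sigma = [10.0 * 0.5 ** i for i in range(COLS)]

shrunk = soft_threshold(sigma, tau=0.8)
rank = effective_rank(shrunk)
ratio = low_rank_storage_ratio(ROWS, COLS, rank)

print(f"rank after thresholding: {rank}")
print(f"factor storage relative to full matrix: {ratio:.3f}")

# A larger threshold removes more singular values, giving a lower rank.
print(f"rank with tau=3.0: {effective_rank(soft_threshold(sigma, tau=3.0))}")
```

The point of the design is that a single threshold can yield different ranks for different matrices, so layers or heads with faster-decaying spectra are compressed more aggressively.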

The KV cache is temporary memory stored on the GPU so that a large language model (LLM) does not have to recompute context it has already processed. As AI evolves into agentic systems that use multiple documents, conversation history, code, search results, and outputs from external tools, the amount of context a model must process is growing rapidly. In this environment, the KV cache has emerged as a key bottleneck affecting both GPU memory usage and inference cost.

According to the STAR-KV paper, when a LLaMA-3.1-8B model processes a 128K-token context at a batch size of 4, the KV cache accounts for about 81% of total GPU memory. As long-context AI becomes more widely used, KV cache compression is increasingly viewed as a core AI infrastructure technology for processing long context at lower cost.
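The 81% figure can be roughly reproduced with a back-of-the-envelope calculation. The sketch below assumes the publicly documented LLaMA-3.1-8B configuration (32 layers, 8 key/value heads, head dimension 128, about 8.03 billion parameters), 16-bit storage for both weights and cache, and a "total" made up of weights plus KV cache only, ignoring activations and framework overhead. None of these assumptions is stated in the release.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to cache keys and values for every layer, head, and token."""
    # The leading 2 accounts for storing both a key and a value vector.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem

# Assumed LLaMA-3.1-8B configuration (from the public model card, not the release).
LAYERS, KV_HEADS, HEAD_DIM = 32, 8, 128
PARAM_COUNT = 8.03e9

kv = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=128 * 1024, batch_size=4)
weights = PARAM_COUNT * 2  # 16-bit weights, 2 bytes per parameter

# Assumption: memory in use is weights plus KV cache only.
share = kv / (kv + weights)

print(f"KV cache: {kv / 2**30:.0f} GiB")
print(f"weights:  {weights / 2**30:.1f} GiB")
print(f"KV cache share of weights + cache: {share:.0%}")
```

Under these assumptions the cache comes to 64 GiB against roughly 15 GiB of weights, a share of about 81%, which matches the figure quoted from the paper.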

ICML, where the STAR-KV paper was accepted, is widely regarded as one of the top international conferences in AI and machine learning, alongside NeurIPS and ICLR. ICML 2026 will be held from July 6 to 11 at COEX in Seoul. This year, 23,918 papers entered review, 6,352 were accepted, and 536 were selected as Spotlight papers. Spotlight papers account for about 2.2% of all reviewed submissions and about 8.4% of accepted papers.
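The percentages follow directly from the counts given above; the short check below recomputes them and also derives the overall acceptance rate, which the release does not state explicitly.

```python
reviewed = 23_918   # papers that entered review
accepted = 6_352    # accepted papers
spotlight = 536     # Spotlight papers

spotlight_of_reviewed = spotlight / reviewed
spotlight_of_accepted = spotlight / accepted
acceptance_rate = accepted / reviewed  # derived here, not stated in the release

print(f"Spotlight share of reviewed: {spotlight_of_reviewed:.1%}")
print(f"Spotlight share of accepted: {spotlight_of_accepted:.1%}")
print(f"Overall acceptance rate:     {acceptance_rate:.1%}")
```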

Going forward, Dnotitia plans to further advance STAR-KV for use in real-world AI service environments and explore its application to open-source LLM inference frameworks such as vLLM.

"Technologies that help AI process longer context faster and at lower cost are advancing rapidly," said MK Chung, CEO of Dnotitia. "STAR-KV addresses the core bottlenecks in KV cache capacity and attention processing speed, and Dnotitia aims to contribute to the AI inference ecosystem through open sourcing."

** This press release is distributed by PR Newswire through an automated distribution system, for which the client assumes full responsibility. **

SUZHOU, China, July 2, 2026 /PRNewswire/ -- VEICHI (www.veichi.com), a global provider of industrial automation and renewable energy solutions, recently unveiled its commercial and industrial (C&I) solar systems to help businesses mitigate power fluctuation risks and optimize energy costs as the global energy transition accelerates.

Leveraging its expertise in power electronics, electric drives, and industrial control, VEICHI has developed a comprehensive C&I energy storage portfolio covering hybrid inverters, off-grid inverters, microgrid inverters, and battery energy storage systems (BESS).

Building a Complete Energy Ecosystem

"VEICHI's latest C&I energy storage solutions represent an important milestone in our renewable energy strategy," said Shylock Fan, Director of VEICHI Renewable Energy. "Building on our full-scenario experience in residential energy storage inverters, battery systems, and related applications, VEICHI has developed a complete energy ecosystem spanning smart homes, commercial facilities, and industrial applications. By integrating industrial automation capabilities with renewable energy technologies, VEICHI is committed to delivering reliable, end-to-end green energy solutions for partners worldwide."

Introducing the C&I Microgrid Energy Storage Solution

At the core of VEICHI's next-generation C&I hybrid inverter and microgrid solution is the VPS Hybrid Inverter, engineered to deliver higher power density, greater system flexibility, and enhanced reliability for demanding commercial and industrial environments. The product supports both grid-connected and off-grid operation modes, as well as multi-unit parallel expansion.

Industrial-Grade Stability and Impact-Resistant Design: Engineered for complex industrial loads, the VPS Series provides strong impact resistance for inductive-load applications. It supports bypass-based maintenance without system shutdown and incorporates robust hardware protection to ensure stable performance in harsh operating environments.

Seamless Switching and Microgrid Support: Equipped with highly integrated Static Transfer Switch (STS) technology, the system enables seamless millisecond-level switching during grid outages. With a built-in isolation transformer, it provides stable voltage support during independent microgrid operation and three-phase imbalance conditions, helping users address grid fluctuations and maintain operational continuity.

Efficient Energy Management and Intelligent Dispatch: An advanced DC-coupled design increases PV utilization by up to 2%. The system integrates anti-backflow control, grid-forming capability, Virtual Synchronous Generator (VSG) and black-start capabilities, together with a built-in Energy Management System (EMS) for intelligent energy dispatch. This helps maximize solar self-consumption, reduce diesel generator starts, and improve overall energy utilization efficiency.

From Core Technologies to Global Applications

VEICHI's new energy solutions have been deployed across key markets including Southeast Asia, South Asia, the Middle East, Central Asia, and Africa. Its products have been showcased at international power and energy exhibitions in countries including the Philippines, Thailand, Vietnam, Myanmar, Pakistan, Iraq, Lebanon, Saudi Arabia, Türkiye, Uzbekistan, Egypt, and Nigeria, with expansion into surrounding markets and regions.

Drawing on decades of expertise in electric drives and industrial automation technologies, VEICHI applies stringent manufacturing and quality control standards throughout the product lifecycle. Looking ahead, the company will continue to expand localized technical support and service capabilities in overseas markets, providing faster response and professional end-to-end solutions to help customers accelerate clean energy adoption, improve operational continuity, and achieve long-term sustainability goals.

About VEICHI

VEICHI Electric (stock code: 688698) has long been committed to industrial automation and renewable energy technologies. With industrial automation as its foundation and green energy as a strategic growth focus, the company provides comprehensive solutions covering solar water pump systems, off-grid inverters, hybrid inverters, residential solar systems, industrial-grade BESS, and hydrogen production. Through continuous innovation, VEICHI (www.veichi.com) is dedicated to advancing the efficient generation, storage, and management of renewable energy worldwide.

For more details, please visit the official website at www.veichienergy.com

Follow on social media:
Facebook: www.facebook.com/VEICHIESS
LinkedIn: www.linkedin.com/company/veichi-ess
Instagram: www.instagram.com/veichielectric
YouTube: www.youtube.com/@VeichiElectric

** This press release is distributed by PR Newswire through an automated distribution system, for which the client assumes full responsibility. **

VEICHI Launches C&I Energy Storage and Microgrid Solutions