AI Digest 2026-07-01
Generated: 2026/7/1 10:11:31 (UTC: 2026-07-01T02:11:31.610Z)
Agentic Abstention: Do Agents Know When to Stop Instead of Act?
👍 121 · arXiv
LLM agents are expected to act over multiple turns, using search, browsing interfaces, and terminal tools to complete user goals. Yet not every goal is well specified or achievable in the available en…
LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing
👍 72 · arXiv
Streaming video editing has made rapid progress, yet practical deployment is still limited by two core issues: maintaining stable backgrounds and non-edited regions over time, and achieving the low la…
Scaling the Horizon, Not the Parameters: Reaching Trillion-Parameter Performance with a 35B Agent
👍 67 · arXiv
We introduce Agents-A1, a 35B Mixture-of-Experts Agentic Model that reaches trillion-parameter-level performance by scaling the agent horizon. We investigate agent-horizon scaling from two perspective…
TUA-Bench: A Benchmark for General-Purpose Terminal-Use Agents
👍 44 · arXiv
As large language models and harness frameworks continue to advance, agents operating in terminals are increasingly capable of performing a broader range of general computer-use tasks beyond coding. H…
ReFreeKV: Towards Threshold-Free KV Cache Compression
👍 43 · arXiv
To reduce memory consumption during LLM inference, a handful of methods have been proposed for KV cache pruning. While these techniques can accomplish lossless memory reduction on many datasets, they …
OpenClaw v2026.6.11
We heard the feedback. v2026.6.11 focuses on the rough edges that make OpenClaw feel less dependable, with fixes for misplaced replies, stuck sends, reconnects, model setup failures, and safer admin d…
Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.11
LangChain langchain-openrouter==0.2.5
Changes since langchain-openrouter==0.2.4
- release(openrouter): 0.2.5 (#38553)
- fix(openrouter): deduplicate repeated finish metadata (#38552)
- fix(openrouter): strip Responses reasoning IDs (#38383)…
Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-openrouter%3D%3D0.2.5
vLLM v0.24.0
vLLM v0.24.0 Release Notes
Highlights
This release features 571 commits from 256 contributors (77 new)!
- MiniMax-M3: Added support for the new MiniMax-M3 model (#45381), with a …
Link: https://github.com/vllm-project/vllm/releases/tag/v0.24.0
Ollama v0.31.1
Faster Gemma 4 on Apple Silicon
Link: https://github.com/ollama/ollama/releases/tag/v0.31.1
CrewAI 1.15.2a1
What’s Changed
Features
- Repoint template commands to crewAIInc-fde org
- Support inline skill definitions
- Define stream frame protocol for flows
- Add type tool and app in CrewDefinition -…
Link: https://github.com/crewAIInc/crewAI/releases/tag/1.15.2a1
OpenAI Codex CLI rust-v0.142.5
Bug Fixes
- Prevented full Responses WebSocket request payloads from being written to trace logs. (#30771)
Changelog
Full Changelog: https://github.com/openai/codex/compare/rust-v0.142.4…ru…
Link: https://github.com/openai/codex/releases/tag/rust-v0.142.5
Wayve launches $85M employee tender offer at $8.5B valuation
Wayve’s offering is part of a growing trend of AI startups using employee tenders as a strategic tool to attract and retain talent.
OpenClaw is finally available on Android and iOS
The free open source agentic program is finally invading your phone.
The DeepMind trio who built a poker AI are now making money for quant hedge funds
EquiLibre Technologies, a Prague-based AI lab founded by three ex-DeepMind researchers, is now valued at more than $500 million.
Google introduces a faster, cheaper image generator with Nano Banana 2 Lite
Google is updating its image generator to make it faster and cheaper, making it a more useful tool for creators looking to make AI content.
Nvidia competitor Etched hits $5B valuation, $1B in sales for AI chip
Nvidia AI chip competitor Etched says it has already booked $1 billion under contract for the inference systems powered by its chip.
Anthropic launches Claude Sonnet 5 as a cheaper way to run agents
Anthropic’s Claude Sonnet 5 brings stronger agentic capabilities, lower pricing, and improved safety, positioning the model as a cheaper alternative to Opus, GPT-5.5, and Gemini Pro.
Acti puts AI agents directly into your smartphone keyboard
Acti is betting the smartphone keyboard is the next home for AI assistants. The startup’s new keyboard for iOS and Android works across apps and lets users create custom AI-powered shortcuts using natural language.
Anthropic’s Claude Science bets on workflow, not a new model, to win over scientists
Anthropic’s Claude Science is a workbench that gives scientists one environment to do computational research, saving them from the need to bounce between databases, pipelines, and tools.