Weekly Report 2026-05-25 ~ 2026-05-31
Generated: 2026/5/31 13:20:58 (UTC: 2026-05-31T05:20:58.596Z)
The weekly auto-summary was not enabled or the call failed; the original daily content is merged below.
2026-05-25
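Generated: 2026/5/25 10:09:43 (UTC: 2026-05-25T02:09:43.504Z)
Data sources: Trendshift · GitHub Trending
⭐ Trending up · Python
A native coding Agent framework built on DeepSeek, designed for high cache hit rates and low cost. From an engineering standpoint, the project uses DeepSeek’s context caching to cut API call costs substantially while preserving code generation quality, which suits development assistants that interact at high frequency.
⭐ Trending up · TypeScript
An open-source personal assistant Agent framework with multi-channel access. The latest release improves gateway performance and adds emoji-based quick permission approval, making it a useful architectural reference for enterprise Agent gateways that are highly concurrent and interaction-heavy.
⭐ Trending up · Python
An extensible AI assistant plugin built for the Datasette data exploration tool. From an engineering standpoint, it shows how a clean plugin mechanism can embed an LLM’s natural-language-to-SQL capability into existing data infrastructure; it suits lightweight data analysis and BI scenarios.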
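DelTA: Discriminative Token Credit Assignment for Reinforcement Learning with Verifiable Rewards
👍 192 · arXiv
This paper proposes a new way to understand reinforcement learning with verifiable rewards (RLVR) updates from a discriminator’s perspective, showing how response-level rewards translate into token-level probability changes. The result is relevant to improving LLM reasoning; in engineering terms, its credit assignment mechanism could be borrowed to improve the stability and efficiency of RLHF/RLAIF training.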
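π-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows
👍 91 · arXiv
The benchmark targets the lack of proactivity that current Agents show when handling ambiguous requirements and long-horizon tasks, and provides a new evaluation framework. For Agent developers, it points toward architectures that move from passive response to proactive clarification and constraint reasoning, which should help raise task completion rates in real business scenarios.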
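Full Attention Strikes Back: Converting Full Attention to Sparse Attention Within a Hundred Training Steps
👍 85 · arXiv
The study shows that full-attention large models are already inherently sparse and can be converted to a sparse attention pattern losslessly with very few training steps. This offers a cost-effective answer to the long-context inference bottleneck and suits infrastructure teams that need to reduce GPU memory footprint and inference latency.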
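LatentOmni: Reshaping Omni-Modal Understanding via Unified Audio-Visual Latent Reasoning
👍 40 · arXiv
Multimodal large models tend to lose fine-grained spatiotemporal information in explicit textual chain-of-thought (CoT); to address this, the paper proposes joint reasoning in a continuous audio-visual latent space. This offers a new architectural idea for native multimodal Agents and may improve the precision with which complex audio and video streams are handled during tool calls.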
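- OpenClaw v2026.5.24-beta.2: This update mainly improves the iMessage channel experience, allowing permission approval (allow-once/deny) directly through thumbs-up/thumbs-down (👍/👎) reactions. On the gateway side, it reuses channel directory reads and avoids repeated boundary checks, which significantly reduces CPU load. Release link
- Cursor Composer 2.5: Composer 2.5 is officially available, further improving the coherence and accuracy of multi-file editing and code generation. Release link
- Cursor Automations update: Automations are now integrated into the Agents window and can be configured with multiple repositories or with no repository, making Agents considerably more flexible for cross-project tasks. Release link
- Cursor Cloud Agents development environments: Cloud Agents now get full development environment support, including cloning repositories, installing dependencies, internal toolchain credentials, and build system access, so they can complete engineering tasks end to end as in a local environment. Release link
- Cursor PR Review & parallel Agents: Introduces a new PR review experience, supports parallel Agents for faster execution, and adds quick-action buttons for common workflows. Release link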
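DeepSeek permanently cuts flagship model prices by 75%. DeepSeek announced a permanent price cut of up to 75% on its flagship AI model. The aggressive pricing will sharply lower inference costs for AI Agents and complex RAG systems, may trigger a new round of API price wars, and directly affects developers’ multi-model routing choices. Original link
Constraint decay: the fragility of LLM Agents in backend code generation. Recent research describes a “constraint decay” phenomenon in LLM Agents handling complex backend code generation: as context grows, the model gradually forgets the architectural constraints set at the start. This is a reminder that long-horizon Agent workflows need explicit constraint validation and state management. Original link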
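One way to read “explicit constraint validation and state management” is to keep the initial constraints outside the model context and re-check every generated step against them. The sketch below is only an illustration under that assumption; `Constraint`, `ConstraintGuard`, and the two string-matching checks are hypothetical names invented here, not anything taken from the research being summarized.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Constraint:
    """A named predicate over generated code (names and checks are illustrative)."""
    name: str
    check: Callable[[str], bool]


@dataclass
class ConstraintGuard:
    """Stores constraints outside the prompt and re-validates every step."""
    constraints: list[Constraint]
    violations: list[tuple[int, str]] = field(default_factory=list)
    step: int = 0

    def validate(self, generated_code: str) -> list[str]:
        """Return the names of violated constraints and record them with the step number."""
        self.step += 1
        failed = [c.name for c in self.constraints if not c.check(generated_code)]
        self.violations.extend((self.step, name) for name in failed)
        return failed


# Hypothetical architectural rules: no raw SQL calls, always go through a repository class.
guard = ConstraintGuard([
    Constraint("no_raw_sql", lambda code: "execute(" not in code),
    Constraint("uses_repository_layer", lambda code: "Repository" in code),
])

print(guard.validate("user = UserRepository(session).get(user_id)"))  # []
print(guard.validate("cursor.execute('SELECT * FROM users')"))
# ['no_raw_sql', 'uses_repository_layer']
```

Because the guard owns the constraint list and the violation log, the check does not depend on the model still “remembering” the rules late in a long session; a real system would likely replace the string checks with linters or AST-based rules.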
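Claude is not your architect: facing AI’s capability boundaries. A widely discussed long-form blog post argues that developers should not fully outsource system architecture design to large models such as Claude. It stresses AI’s limits in global systems thinking and deep business-logic understanding, and recommends treating AI as an “advanced implementation tool” rather than a “decision-maker”. Original link
Unlocking async in continuous batching (Hugging Face). The Hugging Face blog examines the technical details of making continuous batching asynchronous in LLM inference engines. This low-level optimization can raise GPU utilization and throughput and is a valuable reference for teams running their own inference infrastructure. Original link
vLLM V0 to V1: correctness first in reinforcement learning inference. The vLLM team shares lessons from re-architecting for reinforcement learning (RL) inference scenarios during the move from V0 to V1. The post focuses on keeping state consistent and computation correct in distributed inference, offering practices for building high-performance training/inference clusters that support RLHF. Original link
Microsoft report: AI costs have exceeded paying human employees. Internal Microsoft data indicates that on some complex tasks, heavy token consumption and multi-Agent coordination overhead push AI running costs above the cost of hiring people directly. This underlines the engineering urgency, in enterprise applications, of improving prompt efficiency, routing to small models, and capping Agent loop iterations. Original link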
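The two cost controls named above, small-model routing and a cap on Agent loop iterations, can be sketched in a few lines. Everything here is an assumption made for illustration: the `Budget` fields, the length-based `pick_model` rule, the model names, and the `never_done` stub standing in for a real model call are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Budget:
    """Hard caps on one agent run; the field names are illustrative."""
    max_iterations: int
    max_tokens: int
    iterations: int = 0
    tokens_used: int = 0

    def charge(self, tokens: int) -> None:
        self.iterations += 1
        self.tokens_used += tokens

    def exhausted(self) -> bool:
        return (self.iterations >= self.max_iterations
                or self.tokens_used >= self.max_tokens)


def pick_model(prompt: str, small_limit_chars: int = 200) -> str:
    """Hypothetical routing rule: short prompts go to the cheaper model."""
    return "small-model" if len(prompt) <= small_limit_chars else "large-model"


def run_agent(prompt: str,
              step_fn: Callable[[str, str], tuple[bool, int]],
              budget: Budget) -> tuple[str, list[str]]:
    """Call step_fn(model, prompt) until it reports done or the budget is spent."""
    models_used: list[str] = []
    while not budget.exhausted():
        model = pick_model(prompt)
        done, tokens = step_fn(model, prompt)
        budget.charge(tokens)
        models_used.append(model)
        if done:
            return "done", models_used
    return "budget_exhausted", models_used


def never_done(model: str, prompt: str) -> tuple[bool, int]:
    """Stub step that always asks for another iteration and costs 100 tokens."""
    return False, 100


status, models_used = run_agent("rename this variable", never_done,
                                Budget(max_iterations=3, max_tokens=1000))
print(status, models_used)
# budget_exhausted ['small-model', 'small-model', 'small-model']
```

The loop stops on whichever cap is hit first, so a runaway Agent has a bounded cost even when the step function never reports completion.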
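Datasette Agent released: an extensible AI assistant for data exploration. Well-known open-source developer Simon Willison released the first version of Datasette Agent. The tool uses an LLM to provide natural-language querying and data analysis over SQLite databases, showing how a plugin mechanism can integrate Agent capabilities into existing data infrastructure. Original link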
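To make the natural-language-to-SQL idea concrete, here is a minimal sketch that uses only Python’s standard `sqlite3` module. It does not use or imitate the Datasette or Datasette Agent APIs; `fake_llm_to_sql` is a hypothetical stand-in for a model call, and the `repos` table is invented sample data.

```python
import sqlite3


def fake_llm_to_sql(question: str) -> str:
    """Stand-in for a model call; a real system would prompt an LLM with the schema."""
    canned = {
        "how many repos per language?":
            "SELECT language, COUNT(*) AS n FROM repos "
            "GROUP BY language ORDER BY language",
    }
    return canned[question.lower()]


def run_read_only(conn: sqlite3.Connection, sql: str) -> list:
    """Run generated SQL only if it is a single SELECT statement.

    The check is deliberately conservative: any ';' inside the statement is rejected.
    """
    statement = sql.strip().rstrip(";")
    if ";" in statement or not statement.lower().startswith("select"):
        raise ValueError("only single SELECT statements are allowed")
    return conn.execute(statement).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repos (name TEXT, language TEXT)")
conn.executemany("INSERT INTO repos VALUES (?, ?)",
                 [("a", "Python"), ("b", "Python"), ("c", "TypeScript")])
conn.commit()
# Second line of defence: SQLite rejects writes on this connection from here on.
conn.execute("PRAGMA query_only = ON")

rows = run_read_only(conn, fake_llm_to_sql("How many repos per language?"))
print(rows)  # [('Python', 2), ('TypeScript', 1)]
```

Treating the generated SQL as untrusted input, with a statement check plus a read-only connection, is a design choice of this sketch rather than a description of how Datasette Agent works.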
2026-05-26
Generated: 2026/5/26 10:00:32 (UTC: 2026-05-26T02:00:32.408Z)
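SkillOpt: Execution Strategies for Self-Evolving Agent Skills
👍 159 · arXiv
Existing Agent skills are mostly hand-crafted or generated in a single pass and lack a reliable feedback-driven improvement mechanism comparable to a deep learning optimizer. The paper proposes training skills as external state of a frozen Agent and introduces a rigorous optimization strategy. This is a useful engineering idea for building complex Agent architectures capable of self-iteration and evolution.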
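SciAtlas: A Large-Scale Knowledge Graph for Automated Scientific Research
👍 47 · arXiv
Current academic retrieval tools rely on shallow keyword or vector retrieval and lack topological reasoning; to address this, the paper builds a large-scale knowledge graph. The work gives RAG-based research Agents a paradigm for structured knowledge organization and should help improve retrieval and reasoning accuracy over complex cross-disciplinary information.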
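StepAudio 2.5 Technical Report
👍 37 · arXiv
Unified audio-language models aim to bring LLM reasoning to speech tasks, but existing models often fall short of specialized systems in ASR, TTS, and real-time interaction. The report details the architecture of StepAudio 2.5 and serves as an engineering reference for multimodal Agents with high-quality real-time voice interaction.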
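Lens: Rethinking Training Efficiency for Foundation Text-to-Image Models
👍 90 · arXiv
The paper introduces Lens, a 3.8B-parameter text-to-image model whose performance matches or exceeds 6B-parameter SOTA models while needing only about 19.3% of the training compute. It offers practical lessons for cutting cost and designing efficient training architectures for multimodal generative models.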
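- OpenClaw v2026.5.24-beta.2: Adds parsing of iMessage reactions (thumbs-up/thumbs-down), mapped to allow-once and deny respectively. Also improves gateway performance by reusing process-stable channel directory reads to avoid repeated boundary checks. Release link
- Cursor Composer 2.5: Composer 2.5 is officially available, further improving context understanding and code generation in AI-assisted coding. Release link
- Cursor Cloud Agents Dev Environments: Cloud Agents now get full development environment support, including cloning repositories, installing dependencies, internal toolchain credentials, and build system access, so they can complete engineering tasks end to end. Release link
- Cursor Automations Improvements: The Agents window now includes Cursor Automations, which supports automations configured with multiple repositories or with no repository. Release link
- Cursor Parallel Agents & PR Review: Introduces a new PR review experience, supports parallel Agents to execute build plans faster, and adds quick actions for common workflows. Release link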
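DeepSeek permanently cuts flagship AI model prices by 75%. DeepSeek announced a permanent 75% price cut on its flagship model. The aggressive pricing will sharply lower developers’ API costs and directly affects ROI assessments for enterprise AI applications and multi-model routing choices. Original link
Memory now accounts for nearly two-thirds of AI chip component cost. Epoch AI data shows memory’s share of AI chip component cost has climbed to nearly 66%. This points to a core hardware bottleneck for large-model inference and training and matters for the architecture and cost control of future compute clusters. Original link
Unlocking async in continuous batching. Hugging Face examines the technical details of introducing asynchrony into continuous batching for LLM inference. The approach can raise GPU utilization and throughput and is a key reference for optimizing high-concurrency inference infrastructure. Original link
The Open Agent Leaderboard released. Hugging Face and IBM Research launched the Open Agent Leaderboard, which aims to standardize evaluation of open-source Agents. It gives developers quantitative benchmarks for selecting and comparing Agent frameworks and underlying models. Original link
Clarifying core AI Agent terms: Harness and Scaffold. A Hugging Face blog post sorts out key engineering terms in the AI Agent field, focusing on the conceptual boundary between a harness (test tooling) and a scaffold. This helps developers share a common architectural vocabulary when building complex Agent systems. Original link
Claude finds an Apple macOS 26.5 kernel vulnerability. Security researchers used Claude to discover a high-severity macOS kernel vulnerability (CVE-2026-28952). This marks a new level for the engineering use of large models in complex system-level code auditing and automated vulnerability discovery. Original link
Opinion: Claude is not your architect, stop letting it overstep. A much-discussed engineering blog post argues that although LLMs perform well as coding assistants, developers should not treat them as system architects. It stresses keeping human engineers in charge of system design, boundary definition, and technology selection in the AI era. Original link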
2026-05-27
Generated: 2026/5/27 10:09:05 (UTC: 2026-05-27T02:09:05.024Z)
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation
👍 88 · arXiv
Interactive world models are advancing rapidly, yet existing benchmarks cover only part of the required competencies, leaving no unified standard for systematic evaluation. To fill this gap, we introd…
Foundation Protocol: A Coordination Layer for Agentic Society
👍 59 · arXiv
Autonomous agents are moving from tools into a layer of social infrastructure: they browse, purchase, deploy software, manage systems, and increasingly interact with one another. As these systems scal…
TriSplat: Simulation-Ready Feed-Forward 3D Scene Reconstruction
👍 34 · arXiv
Sparse-view 3D reconstruction is increasingly addressed with feed-forward splatting networks that predict explicit primitives directly from images. Yet most existing methods remain centered on Gaussia…
Toward Native Multimodal Modeling: A Roadmap
👍 31 · arXiv
Multimodal modeling represents a vital step from modality-agnostic reasoning toward world modeling. While early approaches predominantly rely on late-fusion that assembles encoders and frozen language…
ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
👍 29 · arXiv
Training large multimodal models (LMMs) via reinforcement learning (RL) to natively invoke video-processing tools (e.g., cropping) has become a promising route to long-video understanding. However, ex…
OpenClaw v2026.5.26-beta.1
2026.5.26
Highlights
- Faster replies and startup: visible reply delivery now separates user-facing sends from slower follow-up work, command/model/plugin metadata is reused on hot paths, and…
Link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.26-beta.1
LangChain langchain-perplexity==1.3.0
Changes since langchain-perplexity==1.2.0
release(perplexity): 1.3.0 (#37707)
feat(perplexity): use_responses_api flag on ChatPerplexity (#37359)
chore(infra): bump langchain-tests floor to 1.1…
Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-perplexity%3D%3D1.3.0
OpenAI Codex CLI rust-v0.134.0
New Features
- Added search across local conversation history, including case-insensitive content matches with result previews. (#23519, #23921)
- Made --profile the primary profile selector acro…
Link: https://github.com/openai/codex/releases/tag/rust-v0.134.0
Outsourcing plus local AI will soon become more economical vs. frontier labs
Article URL: https://www.signalbloom.ai/posts/outsourcing-plus-localai-will-soon-become-more-economical-vs-frontier-labs/ Comments URL: https://news.ycombinator.com/item?id=48278610 Points: 250
Comments: 272
The AI bubble isn’t like the internet bubble
Article URL: https://pluralistic.net/2026/05/26/the-ai-will-continue/#until-morale-improves Comments URL: https://news.ycombinator.com/item?id=48277784 Points: 70
Comments: 87
Uber president says AI spending is getting ‘harder to justify’
Article URL: https://www.theverge.com/transportation/937116/uber-ai-investment-hard-to-justify Comments URL: https://news.ycombinator.com/item?id=48277485 Points: 262
Comments: 134
Notes on Pope Leo XIV’s Encyclical on AI
Article URL: https://simonwillison.net/2026/May/25/encyclical-on-ai/ Comments URL: https://news.ycombinator.com/item?id=48275098 Points: 61
Comments: 12
CVE-2026-28952: Apple macOS 26.5 Kernel Vuln found by Claude
Article URL: https://support.apple.com/en-us/127115 Comments URL: https://news.ycombinator.com/item?id=48273169 Points: 166
Comments: 98
Using AI to write better code more slowly
Article URL: https://nolanlawson.com/2026/05/25/using-ai-to-write-better-code-more-slowly/ Comments URL: https://news.ycombinator.com/item?id=48272984 Points: 1142
Comments: 418
Norway’s 2 petabytes of Huawei flash storage and LLM training
Article URL: https://www.blocksandfiles.com/flash/2026/05/22/norways-2-petabytes-of-huawei-flash-storage-and-llm-training/5244910 Comments URL: https://news.ycombinator.com/item?id=48270770 Points: 320
Comments: 202
Pope Leo XIV says AI must serve humanity, not the powerful few
Article URL: https://religionnews.com/2026/05/25/in-his-first-encyclical-pope-leo-xiv-says-ai-must-serve-humanity-not-the-powerful-few/ Comments URL: https://news.ycombinator.com/item?id=48266485 Points: 344
Comments: 67
2026-05-28
Generated: 2026/5/28 09:52:02 (UTC: 2026-05-28T01:52:02.313Z)
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding
👍 93 · arXiv
Vision-language models (VLMs) commonly formulate visual grounding and detection as a coordinate-token generation problem, serializing each 2D box into multiple 1D tokens that are learned and decoded l…
EvalVerse: Pipeline-Aware and Expert-Calibrated Benchmarking for Professional Cinematic Video Generation
👍 72 · arXiv
The rapid evolution of generative video foundation models has propelled the field toward professional-grade cinematic synthesis. To achieve such demanding quality, the community transitions towards Re…
SpatialBench: Is Your Spatial Foundation Model an All-Round Player?
👍 57 · arXiv
While spatial foundation models have demonstrated impressive performance on standard datasets, a critical question remains: are they truly all-round players capable of generalizing robustly across div…
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research
👍 51 · arXiv
We present MobileGym, a browser-hosted, lightweight, fully controllable environment for everyday mobile use, targeting interaction fidelity without replicating proprietary backends. It enables two cap…
Geometry-Aware Representation Denoising for Robust Multi-view 3D Reconstruction
👍 34 · arXiv
Multi-view 3D reconstruction has achieved remarkable progress with the advent of feed-forward 3D reconstruction models. However, these models are typically trained and evaluated under ideal, degradati…
OpenClaw v2026.5.26
Highlights
- Faster Gateway and replies: startup avoids repeated plugin, channel, session, usage-cost, warning, scheduled-service, and filesystem scans; visible replies separate user-facing send…
Link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.26
LangChain langchain-perplexity==1.3.1
Changes since langchain-perplexity==1.3.0
release(perplexity): 1.3.1 (#37720)
chore(perplexity): bump perplexityai to 0.34.1 (#37710)…
Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-perplexity%3D%3D1.3.1
CrewAI 1.14.6a2
What’s Changed
Features
- Enhance StdioTransport to prevent environment variable leakage
- Enhance planning configuration and observation handling
- Declare env_vars on DatabricksQueryToo…
Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.6a2
Goose v1.36.0
✨ Features
Link: https://github.com/aaif-goose/goose/releases/tag/v1.36.0
OpenAI Codex CLI rust-v0.135.0-alpha.2
Release 0.135.0-alpha.2
…
Link: https://github.com/openai/codex/releases/tag/rust-v0.135.0-alpha.2
YouTube to automatically label AI-generated videos
https://variety.com/2026/digital/news/youtube-ai-video-label…
Comments URL: https://news.ycombinator.com/item?id=48299753 Points: 538
Comments: 323
DuckDuckGo search saw 28% more visits after Google said people love AI mode
Article URL: https://www.pcgamer.com/hardware/duckduckgos-ai-free-search-saw-nearly-28-percent-more-visits-in-the-week-following-googles-insistence-that-people-love-ai-mode/ Comments URL: https://news.ycombinator.com/item?id=48296649 Points: 671
Comments: 334
Training our own AI models
Article URL: https://posthog.com/blog/training-ai-models Comments URL: https://news.ycombinator.com/item?id=48296359 Points: 194
Comments: 131
Tech CEOs are apparently suffering from AI psychosis
Article URL: https://techcrunch.com/2026/05/27/tech-ceos-are-apparently-suffering-from-ai-psychosis/ Comments URL: https://news.ycombinator.com/item?id=48295679 Points: 570
Comments: 288
I’m Tired of Talking to AI
Article URL: https://orchidfiles.com/im-tired-of-ai-generated-answers/ Comments URL: https://news.ycombinator.com/item?id=48292224 Points: 1842
Comments: 898
Claude Code as a Daily Driver: Claude.md, Skills, Subagents, Plugins, and MCPs
Article URL: https://arps18.github.io/posts/claude-code-mastery/ Comments URL: https://news.ycombinator.com/item?id=48289950 Points: 374
Comments: 228
AI tools are only as good as your judgment
Article URL: https://theaileverageweekly.com/posts/your-ai-tools-are-only-as-good-as-your-judgment-and-that-s-the-point.html Comments URL: https://news.ycombinator.com/item?id=48287649 Points: 81
Comments: 22
Bay Area mom out thousands after scammers use AI to mimic daughter’s voice
Article URL: https://abc7news.com/post/bay-area-mom-thousands-scammers-use-ai-mimic-daughters-voice-fake-kidnapping-part-growing-trend/19154381/ Comments URL: https://news.ycombinator.com/item?id=48285484 Points: 54
Comments: 22
2026-05-29
Generated: 2026/5/29 10:00:45 (UTC: 2026-05-29T02:00:45.745Z)
ProRL: Effective Reinforcement Learning for Proactive Recommendation via Rectified Policy Gradient Estimation
👍 76 · arXiv
Proactive Recommender Systems (PRSs) aim to guide user preference shift toward target items by generating paths of intermediate recommendations. Reinforcement learning (RL) provides a principled frame…
DenoiseRL: Bootstrapping Reasoning Models to Recover from Noisy Prefixes
👍 39 · arXiv
Reinforcement learning has become a central paradigm for advancing reasoning in large language models, yet most existing methods still depend on stronger teacher models or heavily curated difficult da…
GEM: Generative Supervision Helps Embodied Intelligence
👍 35 · arXiv
Embodied Vision-Language Models (VLMs) have demonstrated impressive performance and generalization in robotics, particularly within Vision-Language-Action frameworks. However, a significant gap remain…
MemTrace: Tracing and Attributing Errors in Large Language Model Memory Systems
👍 33 · arXiv
Memory is essential for enabling large language models to support long-horizon reasoning, yet existing memory systems remain unreliable and difficult to debug. Tracing memory’s dynamic evolution is cr…
ScientistOne: Towards Human-Level Autonomous Research via Chain-of-Evidence
👍 29 · arXiv
Autonomous research agents produce competitive solutions and professional-looking manuscripts, yet their outputs contain verifiability failures undetectable by surface-level evaluation: fabricated cit…
OpenClaw v2026.5.27
Highlights
- Stronger security and content boundaries: group prompt text is kept out of the system prompt, repeated-dot hostnames are normalized, side-effecting command wrappers and unsafe Node …
Link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.27
LangChain langchain-anthropic==1.4.4
Changes since langchain-anthropic==1.4.3
release(anthropic): 1.4.4 (#37757)
fix(anthropic): normalize cross-provider tool-call IDs (#37756)
test(anthropic): retry integration tests on transient failu…
Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-anthropic%3D%3D1.4.4
CrewAI 1.14.6
What’s Changed
Features
- Enhance StdioTransport to prevent environment variable leakage
- Enhance planning configuration and observation handling
- Declare env_vars on DatabricksQueryTool
- A…
Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.6
Goose v1.36.0
✨ Features
Link: https://github.com/aaif-goose/goose/releases/tag/v1.36.0
OpenAI Codex CLI rust-v0.135.0
New Features
- codex doctor now reports richer environment, Git, terminal, app-server, and thread inventory diagnostics for support cases. (#24261, #24311, #24305)
- /status shows remote connec…
Link: https://github.com/openai/codex/releases/tag/rust-v0.135.0
Glean’s top line crosses $300M as AI budget-cutting becomes its major selling point
The enterprise AI search startup tripled its annual revenue even as tech giants entered the category.
The internet is being rebuilt for machines
As AI agents move from experiments to production, AWS, Cloudflare, and others are redesigning cloud infrastructure for a future dominated by machine-generated internet traffic instead of human users.
Asana acquires no-code agent-builder StackAI
Asana will incorporate StackAI into its growing suite of AI workflow tools.
Anthropic raises $65 billion, nears $1T valuation ahead of IPO
Anthropic has closed a $65 billion Series H round at a $965 billion post-money valuation, marking what could be the AI startup’s final private fundraise before a highly anticipated IPO.
Just like gold and oil, we’ll soon be able to trade AI token futures
Large exchanges are designing derivative products around AI tokens, which are increasingly being considered less a computational output and more a raw material input, like electricity or bandwidth.
In just 3 weeks, StrictlyVC is coming to Los Angeles
StrictlyVC Los Angeles is on June 18. Join for meaningful networking and fireside chats with leaders from Mach Industries, Shinkei Systems, and more. Register today.
Anthropic releases Opus 4.8 with new ‘dynamic workflow’ tool
The new Opus model comes with a tool called Dynamic Workflows, for coordinating swarms of subagents.
How long is Anthropic’s lease with SpaceX? Opinions vary
Elon Musk is publicly reframing xAI’s massive Anthropic compute deal as short-term and cancellable, despite SpaceX’s own S-1 filing describing payments through May 2029.
2026-05-30
Generated: 2026/5/30 09:55:38 (UTC: 2026-05-30T01:55:38.275Z)
AgentDoG 1.5: A Lightweight and Scalable Alignment Framework for AI Agent Safety and Security
👍 81 · arXiv
Modern open-world agents such as OpenClaw exhibit powerful cross-environment execution capabilities yet introduce broad new safety risk sources. Meanwhile, advanced frontier AI models drastically lowe…
OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources
👍 54 · arXiv
Real-world information needs require access to structurally diverse knowledge sources, from unstructured text and relational tables to knowledge graphs and property graphs. Existing retrievers, howeve…
CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation
👍 49 · arXiv
Customized image editing aims to equip pre-trained diffusion models with specific visual effects using limited paired data, typically via Low-Rank Adaptation (LoRA). As the number of desired effects g…
minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models
👍 40 · arXiv
Recent video diffusion foundation models have achieved remarkable progress in high-quality video generation, yet turning them into real-time interactive video world models remains challenging. Interac…
YoCausal: How Far is Video Generation from World Model? A Causality Perspective
👍 32 · arXiv
As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly …
OpenClaw v2026.5.28-beta.4
Highlights
- Agent and Codex runtime recovery is steadier: subagents keep cwd/workspace separation, hook context stays prompt-local, session locks release on timeout abort, stale restart continu…
Link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.28-beta.4
LangChain langchain-anthropic==1.4.4
Changes since langchain-anthropic==1.4.3
release(anthropic): 1.4.4 (#37757)
fix(anthropic): normalize cross-provider tool-call IDs (#37756)
test(anthropic): retry integration tests on transient failu…
Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-anthropic%3D%3D1.4.4
vLLM v0.22.0
Highlights
This release features 459 commits from 230 contributors (63 new)!
- DeepSeek V4 maturity: DeepSeek V4 received a major hardening pass this cycle — the model was reorganized int…
Link: https://github.com/vllm-project/vllm/releases/tag/v0.22.0
CrewAI 1.14.6
What’s Changed
Features
- Enhance StdioTransport to prevent environment variable leakage
- Enhance planning configuration and observation handling
- Declare env_vars on DatabricksQueryTool
- A…
Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.6
OpenAI Codex CLI rust-v0.135.0
New Features
- codex doctor now reports richer environment, Git, terminal, app-server, and thread inventory diagnostics for support cases. (#24261, #24311, #24305)
- /status shows remote conn…
Link: https://github.com/openai/codex/releases/tag/rust-v0.135.0
Show HN: Tiny-vLLM – high performance LLM inference engine in C++ and CUDA
Article URL: https://github.com/jmaczan/tiny-vllm Comments URL: https://news.ycombinator.com/item?id=48328184 Points: 84
Comments: 7
Robinhood now lets your AI agents trade stocks
Article URL: https://techcrunch.com/2026/05/27/robinhood-now-lets-your-ai-agents-trade-stocks/ Comments URL: https://news.ycombinator.com/item?id=48326659 Points: 88
Comments: 164
Notes from the Mistral AI Now Summit
Article URL: https://koenvangilst.nl/lab/mistral-ai-now-summit Comments URL: https://news.ycombinator.com/item?id=48325340 Points: 310
Comments: 110
Liquid AI reveals 8B-A1B MoE trained on 38T
Article URL: https://www.liquid.ai/blog/lfm2-5-8b-a1b Comments URL: https://news.ycombinator.com/item?id=48325306 Points: 152
Comments: 52
CAPTCHAs can still detect AI agents
Article URL: https://research.roundtable.ai/captchas-detect-ai/ Comments URL: https://news.ycombinator.com/item?id=48324910 Points: 64
Comments: 52
Please Use AI
Article URL: https://shawnsmucker.substack.com/p/please-use-ai Comments URL: https://news.ycombinator.com/item?id=48323101 Points: 719
Comments: 372
Show HN: AISlop, a CLI for catching AI generated code smells
Hi, I’m Kenny, I’ve been building aislop. I started working on this after using Claude Code, codex and opencode several times and noticing some slop. These aren’t syntax errors and they pass most tests; they are patterns like empty catch blocks, useless comments, duplicated helpers, dead code and many more.
Expertise in the age of AI
Article URL: https://www.moderndescartes.com/essays/ai_and_expertise/ Comments URL: https://news.ycombinator.com/item?id=48322929 Points: 103