AI Dispatch 2026-06-13

Generated: 2026/6/13 10:07:34 (UTC: 2026-06-13T02:07:34.635Z)

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

👍 105 · arXiv

Large language model (LLM) agents have achieved strong performance on a wide range of benchmarks, yet most evaluations assume static environments. In contrast, real-world deployment is inherently dyna…

👍 83 · arXiv

Ultra-long-context capability is becoming indispensable for frontier LLMs: agentic workflows, repository-scale code reasoning, and persistent memory all require the model to jointly attend over hundre…

SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning

👍 80 · arXiv

Spatial reasoning, the ability to determine where objects are, how they relate, and how they move in 3D, remains a fundamental challenge for vision-language models (VLMs). Tool-augmented agents attemp…

InterleaveThinker: Reinforcing Agentic Interleaved Generation

👍 73 · arXiv

Recent image generators have demonstrated impressive photorealism and instruction-following capabilities in single-image generation and editing. However, constrained by their architectures, they canno…

Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

👍 71 · arXiv

Multimodal Large Language Models (MLLMs) have demonstrated remarkable success in visual understanding, yet their performance degrades significantly under real-world visual corruptions. While existing …

  • Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender ch…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.6

Changes since langchain==1.3.8

release(anthropic): 1.4.6 (#38105) release(langchain): 1.3.9 (#38104) fix(langchain,anthropic): confine file-search results and tighten anthropic allowed_prefixes (#3…

Link: https://github.com/langchain-ai/langchain/releases/tag/langchain%3D%3D1.3.9

Please note that Minimax M3 is not yet supported in this version. Please follow the vLLM recipe for M3 usage guides.

Link: https://github.com/vllm-project/vllm/releases/tag/v0.23.0

  • Fixed ollama launch selecting the wrong provider in some cases
  • Improved prompt caching by decoupling it from context shift for better KV cache reuse
  • More stable MLX infere…

Link: https://github.com/ollama/ollama/releases/tag/v0.30.8

  • Add pluggable default backends for memory, knowledge, rag, and flow.
  • Surface real finish_reason, sampling params, and response.id on LLM events.
  • Type DSL triggers…

Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.7

Release 0.140.0-alpha.17

Link: https://github.com/openai/codex/releases/tag/rust-v0.140.0-alpha.17

How to setup a local coding agent on macOS

Article URL: https://ikyle.me/blog/2026/how-to-setup-a-local-coding-agent-on-macos Comments URL: https://news.ycombinator.com/item?id=48507020 Points: 266

Source: Hacker News AI

Show HN: Script to bulk delete Claude chats from the web UI

I haven’t found a way to delete all chats in bulk like you can on ChatGPT. With Claude, you have to scroll to the bottom, select everything, and delete. The problem is that once you have a lot of chats, this becomes impossible. So I created this script, which does it on its own. I hope it helps someone. (conversations

Source: Hacker News AI

Slightly reducing the sloppiness of AI generated front end

Article URL: https://envs.net/~volpe/blog/posts/reduce-slop.html Comments URL: https://news.ycombinator.com/item?id=48504912 Points: 165

Source: Hacker News AI

AI agent bankrupted their operator while trying to scan DN42

Article URL: https://lantian.pub/en/article/fun/ai-agent-bankrupted-their-operator-scan-dn42lantian.lantian/ Comments URL: https://news.ycombinator.com/item?id=48500012 Points: 1394

Source: Hacker News AI

Article URL: https://simonwillison.net/2026/Jun/11/fable-is-relentlessly-proactive/ Comments URL: https://news.ycombinator.com/item?id=48498573 Points: 727

Source: Hacker News AI

Shall we play a game? My AI nuclear simulation

https://arxiv.org/pdf/2602.14740

Comments URL: https://news.ycombinator.com/item?id=48495575 Points: 204

Source: Hacker News AI

Article URL: https://xenodium.com/agent-shell-0-55-updates Comments URL: https://news.ycombinator.com/item?id=48493273 Points: 62

Source: Hacker News AI

Claude Fable 5: mid-tier results on coding tasks

Article URL: https://www.endorlabs.com/learn/claude-fable-5-mythos-grade-hype Comments URL: https://news.ycombinator.com/item?id=48492210 Points: 394

Source: Hacker News AI