Weekly Report 2026-06-01 ~ 2026-06-07

Generated: 2026/6/7 13:24:11 (UTC: 2026-06-07T05:24:11.124Z)

This week's automatic summary was not enabled or the call failed; the raw content is merged below.

Generated: 2026/6/1 10:32:40 (UTC: 2026-06-01T02:32:40.346Z)

CollectionLoRA: Collecting 50 Effects in 1 LoRA via Multi-Teacher On-Policy Distillation

👍 53 · arXiv

Customized image editing aims to equip pre-trained diffusion models with specific visual effects using limited paired data, typically via Low-Rank Adaptation (LoRA). As the number of desired effects g…

YoCausal: How Far is Video Generation from World Model? A Causality Perspective

👍 41 · arXiv

As video diffusion models (VDMs) advance toward world models, a key question arises: do they truly understand causality, or merely overfit to statistical temporal patterns? Existing benchmarks mostly …

Why Far Looks Up: Probing Spatial Representation in Vision-Language Models

👍 40 · arXiv

Vision-language models (VLMs) achieve strong performance on spatial reasoning benchmarks, yet it remains unclear whether this reflects structured 3D understanding or reliance on statistical shortcuts …

EarlyTom: Early Token Compression Completes Fast Video Understanding

👍 27 · arXiv

Video large language models (Video-LLMs) have demonstrated strong capabilities in video understanding tasks. However, their practical deployment is still hindered by the inefficiency introduced by pro…

Skill0.5: Joint Skill Internalization and Utilization for Out-of-Distribution Generalization in Agentic Reinforcement Learning

👍 22 · arXiv

Equipping large language models with explicit skills has emerged as a promising paradigm for enabling autonomous agents to solve complex tasks. Agent skills can be inherently divided into general skil…

  • Agents and CLI-backed runtimes recover more cleanly from interrupted tool calls, stale session bindings, compaction handoffs, and media delivery retries. (#88129, #88136, #88141, #8…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.5.31-beta.4

Release 0.136.0-alpha.2

Link: https://github.com/openai/codex/releases/tag/rust-v0.136.0-alpha.2

Article URL: https://darylcecile.net/notes/speed-of-prototyping-age-of-ai Comments URL: https://news.ycombinator.com/item?id=48347153 Points: 118

Source: Hacker News AI

Article URL: https://github.com/pewdiepie-archdaemon/odysseus Comments URL: https://news.ycombinator.com/item?id=48346693 Points: 119

Source: Hacker News AI

The solution might be cancelling my AI subscription

Article URL: https://thoughts.hmmz.org/2026-05-31.html Comments URL: https://news.ycombinator.com/item?id=48345896 Points: 343

Source: Hacker News AI

The people who actually want AI to replace humanity

Article URL: https://www.vox.com/future-perfect/489976/ai-successionism-transhumanism-posthumanism Comments URL: https://news.ycombinator.com/item?id=48345881 Points: 73

Source: Hacker News AI

AI grifters are creating fake Black people to sell Shein junk

Article URL: https://www.theverge.com/ai-artificial-intelligence/938844/ai-tiktok-shop-blackface-shein-dropshipping Comments URL: https://news.ycombinator.com/item?id=48341921 Points: 50

Source: Hacker News AI

To have a moral stance on AI is to be an outcast, and it sucks

Article URL: https://musings.martyn.berlin/to-have-a-moral-stance-on-ai-is-to-be-an-outcast-and-it-sucks Comments URL: https://news.ycombinator.com/item?id=48337676 Points: 141

Source: Hacker News AI

AI job grief: A psychological crisis hitting tech workers

Article URL: https://jackmaguire.org/blog/ai-job-grief/ Comments URL: https://news.ycombinator.com/item?id=48336760 Points: 190

Source: Hacker News AI

Anthropic surpasses OpenAI to become most valuable AI startup

Article URL: https://qazinform.com/news/anthropic-surpasses-openai-to-become-worlds-most-valuable-ai-startup Comments URL: https://news.ycombinator.com/item?id=48336233 Points: 415

Source: Hacker News AI


Generated: 2026/6/2 10:31:12 (UTC: 2026-06-02T02:31:12.467Z)

GrepSeek: Training Search Agents for Direct Corpus Interaction

👍 88 · arXiv

Large Language Model (LLM) search agents have shown strong promise for knowledge-intensive language tasks through multiple rounds of reasoning and information retrieval. Most existing systems access i…

Trust-Region Behavior Blending for On-Policy Distillation

👍 51 · arXiv

On-policy distillation (OPD) trains a student on prefixes sampled from its own policy while matching a stronger teacher. This addresses the prefix mismatch of offline distillation, but early student r…

👍 35 · arXiv

We present Mellum 2, an open-weight 12B-parameter Mixture-of-Experts (MoE) language model with 2.5B active parameters per token. Mellum 2 is a general-purpose language model specialized in software en…

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

👍 27 · arXiv

Real-time and accurate spatial audio generation is pivotal for delivering an immersive experience. However, existing spatial audio synthesis technologies are often encumbered by a tradeoff between gen…

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

👍 25 · arXiv

Recent advances in speech generation have enabled high-fidelity synthesis, yet systematic evaluation of models under long-context conditions remains largely underexplored. A comprehensive evaluation b…

  • Agents and CLI-backed runtimes recover more cleanly from interrupted tool calls, stale session bindings, compaction handoffs, and media delivery retries. (#88129, #88136…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.1-beta.2

  • TUI markdown now keeps web links clickable with OSC 8 metadata, and cramped tables switch to readable key/value records without losing link targets. (#24472, #24636, #24825)
  • Sessi…

Link: https://github.com/openai/codex/releases/tag/rust-v0.136.0

Alphabet plans to raise $80B to pay for AI buildout

“The company is experiencing strong demand for its AI solutions and services from enterprises and consumers, at levels that are exceeding the company’s available supply,” Alphabet said in its statement.

Source: TechCrunch AI

Nvidia chases $200B CPU market with AI agent PCs from Microsoft, Dell, and HP

If Nvidia has cracked a way to bring AI agents easily, safely, and usefully to the masses, it could — and should — be big.

Source: TechCrunch AI

Florida sues OpenAI, Sam Altman, in first-of-its-kind lawsuit over violent incidents

The lawsuit partially revolves around a shooting at Florida State University last year, and ChatGPT’s alleged role in the incident.

Source: TechCrunch AI

Water access is now a risk factor in SpaceX’s IPO

The company says it needs “significant” water resources to cool its data centers, and that access to abundant, affordable water is a challenge.

Source: TechCrunch AI

Anthropic, now an AI powerhouse that has landed top-tier enterprise customers, was once considered an underdog in the emerging world of large language models.

Source: TechCrunch AI

This AI weather startup is out-forecasting government agencies

WindBorne benefits from its unique combination of model-building and data collection. The company now has about 400 balloons in flight gathering sensor readings at any given time, launched from 15 sites around the globe. The advances in its current model come from improvements in how the data collec

Source: TechCrunch AI

DuckDuckGo makes its ‘no-AI’ search engine easier to access as its traffic booms

Alternative search engine DuckDuckGo launches ‘no AI’ web extensions for Chrome and Firefox users.

Source: TechCrunch AI

Erin Brockovich takes aim at data center secrecy

Environmental activist Erin Brockovich has a new mission.

Source: TechCrunch AI


Generated: 2026/6/3 10:37:21 (UTC: 2026-06-03T02:37:21.585Z)

A Matter of TASTE: Improving Coverage and Difficulty of Agent Benchmarks

👍 55 · arXiv

As agent capabilities advance, existing benchmarks, such as τ²-Bench, are becoming increasingly saturated. Yet constructing new benchmark tasks remains complex, costly, and labor-intensive. Moreover,…

Domino: Decoupling Causal Modeling from Autoregressive Drafting in Speculative Decoding

👍 32 · arXiv

Speculative decoding accelerates LLM inference by drafting multiple tokens and verifying them in parallel with the target model. However, its practical speedup is constrained by the trade-off between …

Harness-1: Reinforcement Learning for Search Agents with State-Externalizing Harnesses

👍 31 · arXiv

Search agents are often trained as policies over growing transcripts: the model must decide how to search while also remembering what it has seen, which evidence is useful, which constraints remain op…

Linear Ensembles Wash Away Watermarks: On the Fragility of Distributional Perturbations in LLMs

👍 25 · arXiv

Watermarking embeds statistical signatures in AI-generated text for detection and attribution. We reveal a fundamental vulnerability: when users access multiple models (today’s reality), watermarks tr…

LVSA: Training-Free Sparse Attention for Long Video Diffusion

👍 12 · arXiv

Dense self-attention is the compute and quality bottleneck of long-video diffusion inference: cost grows quadratically with the sequence length, and beyond the training horizon the model converges to …

  • Agents and CLI-backed runtimes recover more cleanly from interrupted tool calls, stale session bindings, compaction handoffs, and media delivery retries. (#88129, #88136, #88141, #8…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.1-beta.2

Changes since langchain==1.3.3

release(langchain): 1.3.4 (#37861) fix(langchain): improve HITL rejection guidance (#37859)…

Link: https://github.com/langchain-ai/langchain/releases/tag/langchain%3D%3D1.3.4

Link: https://github.com/ollama/ollama/releases/tag/v0.30.2

Release 0.137.0-alpha.4

Link: https://github.com/openai/codex/releases/tag/rust-v0.137.0-alpha.4

More than 6 out of 10 people turn to AI for psychological support

Article URL: https://www.axa.com/en/press/press-releases/2026-mind-health-report Comments URL: https://news.ycombinator.com/item?id=48377854 Points: 58

Source: Hacker News AI

AI outperforms law professors in Stanford Law study

https://law.stanford.edu/wp-content/uploads/2026/06/salinas_…

Comments URL: https://news.ycombinator.com/item?id=48377761 Points: 104

Source: Hacker News AI

Article URL: https://julienreszka.com/blog/rss-is-back-ai-agents-are-reading-it/ Comments URL: https://news.ycombinator.com/item?id=48375673 Points: 60

Source: Hacker News AI

Uber caps employee AI spending after blowing through budget in four months

Article URL: https://techcrunch.com/2026/06/02/uber-caps-employee-ai-spending-after-blowing-through-budget-in-four-months/ Comments URL: https://news.ycombinator.com/item?id=48375544 Points: 61

Source: Hacker News AI

Microsoft announces Scout, an autonomous AI agent built on OpenClaw

https://www.microsoft.com/en-us/microsoft-365/blog/2026/06/0… https://www.404media.co/microsoft-wants-to-make-people-addic… https://www.wired.com/story/meet-microsoft-scout-your-ai-cow… (https://web.archive.org/web/20260602180553/https://www.wired…)

Comments URL: https://news.ycombinator.com/

Source: Hacker News AI

Trump signs downsized AI order after weeks of reversals

https://www.whitehouse.gov/presidential-actions/2026/06/prom… https://www.nytimes.com/2026/06/02/technology/trump-executiv

Comments URL: https://news.ycombinator.com/item?id=48372628 Points: 178

Source: Hacker News AI

Americans don’t know how to fight AI so they’re fighting data centers

Article URL: https://www.vox.com/future-perfect/490350/data-center-moratoria-ai-backlash Comments URL: https://news.ycombinator.com/item?id=48371592 Points: 114

Source: Hacker News AI

Article URL: https://www.wheresyoured.at/ai-doesnt-have-roi/ Comments URL: https://news.ycombinator.com/item?id=48370437 Points: 58

Source: Hacker News AI


Generated: 2026/6/4 10:33:10 (UTC: 2026-06-04T02:33:10.245Z)

OCC-RAG: Optimal Cognitive Core for Faithful Question Answering

👍 73 · arXiv

Recent progress in the development of language models has been defined by scale, with each generation absorbing more of the world’s knowledge into its weights. However, many practical applications ben…

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain

👍 42 · arXiv

Identifying which brain regions represent a visual concept in the human brain is a central challenge in neuroscience. Existing approaches have localized coarse functional regions (e.g., faces, places)…

KVarN: Variance-Normalized KV-Cache Quantization Mitigates Error Accumulation in Reasoning Tasks

👍 27 · arXiv

Test-time scaling is a powerful approach to obtain better reasoning in large language models, but it becomes memory-bottlenecked during long-horizon decoding, as the KV-cache grows. KV-cache quantizat…

A Local Perturbation Theory for Cross-Domain Interference and Recovery in Multi-Domain RL

👍 24 · arXiv

Reinforcement learning (RL) post-training improves large language models (LLMs) on individual domains such as mathematical reasoning, code generation, question answering, and creative writing (CW), bu…

World Models Meet Language Models: On the Complementarity of Concrete and Abstract Reasoning

👍 23 · arXiv

World models and multimodal large language models (MLLMs) provide complementary capabilities for predicting future outcomes from static visual observations. World models can generate concrete visual r…

  • Plugin and skill installs now use an operator install policy instead of the old dangerous-code scanner path, with clearer doctor, CLI, ClawHub, and troubleshooting surfa…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.2-beta.1

Changes since langchain-deepseek==1.0.1

chore(infra): bump langchain-tests floor to 1.1.9 (#37610) chore: bump idna from 3.10 to 3.15 in /libs/partners/deepseek (#37560) ci(infra): harden Dependabo…

Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-deepseek%3D%3D1.1.0

Link: https://github.com/ollama/ollama/releases/tag/v0.30.4

  • Add crew trained agents file support
  • Add native Snowflake Cortex LLM provider
  • Add Databricks integration guide
  • Add Snowflake integration guide

Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.7a1

Link: https://github.com/aaif-goose/goose/releases/tag/v1.37.0

  • TUI controls now support F13-F24 keybindings, paste in searchable menus, and a compact reasoning-only status/title item (#25329, #25400, #25504).
  • Enterprise/admin flows now show mo…

Link: https://github.com/openai/codex/releases/tag/rust-v0.137.0

Lovable signs multiyear deal with Google Cloud to up usage 5x, source says

Lovable and Google signed an expanded multiyear deal that involves a 5x expansion of Lovable’s footprint on Google Cloud, and expanded access to Anthropic Claude.

Source: TechCrunch AI

Alphabet’s record-breaking $85B raise for Google’s AI business is a helluva good signal

If Alphabet’s record-breaking $85 billion stock sale signals investor appetite for AI-related offerings, we can see that investors are ready to chow.

Source: TechCrunch AI

Google’s Dreambeans, its weirdest-named AI tool to date, will turn your life into a cartoon

Dreambeans is a curated list of AI-illustrated “stories” culled from the personal data in your Google account.

Source: TechCrunch AI

Amazon will show AI product images when you search for some reason

Amazon will use visual search and AI to show AI-generated product images that match your search queries. The retailer says it will help guide users to products.

Source: TechCrunch AI

These two founders left Goldman and Meta to build voice AI for markets everyone else overlooked

The startup’s own stack for Africa and Middle East is now handling more than 17,000 calls per day.

Source: TechCrunch AI

Publishers will be able to opt out of AI Search, thanks to new regulation

U.K. regulators are requiring Google to offer a tool allowing website publishers to opt out of generative AI search features. The option will be tested in the U.K. then rolled out globally.

Source: TechCrunch AI

Meta’s AI agent for WhatsApp Business is now available globally

WhatsApp will charge businesses for using its AI agent based on token usage.

Source: TechCrunch AI

Coralogix raises $200M on bet that someone needs to watch the AI agents

Coralogix is among a growing number of infrastructure firms betting that as AI systems move into production, demand will rise for tools that can monitor their behavior, troubleshoot failures, and provide the operational data needed to keep them running reliably.

Source: TechCrunch AI


Generated: 2026/6/5 10:07:55 (UTC: 2026-06-05T02:07:55.012Z)

👍 88 · arXiv

Audio is an inherently interactive modality, yet today’s Large Audio Language Models (LALMs) are offline, and streaming audio models each handle only a single task such as streaming ASR or voice chatt…

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

👍 45 · arXiv

Deep-research agents solve tasks through long trajectories of search, tool use, evidence inspection, and answer synthesis. Evaluation based on final answers shows whether an agent succeeds, but not wh…

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

👍 35 · arXiv

Rubric-based reinforcement learning (RL) uses an LLM-as-a-Judge (LaaJ) to score model outputs according to rubrics as rewards. However, policy models may exploit latent biases in the judge, leading to…

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

👍 29 · arXiv

Multimodal agents in robotics, AR, and autonomous driving must reason about places and layouts from continuous egocentric streams, often using evidence outside the current view. Existing benchmarks ei…

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

👍 24 · arXiv

Large Reasoning Models (LRMs) have achieved remarkable progress thanks to Reinforcement Learning with Verifiable Rewards (RLVR) on Chain-of-Thoughts (CoTs). However, since long CoTs naturally contain …

  • Plugin and skill installs now use an operator install policy instead of the old dangerous-code scanner path, with clearer doctor, CLI, ClawHub, and troubleshooting surfa…

Link: https://github.com/openclaw/openclaw/releases/tag/v2026.6.2-beta.1

Changes since langchain-deepseek==1.0.1

chore(infra): bump langchain-tests floor to 1.1.9 (#37610) chore: bump idna from 3.10 to 3.15 in /libs/partners/deepseek (#37560) ci(infra): harden Dependabo…

Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-deepseek%3D%3D1.1.0

Full Changelog: https://g

Link: https://github.com/ollama/ollama/releases/tag/v0.30.5

  • Add crew trained agents file support
  • Add native Snowflake Cortex LLM provider
  • Add Databricks integration guide
  • Add Snowflake integration guide

Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.7a1

Link: https://github.com/aaif-goose/goose/releases/tag/v1.37.0

Release 0.138.0-alpha.4

Link: https://github.com/openai/codex/releases/tag/rust-v0.138.0-alpha.4

Anthropic’s open-source framework for AI-powered vulnerability discovery

Article URL: https://github.com/anthropics/defending-code-reference-harness Comments URL: https://news.ycombinator.com/item?id=48403980 Points: 273

Source: Hacker News AI

When AI Builds Itself: Our progress toward recursive self-improvement

Article URL: https://www.anthropic.com/institute/recursive-self-improvement Comments URL: https://news.ycombinator.com/item?id=48400842 Points: 336

Source: Hacker News AI

Google employees internally share memes about how its AI sucks

Article URL: https://www.404media.co/google-employees-internally-share-memes-about-how-its-ai-sucks/ Comments URL: https://news.ycombinator.com/item?id=48400311 Points: 147

Source: Hacker News AI

The LLM warnings Google fired Timnit Gebru over have all come true

Article URL: https://www.tumblr.com/dreaminginthedeepsouth/817865966907228160/darren-oconnor-timnit-gebru-was-fired-from Comments URL: https://news.ycombinator.com/item?id=48400213 Points: 105

Source: Hacker News AI

Article URL: https://www.ashbyhq.com/blog/engineering/ai-ashby-engineering-and-the-future Comments URL: https://news.ycombinator.com/item?id=48399528 Points: 56

Source: Hacker News AI

Show HN: Boxes.dev: ditch localhost; run Claude Code and Codex in the cloud

Hi HN, we’re Nick and Drew, and we’re building boxes.dev – the first cloud-only agentic dev environment (ADE) that gives every Codex and Claude Code agent its own cloud computer. We’re two engineers who previously built Gem (co-founder/CTO and first hire), and we spent the last year coding almost exc

Source: Hacker News AI

The ways we contain Claude across products

Article URL: https://www.anthropic.com/engineering/how-we-contain-claude Comments URL: https://news.ycombinator.com/item?id=48392082 Points: 221

Source: Hacker News AI

Failing grades soar with AI usage, dwindling math skills in Berkeley CS classes

Article URL: https://www.dailycal.org/news/campus/academics/failing-grades-soar-as-professors-see-greater-ai-usage-dwindling-math-skills-in-uc-berkeley/article_16fad0bf-02cb-4b8c-8d88-888ffd9f8608.html Comments URL: https://news.ycombinator.com/item?id=48392004 Points: 733

Source: Hacker News AI


Generated: 2026/6/6 09:59:19 (UTC: 2026-06-06T01:59:19.490Z)

Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

👍 46 · arXiv

Code language models need repository-level context to resolve imports, APIs, and project conventions. Existing methods inject this knowledge as long inputs (retrieved through RAG or dependency analysi…

ArcANE: Do Role-Playing Language Agents Stay in Character at the Right Time?

👍 40 · arXiv

Role-playing language agents (RPLAs) should play characters whose values and behavior evolve as the story progresses, not maintain a fixed persona. Existing benchmarks measure factual recall at a give…

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

👍 36 · arXiv

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many o…

AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints

👍 32 · arXiv

Planning for real-world problems by language models often involves both world and user constraints, which may not be fully specified upfront and are progressively disclosed through interaction. Howeve…

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

👍 31 · arXiv

We introduce VideoKR, the first large-scale training corpus specifically designed to strengthen knowledge- and reasoning-intensive video understanding. It comprises 315K video reasoning examples over …

Changes since langchain-perplexity==1.3.1

release(perplexity): 1.3.2 (#37925) fix(perplexity): serialize ToolMessage and AIMessage.tool_calls (#37911)…

Link: https://github.com/langchain-ai/langchain/releases/tag/langchain-perplexity%3D%3D1.3.2

This release features 8 commits from 6 contributors (1 new)!

v0.22.1 is a patch release on top of v0.22.0 with targeted bug fixes plus a couple of additions: new model support for …

Link: https://github.com/vllm-project/vllm/releases/tag/v0.22.1

  • ollama launch omp now integrates with Oh My Pi, an AI coding agent with IDE integration
  • MLX embedding layers now use NVFP4 global scale for improved quantiz…

Link: https://github.com/ollama/ollama/releases/tag/v0.30.6

  • Add conversational flow traces support.
  • Update conversational flow documentation to utilize handle_turn.
  • Surface real finish_reason, sampling parameters, and …

Link: https://github.com/crewAIInc/crewAI/releases/tag/1.14.7a2

Link: https://github.com/openai/codex/releases/tag/rusty-v8-v149.2.0

Microsoft wants users to be addicted to Scout, their AI personal assistant

Article URL: https://disassociated.com/microsoft-users-addicted-ai-personal-assistant/ Comments URL: https://news.ycombinator.com/item?id=48419023 Points: 67

Source: Hacker News AI

Article URL: https://elijahpotter.dev/articles/hacker-news-sans-AI Comments URL: https://news.ycombinator.com/item?id=48417916 Points: 148

Source: Hacker News AI

Leak Reveals Microsoft Wants Its AI to Be ‘Addictive’

Article URL: https://kotaku.com/microsoft-ai-scout-addictive-satya-nadella-404-media-copilot-2000702924 Comments URL: https://news.ycombinator.com/item?id=48413924 Points: 66

Source: Hacker News AI

Ask HN: What is your (AI) dev tech stack / workflow?

Hello, happy Friday! I am looking to do some in-person “developer boot-up” workshops, and seek your suggestions for “modern tooling”. The backgrounds of the participants range from motivated newbie (“I heard you can make your own app with AI!”) to existing software developers who want to get up to spee

Source: Hacker News AI

Article URL: https://alexispurslane.github.io/rsync-analysis/ Comments URL: https://news.ycombinator.com/item?id=48411635 Points: 307

Source: Hacker News AI

Programmers will document for Claude, but not for each other

Article URL: https://blog.plover.com/2026/03/09/#documentation-wins-2 Comments URL: https://news.ycombinator.com/item?id=48411510 Points: 176

Source: Hacker News AI

Show HN: Lowfat – pluggable CLI filter that saved 91.8% of my LLM tokens

Hi HN, not sure if anyone would be interested, but just wanted to share that I’ve been maintaining my small tool called ‘lowfat’ that helps me filter some of my verbose CLI output. It’s a single binary, works as an agent hook or a shell wrapper. It has a plugin system to customize filters per comma

Source: Hacker News AI

Fine-tuning an LLM to write docs like it’s 1995

Article URL: https://passo.uno/fine-tuning-docs-llm/ Comments URL: https://news.ycombinator.com/item?id=48408442 Points: 176

Source: Hacker News AI