2026-06-01 10:46:01 +08:00
parent 2faf4bb002
commit e96b955fda
221 changed files with 10219 additions and 332 deletions

log.md

@@ -9,6 +9,121 @@
## [2026-05-31] ingest | ToolCUA: Optimal GUI-Tool Path Orchestration (arXiv:2605.12481, 2026)
- Added paper [[toolcua-optimal-gui-tool-orchestration]]: "ToolCUA: Optimal GUI-Tool Path Orchestration for CUAs" - learns the optimal switching policy over a hybrid GUI-Tool action space via synthetic data plus staged RL
- Added 8 concept pages: [[computer-use-agents]], [[gui-tool-hybrid-action-space]], [[optimal-gui-tool-path-selection]], [[interleaved-gui-tool-trajectory-scaling]], [[tool-bootstrapped-rft]], [[tool-efficient-path-reward]], [[osworld-mcp]], [[next-state-grounding]]
- Source: https://arxiv.org/abs/2605.12481
- Code: https://github.com/X-PLUG/ToolCUA
## [2026-05-30] — ingest-supplement | Agent Harness Engineering: A Survey (TMLR 2026)
- Supplemented 8 concept pages: [[agent-observability]], [[agent-verification]], [[agent-governance]], [[practitioner-research-gap]], [[agent-sandbox]], [[context-drift]], [[three-engineering-phases]], [[multi-agent-orchestration]]
- Saved the full PDF to raw/papers/agent-harness-engineering-survey-2026.pdf
- The original paper was partially integrated on 2026-05-23 (paper main page + 17 core concepts); this pass adds the standalone ETCLOVG layer concepts and the cross-layer concepts
## [2026-05-29] ingest | Agent Symbolic Learning (arXiv:2406.18532, arXiv cs.CL 2024)
- Added paper [[zhou-agent-symbolic-learning-2024]]: "Symbolic Learning Enables Self-Evolving Agents" - treats the agent as a symbolic network and imitates BP + GD to achieve self-evolution; an important precursor to SkillOpt / Heuristic Learning
- Added 6 concept pages: [[agent-symbolic-learning]], [[symbolic-network]], [[language-gradient]], [[language-loss]], [[symbolic-backpropagation]], [[self-evolving-agents]]
- Source: https://arxiv.org/abs/2406.18532
- Authors: Wangchunshu Zhou et al. (AIWaves)
## [2026-05-29] ingest | UltraData L3 Open-Sourcing and Tiered Data Governance (Datawhale, ModelBest / 面壁智能)
- Added article [[ultradata-l3-open-source-2026]]: "UltraData: ModelBest's L3 Data Open-Sourcing and the L0-L4 Tiered Data Governance System" - 600B synthetic data + 10M-scale SFT data; MiniCPM5-1B takes the top spot
- Added 6 concept pages: [[data-hierarchical-governance]], [[ultradata]], [[synthetic-data-qa-generation]], [[stage-matched-data-config]], [[deep-thinking-sft]], [[data-quality-over-scale]]
- Source: https://mp.weixin.qq.com/s/5jV2jYuXJloKX5IWCzrSpw
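## [2026-05-29] ingest | SkillOpt Deep Dive (WeChat Official Account, Lyu Ming / 吕明, ~12,000 characters)
- Added article [[lyu-skillopt-deep-dive-2026]]: "SkillOpt Deep Dive: 'Backpropagation' for Self-Evolving Agent Skills and Engineered Continued Evolve" - the deep divide between text optimization and weight optimization, controlled autonomy, the data flywheel, dual-layer RL
- Added 5 concept pages: [[text-vs-weight-optimization]], [[controlled-autonomy]], [[skill-data-flywheel]], [[skill-ecosystem]], [[dual-layer-rl]]
- Source: https://mp.weixin.qq.com/s/s__fdyXQG932SavQeeugcw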
## [2026-05-29] ingest | SkillOpt (arXiv:2605.23904, arXiv cs.AI 2026)
- Added paper [[yang-skillopt-2026]]: "SkillOpt: Executive Strategy for Self-Evolving Agent Skills" - the first systematic text-space optimizer for Agent Skills; best on 52/52, +23.5 pts on average
- Added 7 concept pages: [[skillopt]], [[text-space-optimizer]], [[textual-learning-rate]], [[held-out-validation-gate]], [[rejected-edit-buffer]], [[slow-meta-update]], [[skill-as-external-state]]
- Source: https://arxiv.org/abs/2605.23904
- Authors: Yifan Yang et al. (Microsoft, SJTU, Tongji, Fudan)
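## [2026-05-29] ingest | The Evolving Relationship Between Model and Harness (WeChat Official Account, Lyu Ming / 吕明)
- Added article [[lyu-model-harness-evolution-2026]]: "The Evolving Relationship Between Model and Harness: From AutoHarness to Heuristic Learning" - the three pillars of GenAI, the unification of strategy and engineering, the new compiled-AI paradigm
- Added 6 concept pages: [[model-harness-relationship]], [[harness-engineering]], [[heuristic-learning]], [[strategy-engineering-unification]], [[compiled-ai-paradigm]], [[generative-general-unification]]
- Source: https://mp.weixin.qq.com/s/PglkqhlSoI7LEOb3AOHl8g
- Author: Lyu Ming (吕明)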
## [2026-05-29] ingest | AutoHarness (arXiv:2603.03329, arXiv cs.CL 2026)
- Added paper [[lou-autoharness-2026]]: "AutoHarness: improving LLM agents by automatically synthesizing a code harness" - automatically synthesizes a code harness to eliminate illegal agent actions; Code-as-Policy outperforms GPT-5.2-High
- Added 7 concept pages: [[autoharness]], [[code-as-harness]], [[harness-as-action-verifier]], [[harness-as-policy]], [[thompson-sampling-code-search]], [[iterative-code-refinement]], [[action-applicability]]
- Source: https://arxiv.org/abs/2603.03329
- Authors: Xinghua Lou, Miguel Lázaro-Gredilla, Antoine Dedieu, Carter Wendelken, Wolfgang Lehrach, Kevin P. Murphy (Google DeepMind)
## [2026-05-29] ingest | Distributed Agent Cache Synchronization (WeChat Official Account)
- Added article [[distributed-agent-cache-sync-2026]]: "Distributed Agent Cache Synchronization" - industrial-grade engineering practice for a multi-machine distributed Prompt Caching architecture, in a quantitative trading setting
- Added 10 concept pages: [[distributed-prompt-caching]], [[cache-cold-start]], [[global-context-hash-tree]], [[active-cache-warmup]], [[shadow-calling]], [[distributed-optimistic-locking]], [[bypass-network-handle-distribution]], [[context-pruning]], [[trading-lifecycle-driven-eviction]], [[distributed-cache-routing]]
- Source: https://mp.weixin.qq.com/s/MUWV7eug14bktUMlqsxfQw
- Type: WeChat Official Account technical article (LLM + quantitative trading series)
## [2026-05-29] ingest | Token Superposition Training (arXiv:2605.06546, arXiv cs.CL 2026)
- Added paper [[peng-tst-2026]]: "Efficient Pre-Training with Token Superposition" - TST, a two-phase pre-training method; 2.5× training speedup at equal loss
- Added 7 concept pages: [[token-superposition-training]], [[multi-hot-cross-entropy]], [[input-superposition]], [[two-phase-pretraining]], [[representation-alignment]], [[coarse-to-fine-granularity]], [[throughput-hypothesis]]
- Source: https://arxiv.org/abs/2605.06546
- Authors: Bowen Peng, Théo Gigant, Jeffrey Quesnelle (Nous Research)
## [2026-05-26] ingest | The Bayesian Geometry of Transformer Attention (arXiv:2512.22471, 2026)
- Added paper [[agarwal-bayesian-attention-geometry]]: "The Bayesian Geometry of Transformer Attention" - Bayesian Attention Trilogy, Paper I
- Added 8 concept pages: [[bayesian-wind-tunnels]], [[inference-primitives]], [[belief-accumulation]], [[belief-transport]], [[random-access-binding]], [[primitive-completeness]], [[bayesian-attention-geometry]], [[bayesian-attention-trilogy]]
- Source: https://arxiv.org/abs/2512.22471
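## [2026-05-26] ingest | Survey of Time-Series Forecasting Augmentation Methods: TPS (WeChat Article, DeepHub/数据派THU, 2026)
- Added article [[temporal-patch-shuffle-tps]]: "A Survey of Augmentation Methods for Time-Series Forecasting: From the Frequency Domain to TPS" - covers four families of methods: frequency-domain, time-frequency-domain, decomposition, and patch-based
- Added 8 concept pages: [[temporal-patch-shuffle]], [[time-series-forecasting-augmentation]], [[data-label-consistency]], [[freqmask-freqmix]], [[wavemask-wavemix]], [[dominant-shuffle]], [[staug]], [[forecasting-augmentation-taxonomy]]
- Source: https://mp.weixin.qq.com/s/hPvx3OflUva1olME9F8FoA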
## [2026-05-26] ingest | Building a Mini Agent Harness from Scratch (WeChat Article, 2026)
- Added article [[mini-agent-harness]]: "Building a Mini Agent Harness from Scratch" - by Chen Sizhou (陈思州) / Datawhale
- Added 8 concept pages: [[agent-harness-mini]], [[agent-eval-trace]], [[agent-eval-grader]], [[agent-eval-case-design]], [[agent-computer-interface]], [[terminal-bench]], [[anthropic-agent-evals]], [[swe-bench]]
- Source: https://mp.weixin.qq.com/s/yVFQej3dFk9KHv6J2u6Lew
## [2026-05-23] ingest | Generative Recursive Reasoning (GRAM) (arXiv:2605.19376, 2026)
- Added paper [[gram-generative-recursive-reasoning-paper]]: "Generative Recursive Reasoning" - upgrades deterministic recursive reasoning to probabilistic multi-trajectory computation (Baek, Jo, Kim, Ren, Bengio, Ahn; KAIST/Mila/NYU/UdeM)
- Added 11 concept pages: [[recursive-reasoning-models]], [[gram-generative-recursive-reasoning]], [[stochastic-latent-trajectory]], [[multi-trajectory-inference]], [[inference-time-scaling]], [[width-based-scaling]], [[latent-variable-generative-model]], [[amortized-variational-inference]], [[deep-and-wide-reasoning]], [[multi-solution-recovery]], [[unconditional-generation-latent]]
- Source: https://arxiv.org/abs/2605.19376
- Wiki size: 406 → 418 pages
## [2026-05-23] ingest | Claw-Eval: An End-to-End Evaluation Framework for Autonomous Agents (ModelScope)
- Added article [[claw-eval]]: "Claw-Eval" - 300 human-verified tasks, three-dimensional Completion/Safety/Robustness scoring, evaluation of 14 frontier models
- Added 9 concept pages: [[agent-evaluation-paradigm-shift]], [[agent-process-evaluation]], [[pass-at-k-vs-pass-k]], [[agent-safety-evaluation]], [[agent-robustness-evaluation]], [[agent-capability-stability-gap]], [[question-quality-vs-quantity]], [[agent-multidimensional-capability]], [[agent-completion-evaluation]]
- Source: https://mp.weixin.qq.com/s/4oY35c9SmweJ4Vi0KztVOA (ModelScope official account)
- Wiki size: 396 → 406 pages
## [2026-05-23] ingest | Agent Harness Engineering: A Survey (TMLR 2026, under review)
- Added paper [[agent-harness-engineering-survey]]: "Agent Harness Engineering: A Survey" - proposes the seven-layer ETCLOVG taxonomy, a mapping of 170+ open-source projects, a cross-layer synthesis, and five major open problems
- Added 21 concept pages: [[agent-harness-engineering]], [[etclovg-taxonomy]], [[execution-environment]], [[tool-interface]], [[context-management]], [[lifecycle-orchestration]], [[observability]], [[verification-evaluation]], [[governance-security]], [[cost-quality-speed-trilemma]], [[capability-control-tradeoff]], [[harness-coupling-problem]], [[binding-constraint-thesis]], [[prompt-to-harness-evolution]], [[trace-native-evaluation]], [[standard-agent-handoffs]], [[adaptive-harness-simplification]], [[hardening-execution-environments]], [[reliable-state-long-running-agents]], [[context-state-estimation]], [[agent-frameworks-to-platforms]]
- Source: user-uploaded PDF (user o9cq80wQvcn_qxHaHlEso2Bn3qoU@im.wechat)
- Wiki size: 373 → 395 pages
## [2026-05-21] ingest | KORE (arXiv:2510.19316, ICML 2026)
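- Added paper [[kore-knowledge-injection]]: "KORE: Enhancing Knowledge Injection via Knowledge-Oriented Controls" - a synergistic method built on knowledge-oriented controls; null-space projection plus a knowledge tree delivers strong results on both adaptation and retention
- Added 6 concept pages: [[kore-augmentation]], [[kore-constraint]], [[knowledge-tree]], [[null-space-projection-knowledge]], [[covariance-matrix-knowledge]], [[hars]]
- KORE is the solution paper in the MMEVOKE line of work, from the same author team
- Source: https://arxiv.org/abs/2510.19316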
## [2026-05-21] ingest | When Large Multimodal Models Confront Evolving Knowledge (arXiv:2505.24449, ICLR 2026)
- Added paper [[when-large-multimodal-models-confront-evolving-knowledge]]: "When Large Multimodal Models Confront Evolving Knowledge" - MMEVOKE, the first benchmark for multimodal evolving knowledge injection; reveals a dual challenge and explores knowledge augmentation and retention approaches
- Added 12 concept pages: [[evolving-knowledge-injection]], [[mme-voke]], [[knowledge-aware-augmentation]], [[knowledge-agnostic-augmentation]], [[capability-degradation]], [[data-replay]], [[moe-lora]], [[multimodal-rag]], [[knowledge-retention]], [[knowledge-adaptation]], [[self-evolving-benchmark]], [[sufficient-context-paradox]]
- Source: https://arxiv.org/abs/2505.24449
## 2026-05-15 — ingest | Continuous Thought Machines (arXiv:2505.05522, NeurIPS 2025)
- Added paper [[darlow-ctm-2025]]: "Continuous Thought Machines" - a new architecture that uses neural synchronization as its representation; two key innovations: NLMs + Neural Synchronization
- Added 11 concept pages: [[continuous-thought-machine]], [[neuron-level-models]], [[neural-synchronization]], [[internal-ticks]], [[synapse-model]], [[certainty-based-loss]], [[adaptive-computation-time]], [[internal-world-model]], [[neuron-pairing]], [[temporal-decay-neural]], [[pre-activation-history]]
@@ -41,6 +156,12 @@
## [2026-05-18] ingest | Pre-train Space Reinforcement Learning (arXiv:2604.14142, 2026)
- Added paper [[pre-train-space-reinforcement-learning]]: "Pre-train Space RL" - applies RL in the pre-train space; NSR-PreRL prunes incorrect paths and elicits endogenous reasoning; DSRL outperforms GRPO across the board
- Added 11 concept pages: [[pre-train-space-reinforcement-learning|PreRL]], [[dual-space-rl|DSRL]], [[post-train-space-rl]], [[negative-sample-reinforcement|NSR]], [[positive-sample-reinforcement|PSR]], [[gradient-alignment]], [[policy-reincarnation]], [[endogenous-reasoning]], [[shared-parameter-influence]], [[distribution-shift]], [[on-policy-learning-collapse]]
- Source: https://arxiv.org/abs/2604.14142
## [2026-05-14] ingest | StreamingLLM: Efficient Streaming Language Models with Attention Sinks (arXiv:2309.17453, ICLR 2024)
- Added paper [[streaming-llm]]: "Efficient Streaming Language Models with Attention Sinks" - identifies the attention-sink phenomenon and proposes a fine-tuning-free framework for streaming inference over unbounded-length input
- Added 5 concept pages: [[length-extrapolation]], [[rolling-kv-cache]], [[sink-token]], [[softmax-off-by-one]], [[window-attention]]