# LLM Wiki
> Knowledge index page (auto-generated)
> Last updated: 2026-06-17 | Total pages: 914
## Concepts
- [[4d-gaussian-splatting]] — 4D Gaussian Splatting
- [[abductive-reasoning-recommendation]] — Abductive Reasoning in Recommendation
- [[absolute-gating]] — Absolute vs Relative Gating
- [[abstract-representation-space]] — Abstract Representation Space
- [[action-applicability]] — Action Applicability (action legality check)
- [[action-consequence-prediction]] — Action Consequence Prediction
- [[action-decoder]] — Action Decoder
- [[action-head-router]] — Action Head Router
- [[action-realization-layer]] — Action Realization Layer
- [[action-routing-policy]] — Action-Routing Policy
- [[activation-manifold]] — Activation Manifold
- [[activation-steering]] — Activation Steering
- [[active-cache-warmup]] — Active Cache Warm-up
- [[adapter-protocol]] — Adapter Protocol
- [[adaptive-adversary]] — Adaptive Adversary
- [[adaptive-computation-time]] — Adaptive Computation Time (ACT)
- [[adaptive-harness-simplification]] — Adaptive Harness Simplification
- [[additive-combinatorics]] — Additive Combinatorics
- [[agent-capability-stability-gap]] — Agent Capability-Stability Gap
- [[agent-communication-stack]] — Agent Communication Protocol Stack
- [[agent-completion-evaluation]] — Agent Completion Evaluation
- [[agent-computer-interface]] — Agent-Computer Interface (ACI)
- [[agent-eval-case-design]] — Agent Eval Case Design
- [[agent-eval-grader]] — Agent Eval Grader
- [[agent-eval-trace]] — Agent Eval Trace
- [[agent-evaluation-paradigm-shift]] — Paradigm Shift in Agent Evaluation
- [[agent-frameworks-to-platforms]] — Agent Frameworks to Platforms
- [[agent-governance]] — Agent Governance (agent governance and safety)
- [[agent-harness]] — Agent Harness (Claw)
- [[agent-harness-engineering]] — Agent Harness Engineering (agent execution-skeleton engineering)
- [[agent-harness-mini]] — Mini Agent Harness
- [[agent-harness-safety]] — Agent Harness Safety
- [[agent-mediated-deception]] — Agent-Mediated Deception
- [[agent-multidimensional-capability]] — Agent Multidimensional Capability
- [[agent-network-memory-scope]] — Agent Network Memory Scope
- [[agent-network-taxonomy]] — Three-Layer Taxonomy of Agent Networks
- [[agent-network-topology]] — Agent Network Topology
- [[agent-network-update-behavior]] — Agent Network Update Behavior
- [[agent-observability]] — Agent Observability
- [[agent-process-evaluation]] — Agent Process Evaluation
- [[agent-robustness-evaluation]] — Agent Robustness Evaluation
- [[agent-safety-evaluation]] — Agent Safety Evaluation
- [[agent-sandbox]] — Agent Sandbox
- [[agent-symbolic-learning]] — Agent Symbolic Learning
- [[agent-token-budget-optimization]] — Agent Token Budget Optimization
- [[agent-verification]] — Agent Verification (agent verification and evaluation)
- [[agentic-systems]] — Agentic Systems
- [[ai-agent-security]] — AI Agent Security
- [[ai-alignment]] — AI Alignment
- [[ai-mathematics]] — AI and Mathematics
- [[ai-safety]] — AI Safety
- [[aleatoric-uncertainty]] — Aleatoric Uncertainty
- [[algebraic-numbers-countability]] — Countability of the Algebraic Numbers
- [[algorithmic-equity]] — Algorithmic Equity
- [[amortized-variational-inference]] — Amortized Variational Inference
- [[analytical-report-synthesizer]] — Analytical Report Synthesizer
- [[and-or-interactions]] — AND-OR Interactions
- [[anthropic-agent-evals]] — Anthropic Agent Evals
- [[api-key-authentication]] — API Key Authentication
- [[arxiv]] — arXiv
- [[asynchronous-rl-llm]] — Asynchronous Reinforcement Learning and LLM Post-Training
- [[attention-entropy-collapse]] — Attention Entropy Collapse
- [[attention-sinks]] — Attention Sinks
- [[autoharness]] — AutoHarness
- [[automated-theorem-proving]] — Automated Theorem Proving (ATP)
- [[automatic-prompt-optimization]] — Automatic Prompt Optimization (APO)
- [[auxiliary-predictive-objectives]] — Auxiliary Predictive Objectives
- [[backtranslation-round-trip-relay]] — Backtranslation Round-Trip Relay
- [[banach-space]] — Banach Space
- [[bare-adapter]] — Bare Adapter
- [[base-table-embedding]] — Base Table Embedding
- [[bastiani-calculus]] — Bastiani Calculus
- [[bayesian-attention-geometry]] — Bayesian Attention Geometry
- [[bayesian-attention-trilogy]] — Bayesian Attention Trilogy
- [[bayesian-deep-learning]] — Bayesian Deep Learning
- [[bayesian-nonparametric-tpp]] — Bayesian Nonparametric TPP
- [[bayesian-wind-tunnels]] — Bayesian Wind Tunnels
- [[belief-accumulation]] — Belief Accumulation
- [[belief-transport]] — Belief Transport
- [[bellman-taylor-score-decoding]] — Bellman-Taylor Score Decoding (BTSD)
- [[bidirectional-trajectory-evaluation]] — Bidirectional Trajectory Evaluation
- [[binding-constraint-thesis]] — Binding-Constraint Thesis
- [[block-sparse-attention]] — Block-Sparse Attention Mask
- [[boundary-compliance]] — Boundary Compliance
- [[bounded-reuse]] — Bounded Reuse
- [[bpf-syscall-interception]] — BPF System Call Interception
- [[btsd-ppo]] — BTSD-PPO
- [[bypass-network-handle-distribution]] — Bypass Network Handle Distribution
- [[cache-cold-start]] — Cache Cold-Start
- [[cache-health-observability]] — Cache Health Observability
- [[cache-hit-ratio]] — Cache Hit Ratio (CHR)
- [[cache-invalidation]] — Cache Invalidation
- [[cache-safe-forking]] — Cache-Safe Forking
- [[caddy-web-server]] — Caddy Web Server
- [[capability-control-tradeoff]] — Capability-Control Tradeoff
- [[capability-degradation]] — Capability Degradation
- [[catastrophic-forgetting]] — Catastrophic Forgetting
- [[causal-decomposition-pomg]] — Causal Decomposition in POMG
- [[causal-information-flow]] — Causal Information Flow
- [[cel-shading-style]] — Cel-Shading Style
- [[centralized-agent-architecture]] — Centralized Agent Architecture
- [[certainty-based-loss]] — Certainty-Based Loss
- [[certainty-based-rewards]] — Certainty-Based Rewards
- [[chain-of-thought]] — Chain-of-Thought (CoT)
- [[chaitin-algorithmic-information-theory]] — Algorithmic Information Theory (AIT)
- [[chaitin-constant]] — Chaitin's Constant (Ω)
- [[cl-bench-life]] — CL-Bench Life
- [[classifier-free-guidance-language]] — Classifier-Free Guidance for Language
- [[claw-swe-bench-lite]] — Claw-SWE-Bench Lite
- [[clawless]] — ClawLess
- [[clean-conditioning-mask]] — Clean-Conditioning Mask
- [[clinical-ai]] — Clinical AI
- [[coarse-grained-counting]] — Coarse-grained Counting
- [[coarse-to-fine-granularity]] — Coarse-to-Fine Granularity
- [[coconut]] — COCONUT: Continuous Latent-Space Reasoning
- [[code-as-harness]] — Code as Harness
- [[cognitive-architecture]] — Cognitive Architecture
- [[compiled-ai-paradigm]] — Compiled AI Paradigm
- [[completeness-logic]] — Completeness (Logic)
- [[composable-base-model-architecture]] — Composable Base Model Architecture
- [[compressed-sparse-attention]] — Compressed Sparse Attention (CSA)
- [[computability-theory]] — 可计算性理论 (Computability Theory)
- [[computer-use-agents]] — Computer Use Agents (CUAs)
- [[computerized-adaptive-testing]] — Computerized Adaptive Testing (CAT)
- [[concept-lattice]] — Concept Lattice
- [[concept-learning]] — Concept Learning: Geometric View
- [[conditional-intensity-function]] — Conditional Intensity Function
- [[conditional-model-dispatcher]] — Conditional Model Dispatcher
- [[confidence-correctness-alignment]] — Confidence-Correctness Alignment
- [[consistency-logic]] — Consistency (Logic)
- [[content-grounded-retrieval]] — Content-Grounded Retrieval — Faithfulness as First Principle
- [[content-question-answering]] — Content Question Answering (CQA)
- [[context-blue-clique]] — Context Blue Clique
- [[context-compression]] — Context Compression
- [[context-drift]] — Context Drift
- [[context-engineering]] — Context Engineering
- [[context-learning]] — Context Learning
- [[context-management]] — Context Management
- [[context-misuse]] — Context Misuse
- [[context-pruning]] — Context Pruning
- [[context-state-estimation]] — Context as State Estimation
- [[continual-learning]] — Continual Learning
- [[continuation-value-function]] — Continuation Value Function
- [[continuous-diffusion-language-models]] — Continuous Diffusion Language Models
- [[continuous-representation]] — Continuous Representation
- [[continuous-thought-machine]] — Continuous Thought Machine (CTM)
- [[continuous-time-rl]] — Continuous-Time RL
- [[continuum-hypothesis]] — Continuum Hypothesis (CH)
- [[control-affine-mdp]] — Control-Affine MDP
- [[controlled-autonomy]] — Controlled Autonomy
- [[controlled-text-generation]] — Controlled Text Generation
- [[cost-aware-benchmarking]] — Cost-Aware Benchmarking
- [[cost-quality-speed-trilemma]] — Cost-Quality-Speed Trilemma
- [[countable-uncountable-infinity]] — Countable and Uncountable Infinity
- [[covariance-matrix]] — Covariance Matrix
- [[covariance-matrix-knowledge]] — Covariance Matrix Knowledge Storage
- [[cramer-rao-lower-bound]] — Cramér-Rao Lower Bound (CRLB)
- [[crawl4ai]] — Crawl4AI
- [[critical-failures]] — Critical Failures
- [[critpt]] — CritPt (Critical Point Benchmark)
- [[cross-model-harness-transfer]] — Cross-Model Harness Transfer
- [[cross-section-synthesis]] — Cross-Section Synthesis — Information Integration Across Document Parts
- [[curvine-distributed-cache]] — Curvine Cloud-Native Distributed Cache
- [[darwin-godel-machine]] — Darwin Gödel Machine
- [[data-augmentation]] — Data Augmentation
- [[data-hierarchical-governance]] — Data Hierarchical Governance (L0-L4 tiered data governance)
- [[data-label-consistency]] — Data-Label Consistency
- [[data-quality-over-scale]] — Data Quality over Scale
- [[data-replay]] — Data Replay
- [[data-slice]] — Data Slice
- [[data-wall]] — Data Wall
- [[ddcadam]] — DDCAdam (Dead-Direction-Calibrated Adam)
- [[dead-direction]] — Dead Direction
- [[decentralized-agent-architecture]] — Decentralized Agent Architecture
- [[deep-and-wide-reasoning]] — Deep-and-Wide Reasoning
- [[deep-gaussian-process]] — Deep Gaussian Process
- [[deep-rl-scaling]] — Scaling Deep RL
- [[deep-thinking-sft]] — Deep-Thinking SFT (deep-thinking SFT data)
- [[deep-variational-implicit-process]] — Deep Variational Implicit Process (DVIP)
- [[deepseek-r1]] — DeepSeek-R1
- [[deepseek-v4-flash]] — DeepSeek-V4-Flash
- [[deepseek-vit]] — DeepSeek-ViT
- [[delegate-52]] — DELEGATE-52
- [[delegated-work]] — Delegated Work
- [[depth-scaling-signal-degradation]] — LLM Depth Scaling and Signal Degradation
- [[deterministic-agent-failures]] — Deterministic Agent Failures (taxonomy of deterministic agent failures)
- [[dgae]] — Difficulty-Balanced Group Advantage Estimation (DGAE)
- [[dgpo]] — Difficulty-Aware Group Policy Optimization (DGPO)
- [[diagonal-ramsey-number]] — Diagonal Ramsey Number
- [[diagonalization-method]] — Diagonalization Method
- [[differentiable-token-budgeting]] — Differentiable Token Budgeting
- [[diffusion-based-tpp]] — Diffusion-based TPP
- [[dime-dynamic-in-database-modeling-engine]] — DIME (Dynamic In-Database Modeling Engine)
- [[discrete-diffusion-language-models]] — Discrete Diffusion Language Models
- [[distractor-context]] — Distractor Context
- [[distributed-cache-routing]] — Distributed Cache Routing
- [[distributed-optimistic-locking]] — Distributed Optimistic Locking
- [[distributed-prompt-caching]] — Distributed Prompt Caching
- [[distribution-shift]] — Distribution Shift
- [[document-degradation]] — Document Degradation
- [[domain-knowledge-reasoning]] — Domain Knowledge Reasoning
- [[domain-specific-evaluation]] — Domain-Specific Evaluation
- [[dominant-shuffle]] — Dominant Shuffle
- [[double-descent]] — Double Descent
- [[dpo]] — DPO (Direct Preference Optimization)
- [[dqw]] — Difficulty-Aware Question-Level Weighting (DQW)
- [[drift-detection]] — Drift Detection
- [[dual-layer-rl]] — Dual-Layer RL
- [[dual-space-rl]] — Dual Space RL (DSRL)
- [[duo-attention]] — DuoAttention
- [[dynamic-in-database-modeling]] — Dynamic In-Database Modeling
- [[dynamic-mode-decomposition]] — Dynamic Mode Decomposition (DMD)
- [[dynamic-model-fusion]] — Dynamic Model Fusion
- [[dynamic-relation-modeling]] — Dynamic Relation Modeling
- [[dynamic-weight-updates]] — Dynamic Weight Updates
- [[eluder-dimension]] — Eluder Dimension
- [[embedded-language-flows]] — Embedded Language Flows (ELF)
- [[eml-operator]] — EML Operator (Exp-Minus-Log)
- [[emmy-noether]] — Emmy Noether
- [[emotional-value-evaluation]] — Emotional Value Evaluation
- [[empirical-discovery-simulation]] — Empirical Discovery & Simulation
- [[endogenous-reasoning]] — Endogenous Reasoning
- [[ensemble-based-rewards]] — Ensemble-Based Rewards
- [[environment-contract-layer]] — Environment Contract Layer
- [[epistemic-uncertainty]] — Epistemic Uncertainty
- [[epoch-based-optimistic-mle]] — Epoch-based Optimistic MLE
- [[etclovg-taxonomy]] — ETCLOVG Seven-Layer Taxonomy
- [[evolution-probe]] — Evolution Probe
- [[evolutionary-algorithms]] — Evolutionary Algorithms
- [[evolving-knowledge-injection]] — Evolving Knowledge Injection
- [[execution-environment]] — Execution Environment (execution environments and sandboxes)
- [[execution-fidelity]] — Execution Fidelity
- [[execution-harness]] — Execution Harness
- [[expected-calibration-error]] — Expected Calibration Error (ECE)
- [[experience-distillation]] — Experience Distillation
- [[experience-representation]] — Experience Representation
- [[exploratory-dynamics]] — Exploratory Dynamics
- [[exponential-decay-reward]] — Exponential Decay Reward
- [[fading-memory]] — Fading Memory
- [[faithfulness-in-ai]] — Faithfulness in AI
- [[feature-absorption]] — Feature Absorption
- [[feature-family]] — Feature Family
- [[feature-splitting]] — Feature Splitting
- [[few-shot-learning]] — Few-Shot Learning
- [[fiber-of-parametrization]] — Fiber of Parametrization
- [[fine-grained-counting]] — Fine-grained Counting
- [[fisher-information-metric]] — Fisher Information Metric
- [[five-axis-positional-encoding]] — Five-Axis Positional Encoding
- [[fixed-mean-gaussian-process]] — Fixed-Mean Gaussian Process
- [[flash-attention]] — FlashAttention
- [[flash-attention-3]] — FlashAttention-3
- [[flex-attention]] — FlexAttention
- [[flow-matching]] — Flow Matching
- [[forecasting-augmentation-taxonomy]] — Forecasting Augmentation Taxonomy
- [[formal-concept-analysis]] — Formal Concept Analysis
- [[formal-security-model]] — Formal Security Model
- [[formal-systems]] — Formal System
- [[formal-verification]] — Formal Verification
- [[forward-authentication]] — Forward Authentication (delegated external authentication)
- [[fourier-filter-dynamics]] — Fourier Filter for Dynamics (Fourier filter decomposition of dynamics)
- [[fp4-quantization-training]] — FP4 Quantization-Aware Training
- [[freetimegs]] — FreeTimeGS
- [[freqmask-freqmix]] — FreqMask / FreqMix
- [[function-space-modeling]] — Function-Space Modeling
- [[functional-input-neural-networks]] — Functional Input Neural Network
- [[furstenberg-correspondence]] — Furstenberg Correspondence Principle
- [[future-commit-cleanup]] — Future-Commit Cleanup
- [[gaussian-process]] — Gaussian Process
- [[gene-bench]] — Gene-Bench
- [[gene-evolution-protocol]] — Gene Evolution Protocol (GEP)
- [[gene-probe]] — Gene Probe
- [[generalization-bounds]] — Generalization Bounds
- [[generation-verification-asymmetry]] — Generation-Verification Asymmetry
- [[generative-general-unification]] — Generative-General-Unification (the three pillars of GenAI)
- [[generative-perplexity]] — Generative Perplexity
- [[generative-recommendation]] — Generative Recommendation
- [[genetic-programming]] — Genetic Programming
- [[geometric-ramsey-theory]] — Geometric Ramsey Theory
- [[georg-cantor]] — Georg Cantor
- [[gflownet-fine-tuning]] — GFlowNet Fine-Tuning
- [[glitch-art-style]] — Glitch Art
- [[global-context-hash-tree]] — Global Context Hash Tree
- [[godel-incompleteness-theorems]] — Gödel's Incompleteness Theorems
- [[godel-numbering]] — Gödel Numbering
- [[goodsteins-theorem]] — Goodstein's Theorem
- [[governance-security]] — Governance & Security
- [[gpt-image2]] — GPT-Image-2
- [[gradient-alignment]] — Gradient Alignment (PreRL)
- [[gram-generative-recursive-reasoning]] — GRAM (Generative Recursive reAsoning Models)
- [[granger-causality-tpp]] — Granger Causality in TPP
- [[gravitino-unified-metadata]] — Gravitino Unified Metadata Management
- [[greedy-context-screening]] — Greedy Context Screening
- [[green-tao-theorem]] — Green-Tao Theorem
- [[group-relative-policy-optimization]] — Group Relative Policy Optimization (GRPO)
- [[grouped-query-attention]] — Grouped-Query Attention (GQA)
- [[grpo]] — Group Relative Policy Optimization (GRPO)
- [[gui-tool-hybrid-action-space]] — GUI-Tool Hybrid Action Space
- [[gumbel-softmax]] — Gumbel-Softmax Reparameterization
- [[halftone-print-style]] — Halftone Print Style
- [[hallucination-mitigation]] — Hallucination Mitigation in LLM Systems
- [[halting-problem]] — Halting Problem
- [[hard-token]] — Hard Token
- [[hardening-execution-environments]] — Hardening Execution Environments
- [[harness-as-action-verifier]] — Harness-as-Action-Verifier
- [[harness-as-policy]] — Harness-as-Policy (Code as Policy)
- [[harness-coupling-problem]] — Harness Coupling Problem
- [[harness-engineering]] — Harness Engineering
- [[harness-evolution]] — Harness Evolution (trajectory-driven harness evolution)
- [[harness-model-interaction]] — Harness × Model Interaction Effects
- [[harnessaudit]] — HarnessAudit
- [[hars]] — HARS (Harmonic Adaptation-Retention Score)
- [[hawkes-process]] — Hawkes Process
- [[heavily-compressed-attention]] — Heavily Compressed Attention (HCA)
- [[held-out-validation-gate]] — Held-Out Validation Gate
- [[heuristic-learning]] — Heuristic Learning
- [[hidden-audit-channel]] — Hidden Audit Channel
- [[hidden-symmetries-neural]] — Hidden Symmetries
- [[hierarchy-preservation]] — Hierarchy Preservation — Structural Knowledge for Literature Ranking
- [[hilberts-program]] — Hilbert's Program
- [[honest-open-subset]] — Honest Open Subset
- [[hrpo]] — HRPO: Hybrid Reasoning Policy Optimization
- [[human-agent-trust]] — Human-Agent Trust
- [[human-centered-ai]] — Human-Centered AI
- [[hybrid-attention-architecture]] — Hybrid Attention Architecture
- [[hybrid-reasoning]] — Hybrid Reasoning
- [[hyperagents]] — Hyperagents
- [[hypergraph-ramsey-number]] — Hypergraph Ramsey Number
- [[hyperplane-arrangements]] — Hyperplane Arrangements
- [[identity-reference-resolution]] — Identity Reference Resolution
- [[image-generation-prompt-design]] — Image Generation Prompt Design
- [[implicit-processes]] — Implicit Processes
- [[in-context-learning]] — In-Context Learning
- [[in-database-analytics]] — In-Database Analytics
- [[inference-primitives]] — Inference Primitives
- [[inference-time-scaling]] — Inference-Time Scaling
- [[infinite-dimensional-manifolds]] — Infinite-Dimensional Manifolds
- [[infinite-width-limit]] — Infinite-Width Limit
- [[infinity-hierarchy]] — Infinity Hierarchy
- [[information-flow-control]] — Information Flow Control
- [[information-geometry]] — Information Geometry
- [[input-superposition]] — Input Superposition
- [[intensity-free-modeling]] — Intensity-free Modeling
- [[interaction-based-explanation]] — Interaction-Based Explanation
- [[interaction-generalizability]] — Interaction Generalizability
- [[interaction-order]] — Interaction Order
- [[interaction-types-sft]] — Three Types of Interactions in SFT (Removed, Preserved, Newly Emerged)
- [[interleaved-gui-tool-trajectory-scaling]] — Interleaved GUI-Tool Trajectory Scaling Pipeline
- [[internal-ticks]] — Internal Ticks
- [[internal-world-model]] — Internal World Model
- [[intervention-multiplier]] — Intervention Multiplier
- [[intrabench]] — IntraBench — Benchmark for Content-Grounded Literature QA
- [[intragent]] — IntrAgent — Structural-Aware Literature Reading Agent
- [[intraview]] — IntraView — Content-Grounded Literature Information Retrieval
- [[intrinsic-rewards-sharpening]] — Intrinsic Rewards Sharpening
- [[itemic-text-alignment]] — Itemic-Text Alignment
- [[itemic-tokens]] — Itemic Token
- [[iterative-code-refinement]] — Iterative Code Refinement
- [[iterative-reading]] — Iterative Reading — Progressive Information Extraction from Literature
- [[ito-calculus]] — Itô Calculus
- [[jagged-frontier]] — Jagged Frontier
- [[jepa]] — JEPA (Joint Embedding Predictive Architecture)
- [[k-pass-training]] — K-Pass Training
- [[kl-order]] — KL Order
- [[klein-blue]] — Klein Blue (IKB)
- [[knowledge-adaptation]] — Knowledge Adaptation
- [[knowledge-agnostic-augmentation]] — Knowledge-Agnostic Augmentation
- [[knowledge-aware-augmentation]] — Knowledge-Aware Augmentation
- [[knowledge-bank]] — Knowledge Bank — Knowledge Management System for the Era of AI-Assisted Development
- [[knowledge-injection]] — Knowledge Injection
- [[knowledge-internalization]] — Knowledge Internalization
- [[knowledge-retention]] — Knowledge Retention
- [[knowledge-tree]] — Knowledge Tree
- [[kolmogorov-complexity]] — Kolmogorov Complexity
- [[koopman-autoencoder]] — Koopman Autoencoder (KAE)
- [[koopman-predictor]] — Koopman Predictor
- [[koopman-theory]] — Koopman Theory
- [[kore-augmentation]] — KORE-AUGMENTATION (knowledge-oriented augmentation)
- [[kore-constraint]] — KORE-CONSTRAINT (knowledge-oriented constraint)
- [[kv-cache-bottleneck]] — KV Cache Memory Bottleneck
- [[kvcache-transfer]] — KVCache Transfer and Optimization
- [[language-gradient]] — Language Gradient
- [[language-loss]] — Language Loss
- [[latent-reasoning]] — Latent Reasoning
- [[latent-score-mdp]] — Latent-Score MDP
- [[latent-variable-generative-model]] — Latent-Variable Generative Model
- [[length-extrapolation]] — Length Extrapolation
- [[leopold-kronecker]] — Leopold Kronecker
- [[leworldmodel]] — LeWorldModel
- [[lifecycle-aware-harness]] — Lifecycle-Aware Harness
- [[lifecycle-orchestration]] — Lifecycle & Orchestration
- [[linear-attention-methods]] — Linear Attention Methods
- [[linear-quadratic-regulator]] — Linear Quadratic Regulator
- [[linear-representation-hypothesis]] — Linear Representation Hypothesis
- [[linearized-neural-network]] — Linearized Neural Network
- [[llama-factory]] — LLaMA-Factory
- [[llm-applications]] — LLM Applications
- [[llm-based-temporal-point-process]] — LLM-based TPP
- [[llm-evaluation-benchmarks]] — LLM Evaluation Benchmark System
- [[logfire]] — Logfire
- [[logical-model-interaction]] — Logical Model of Interactions
- [[long-context-understanding]] — Long-Context Understanding
- [[long-horizon-evaluation]] — Long-Horizon Evaluation
- [[lora]] — LoRA (Low-Rank Adaptation)
- [[lost-in-the-middle]] — Lost in the Middle
- [[lovasz-local-lemma]] — Lovász Local Lemma
- [[lucas-penrose-argument]] — Lucas-Penrose Argument
- [[macro-level-token-economics]] — Macro-Level Token Economics
- [[mamba-ssm]] — Mamba (State Space Model)
- [[manifold-constrained-hyper-connections]] — Manifold-Constrained Hyper-Connections (mHC)
- [[marked-temporal-point-process]] — Marked TPP
- [[martingale-clt]] — Martingale CLT
- [[math-question-reformulation]] — Multi-Dimensional Reformulation of Math Questions
- [[mathchatsync-reasoning]] — MathChatSync Reasoning
- [[mathematical-pluralism]] — Mathematical Pluralism
- [[mathematical-priority-disputes]] — Mathematical Priority Disputes
- [[mathforge]] — MathForge Framework
- [[maze-navigation]] — Maze Navigation
- [[mc-dropout]] — MC Dropout (Monte Carlo Dropout)
- [[mechanistic-interpretability]] — Mechanistic Interpretability
- [[memory-caching-rnn]] — Memory Caching (MC)
- [[meso-level-token-economics]] — Meso-Level Token Economics
- [[messy-context-reasoning]] — Messy Context Reasoning
- [[meta-jctrader]] — Meta-JCTrader
- [[meta-learning]] — Meta-Learning
- [[metacognitive-self-modification]] — Metacognitive Self-Modification
- [[metamathematics]] — Metamathematics
- [[micro-level-token-economics]] — Micro-Level Token Economics
- [[million-token-context]] — Million-Token Context
- [[mineru]] — minerU — PDF-to-Markdown for Scientific Literature
- [[minimax-optimality]] — Minimax Optimality
- [[mixture-of-attention-schemes]] — Mixture of Attention Schemes (MoAS)
- [[mixture-of-depths-attention]] — Mixture-of-Depths Attention (MoDA)
- [[mixture-of-experts]] — Mixture of Experts (MoE)
- [[mme-voke]] — MMEVOKE
- [[model-collapse-step]] — Model Collapse Step (MCS)
- [[model-free-rl]] — Model-Free RL
- [[model-harness-relationship]] — Model-Harness Relationship
- [[model-steering]] — Model Steering
- [[moe-lora]] — MoELoRA
- [[moe-lora-toolchain-conflict]] — MOE + LoRA Toolchain Conflict
- [[monocular-video-to-4d]] — Monocular Video to 4D
- [[mqr]] — Multi-Aspect Question Reformulation (MQR)
- [[mrq-algorithm]] — MR.Q Algorithm
- [[multi-agent-orchestration]] — Multi-Agent Orchestration
- [[multi-agent-safety]] — Multi-Agent Safety
- [[multi-dimensional-synthetic-data]] — Multi-Dimensional Synthetic Data
- [[multi-head-attention]] — Multi-Head Attention (MHA)
- [[multi-head-latent-attention]] — Multi-head Latent Attention (MLA)
- [[multi-hot-cross-entropy]] — Multi-hot Cross-Entropy (MCE)
- [[multi-query-attention]] — Multi-Query Attention (MQA)
- [[multi-solution-recovery]] — Multi-Solution Recovery
- [[multi-step-planning]] — Multi-Step Planning
- [[multi-teacher-on-policy-distillation]] — Multi-Teacher On-Policy Distillation (MODPO)
- [[multi-token-prediction]] — Multi-Token Prediction (MTP)
- [[multi-trajectory-inference]] — Multi-Trajectory Inference
- [[multi-turn-reasoning]] — Multi-Turn Reasoning Training
- [[multi-view-captioning]] — Multi-View Captioning
- [[multimodal-large-language-model]] — Multimodal Large Language Model (MLLM)
- [[multimodal-rag]] — Multimodal RAG
- [[multitask-rl]] — Multitask RL
- [[muon-optimizer]] — Muon Optimizer
- [[nachbin-theorem]] — Nachbin's Theorem
- [[native-sparse-attention]] — Native Sparse Attention (NSA)
- [[negative-sample-reinforcement]] — Negative Sample Reinforcement (NSR)
- [[neural-synchronization]] — Neural Synchronization as Representation
- [[neural-tangent-kernel]] — Neural Tangent Kernel
- [[neural-temporal-point-process]] — Neural TPP
- [[neurida]] — NeurIDA
- [[neuroalgebraic-geometry]] — Neuroalgebraic Geometry
- [[neuromanifold]] — Neuromanifold
- [[neuron-level-models]] — Neuron-Level Models (NLMs)
- [[neuron-pairing]] — Neuron Pairing
- [[neuroscience]] — Neuroscience
- [[next-state-grounding]] — Next-State Grounding
- [[non-anticipative-functionals]] — Non-Anticipative Functionals
- [[non-stationary-time-series]] — Non-stationary Time Series
- [[ntk-aware-interpolation]] — NTK-aware Positional Encoding Interpolation
- [[null-space]] — Null Space
- [[null-space-projection-knowledge]] — Null Space Projection for Knowledge Retention
- [[objective-driven-ai]] — Objective-Driven AI
- [[observability]] — Observability & Operations
- [[observable-operator-model]] — Observable Operator Model (OOM)
- [[off-policy-llm-post-training]] — Off-Policy LLM Post-Training
- [[on-policy-distillation]] — On-Policy Distillation (OPD)
- [[on-policy-learning-collapse]] — On-policy Learning Collapse
- [[one-pass-fine-tuning]] — One-Pass Fine-Tuning
- [[onereason-bench]] — OneReason-Bench
- [[onerec]] — OneRec Generative Recommendation Model Family
- [[open-telemetry]] — OpenTelemetry (OTel)
- [[openclaw]] — OpenClaw
- [[optimal-gui-tool-path-selection]] — Optimal GUI-Tool Path Selection
- [[osworld-mcp]] — OSWorld-MCP Benchmark
- [[output-aware-metric]] — Output-Aware Metric (OAM)
- [[pac-bayesian-bounds]] — PAC-Bayesian Bounds
- [[paley-graph]] — Paley Graph
- [[parametrization-map]] — Parametrization Map
- [[pareto-frontier-evaluation]] — Pareto Frontier Evaluation
- [[paris-harrington-theorem]] — Paris-Harrington Theorem
- [[partially-observable-markov-game]] — Partially Observable Markov Game (POMG)
- [[pass-at-k-vs-pass-k]] — Pass@k vs Pass^k (capability ceiling vs reliability floor)
- [[patch-based-evaluation]] — Patch-Based Evaluation (patch-based evaluation contract)
- [[path-tracing]] — Path Tracing
- [[pdf-processing]] — PDF Processing
- [[peano-arithmetic]] — Peano Arithmetic (PA)
- [[perception-cognition-recommendation]] — Perception-Cognition Recommendation Hierarchy (R0-R3)
- [[perception-gap]] — Perception Gap
- [[pldm]] — PLDM (Pretrained Latent Dynamics Model)
- [[poisson-process]] — Poisson Process
- [[policy-constrained-execution]] — Policy-Constrained Execution
- [[policy-regret]] — Policy Regret
- [[policy-reincarnation]] — Policy Reincarnation
- [[polysemanticity]] — Polysemanticity & Monosemanticity
- [[pomdp]] — Partially Observable Markov Decision Process (POMDP)
- [[position-encoding]] — Position Encoding
- [[position-id-discrepancy]] — Position ID Discrepancy
- [[positive-sample-reinforcement]] — Positive Sample Reinforcement (PSR)
- [[post-action-configuration]] — Post-Action Configuration
- [[post-hoc-reasoning-rl]] — Post-Hoc Reasoning RL
- [[post-train-space-rl]] — Post-train Space Reinforcement Learning
- [[posterior-lipschitz-adversary]] — Posterior-Lipschitz Adversary
- [[practitioner-research-gap]] — Practitioner-Research Gap
- [[pre-activation-history]] — Pre-Activation History
- [[pre-hoc-reasoning-rl]] — Pre-Hoc Reasoning RL
- [[pre-train-space-reinforcement-learning]] — Pre-train Space Reinforcement Learning (PreRL)
- [[precision-weighted-fusion]] — Precision-Weighted Fusion
- [[predictive-representation-learning]] — Predictive Representation Learning
- [[preference-log-odds]] — Preference Log-Odds
- [[preference-utility-analysis]] — Preference-Utility Analysis
- [[prefill-as-a-service]] — Prefill-as-a-Service (PrfaaS)
- [[prefill-decode-disaggregation]] — Prefill-Decode Disaggregation Architecture (PD Disaggregation)
- [[prefix-matching]] — Prefix Matching
- [[preserved-interactions-backbone]] — Preserved Interactions as Inference Backbone
- [[primitive-completeness]] — Primitive Completeness
- [[primitive-recursive-functions]] — Primitive Recursive Functions
- [[probabilistic-method]] — Probabilistic Method
- [[procedural-skill]] — Procedural Skill
- [[procedural-skill-layer]] — Procedural Skill Layer
- [[procedural-task-execution]] — Procedural Task Execution
- [[program-synthesis]] — Program Synthesis
- [[prompt-caching]] — Prompt Caching
- [[prompt-layering]] — Prompt Layering
- [[prompt-reverse-engineering]] — Prompt Reverse Engineering (recovering prompts from images)
- [[prompt-to-harness-evolution]] — Prompt-to-Harness Evolution (three-stage engineering evolution)
- [[prope]] — PRoPE (Projective Rotary Position Encoding)
- [[pydantic]] — Pydantic
- [[pydantic-ai]] — Pydantic AI
- [[pydantic-core]] — pydantic-core
- [[qlora]] — QLoRA (量化低秩适配)
- [[quadrotor-trajectory-following]] — 四旋翼轨迹跟踪 (Quadrotor Trajectory Following)
- [[query-intent-analyzer]] — Query Intent Analyzer
- [[question-quality-vs-quantity]] — Question Quality vs. Quantity (问题质量 vs 数量)
- [[queueing-network-control]] — 排队网络控制 (Queueing Network Control)
- [[rag]] — RAG (检索增强生成)
- [[rag-systems]] — RAG 系统
- [[ramsey-context-cache]] — Ramsey Context Cache (拉姆齐上下文缓存)
- [[ramsey-context-graph]] — Ramsey Context Graph (拉姆齐上下文图)
- [[ramsey-context-template]] — Ramsey Context Template (拉姆齐上下文模板)
- [[ramsey-numbers]] — Ramsey Numbers (拉姆齐数)
- [[ramsey-theory]] — Ramsey Theory (拉姆齐理论)
- [[ramsey-theory-applications]] — Ramsey Theory Applications (拉姆齐理论应用)
- [[random-access-binding]] — Random-Access Binding (随机访问绑定)
- [[random-graph-theory]] — Random Graph Theory (随机图理论)
- [[real-life-context-learning]] — 真实生活上下文学习 (Real-Life Context Learning)
- [[real-log-canonical-threshold]] — 实对数典范阈值 (Real Log Canonical Threshold, RLCT)
- [[recommendation-cot]] — 推荐思维链 (Recommendation CoT)
- [[recommendation-reasoning]] — 推荐推理 (Recommendation Reasoning)
- [[rectified-flows]] — Rectified Flows
- [[recursive-reasoning-models]] — Recursive Reasoning Models (递归推理模型)
- [[recursive-self-improvement]] — Recursive Self-Improvement (递归自我改进)
- [[reer-reverse-knowledge-extraction]] — REER 逆向知识提炼
- [[reference-gap]] — 引用鸿沟 (Reference Gap)
- [[reinforcement-learning]] — 强化学习 (Reinforcement Learning)
- [[reinforcement-learning-trading]] — Reinforcement Learning Trading (强化学习交易)
- [[rejected-edit-buffer]] — Rejected-Edit Buffer (拒绝编辑缓冲)
- [[rejection-sampling-fine-tuning]] — Rejection Sampling Fine-tuning (RSFT)
- [[relational-graph]] — Relational Graph
- [[reliable-state-long-running-agents]] — Reliable State in Long-Running Agents (长期运行中的可靠状态)
- [[rep-mt-sac]] — RepMT-SAC
- [[reparameterization-exploration]] — 重参数化探索 (Reparameterization Exploration)
- [[replay-buffer-rl-llm]] — Replay Buffer 在 LLM RL 中的应用
- [[representation-alignment]] — Representation Alignment
- [[representation-collapse]] — 表征坍缩 (Representation Collapse)
- [[representation-learning-rl]] — RL中的表征学习 (Representation Learning in RL)
- [[representation-space]] — Representation Space
- [[representation-validity]] — Representation Validity
- [[resource-access-control]] — Resource Access Control
- [[reverse-proxy-authentication]] — 反向代理认证 (Reverse Proxy Authentication)
- [[reward-hacking-llm]] — LLM 奖励黑客 (Reward Hacking in LLMs)
- [[reward-model]] — 奖励模型 (Reward Model, RM)
- [[reward-recency-sampling]] — 奖励-最近度混合采样
- [[richard-dedekind]] — 里夏德·狄德金 (Richard Dedekind)
- [[risograph-print-style]] — Riso 印刷风格 (Risograph Print Style)
- [[rlhf]] — RLHF (Reinforcement Learning from Human Feedback)
- [[rlvr-unified-framework]] — RLVR 统一理论框架
- [[rolling-kv-cache]] — 滚动 KV 缓存 (Rolling KV Cache)
- [[rotary-position-embedding]] — 旋转位置编码 (RoPE)
- [[rough-path-theory]] — 粗糙路径理论 (Rough Path Theory)
- [[round-trip-reconstruction-score]] — Round-Trip Reconstruction Score (RS@k)
- [[rule-system-application]] — 规则系统应用 (Rule System Application)
- [[runtime-harness-adaptation]] — Runtime Harness Adaptation (运行时骨架适配)
- [[runtime-interface-adaptation]] — Runtime Interface Adaptation (运行时接口适配)
- [[russells-paradox]] — 罗素悖论 (Russell's Paradox)
- [[russian-constructivism]] — 俄国构成主义 (Russian Constructivism)
- [[s-token]] — S-Token (Superposed Token)
- [[safety-adherence-rate]] — Safety Adherence Rate
- [[scaling-permutation-symmetry]] — 缩放与置换对称性 (Scaling & Permutation Symmetries)
- [[scientific-literature-qa]] — Scientific Literature QA — Question Answering over Research Papers
- [[sde-sampler-language]] — SDE Sampler for Language Diffusion
- [[se3-relative-camera-encoding]] — SE(3) 相对相机编码
- [[searcher-trainer-decoupling]] — Searcher-Trainer 解耦架构
- [[section-ranking]] — Section Ranking — Structure-Aware Literature Section Prioritization
- [[secure-containers]] — 安全容器
- [[seer-attention]] — SeerAttention
- [[self-conditioning]] — Self-Conditioning
- [[self-evolving-agents]] — Self-Evolving Agents (自进化 Agent)
- [[self-evolving-benchmark]] — 自进化基准 (Self-Evolving Benchmark)
- [[self-improving-ai]] — Self-Improving AI (自我改进人工智能)
- [[self-reference]] — 自指 (Self-Reference)
- [[self-verification-rewards]] — 自我验证奖励 (Self-Verification Rewards)
- [[semantic-equivalence]] — Semantic Equivalence / 语义等价
- [[semi-algebraic-set]] — 半代数集 (Semi-algebraic Set)
- [[sequence-packing]] — Sequence Packing (序列打包)
- [[set-theory-history]] — 集合论史
- [[sft-denoising-stage]] — SFT 去噪阶段 (SFT Denoising Stage)
- [[sft-early-stopping]] — SFT 早停策略 (SFT Early Stopping)
- [[shadow-calling]] — Shadow Calling (影子调用)
- [[shapley-values]] — Shapley 值 (Shapley Values)
- [[shared-parameter-influence]] — Shared Parameter Influence
- [[shared-weight-discretization]] — Shared-Weight Discretization
- [[signature]] — 签名 (Signature of Paths)
- [[sigreg]] — SIGReg (Sketch Isotropic Gaussian Regularization)
- [[singular-learning-theory]] — 奇异学习理论 (Singular Learning Theory)
- [[singularity]] — Singularity (奇点)
- [[sink-token]] — 汇 Token (Sink Token)
- [[skill-as-external-state]] — Skill as External State (Skill 作为外部状态)
- [[skill-data-flywheel]] — Skill Data Flywheel (Skill 数据飞轮)
- [[skill-ecosystem]] — Skill Ecosystem (Skill 生态系统)
- [[skill-probe]] — 技能探针 (Skill Probe)
- [[skillopt]] — SkillOpt
- [[slow-meta-update]] — Slow/Meta Update (慢/元更新)
- [[soft-actor-critic]] — Soft Actor-Critic (SAC)
- [[soft-token]] — Soft Token
- [[softmax-off-by-one]] — SoftMax-off-by-One
- [[sovereign-ai]] — 主权AI (Sovereign AI)
- [[sparse-attention-patterns]] — 稀疏注意力模式 (Sparse Attention Patterns)
- [[sparse-autoencoder]] — 稀疏自编码器 (Sparse Autoencoder)
- [[specialist-training-pipeline]] — Specialist Training Pipeline
- [[specialize-then-unify-rl]] — Specialize-then-Unify RL
- [[specialized-rl]] — 专项强化学习 (Specialized RL)
- [[specialized-sft]] — 专项监督微调 (Specialized SFT)
- [[spectral-mdp-decomposition]] — 谱 MDP 分解 (Spectral MDP Decomposition)
- [[spiking-neural-networks]] — Spiking Neural Networks (SNN)
- [[split-steering]] — SPLIT Steering
- [[spurious-predictability]] — Spurious Predictability
- [[stage-matched-data-config]] — Stage-Matched Data Configuration (分阶段数据配置)
- [[standard-agent-handoffs]] — Standard Agent Handoffs (标准化 Agent 交接)
- [[state-dependent-feasible-action-sets]] — 状态依赖可行动作集 (State-Dependent Feasible Action Sets)
- [[staug]] — STAug (EMD-based Augmentation)
- [[steering-dynamics]] — Steering Dynamics
- [[steering-vector]] — Steering Vector
- [[stem-sparse-attention]] — Stem Sparse Attention
- [[stochastic-differential-equation]] — 随机微分方程 (Stochastic Differential Equation)
- [[stochastic-latent-trajectory]] — Stochastic Latent Trajectory (随机潜在轨迹)
- [[strategy-engineering-unification]] — Strategy-Engineering Unification (策略与工程统一)
- [[strategy-gene]] — 策略基因 (Strategy Gene)
- [[structured-knowledge]] — 结构化知识 (Structured Knowledge)
- [[structured-output]] — 结构化输出 (Structured Output)
- [[stub-pattern]] — Stub Pattern (轻量化桩模式)
- [[subquadratic-transformer-alternatives]] — 次二次 Transformer 替代方案
- [[sufficiency-check]] — Sufficiency Check — Explicit Hallucination Gate in Literature QA
- [[sufficient-context-paradox]] — 充分上下文悖论 (Sufficient Context Paradox)
- [[superposition]] — 叠加 (Superposition)
- [[supervised-fine-tuning]] — 监督微调 (Supervised Fine-Tuning, SFT)
- [[swe-bench]] — SWE-bench
- [[symbolic-backpropagation]] — Symbolic Back-Propagation (符号反向传播)
- [[symbolic-network]] — Symbolic Network (符号网络)
- [[symbolic-regression]] — Symbolic Regression
- [[synapse-model]] — Synapse Model
- [[synthetic-data]] — 合成数据 (Synthetic Data)
- [[synthetic-data-qa-generation]] — Synthetic Data QA Generation (合成数据Q&A生成)
- [[system-2-thinking]] — System 2 思维
- [[system-message-abuse]] — System Message Abuse (系统消息滥用)
- [[system-stability]] — System Stability
- [[szemerédi-regularity-lemma]] — Szemerédi Regularity Lemma
- [[tabular-foundation-models]] — Tabular Foundation Models
- [[tapestry-federated]] — Tapestry 联邦训练
- [[task-conditioned-policy]] — 任务条件策略 (Task-Conditioned Policy)
- [[task-distribution]] — 任务分布 (Task Distribution)
- [[task-invariant-representation]] — 任务不变表征 (Task-Invariant Representation)
- [[taylor-expansion-q-function]] — Q 函数 Taylor 展开 (Taylor Expansion of Q-Function)
- [[tba]] — Trajectory Balance with Asynchrony (TBA)
- [[teacher-forced-history]] — 教师强制历史 (Teacher-Forced History)
- [[temporal-decay-neural]] — Temporal Decay (Neural)
- [[temporal-patch-shuffle]] — Temporal Patch Shuffle (TPS)
- [[temporal-point-process]] — 时间点过程 (Temporal Point Process)
- [[temporal-rollout]] — 时间滚动展开 (Temporal Rollout)
- [[terminal-bench]] — Terminal-Bench
- [[test-time-control]] — 测试时控制 (Test-Time Control)
- [[test-time-scaling]] — Test-Time Scaling
- [[test-time-training-rl]] — 测试时训练 RL (Test-Time Training with RL)
- [[text-space-optimizer]] — Text-Space Optimizer (文本空间优化器)
- [[text-vs-weight-optimization]] — Text vs Weight Optimization (文本 vs 权重优化)
- [[textual-learning-rate]] — Textual Learning Rate (文本学习率)
- [[thinking-supervision-transfer]] — Thinking Supervision Transfer
- [[thompson-sampling-code-search]] — Thompson Sampling Code Search
- [[three-engineering-phases]] — Three Engineering Phases (三阶段工程演进)
- [[three-stage-curriculum-training]] — 三阶段课程训练 (Three-Stage Curriculum Training)
- [[throughput-hypothesis]] — Throughput Hypothesis (吞吐量假说)
- [[time-series-forecasting-augmentation]] — Time Series Forecasting Augmentation
- [[time-variant-dynamics]] — Time-variant Dynamics (时变动力学)
- [[token-as-economic-primitive]] — Token as Economic Primitive
- [[token-duplication]] — Token Duplication (Token 复制)
- [[token-economics]] — Token Economics
- [[token-efficiency]] — Token 效率 (Token Efficiency)
- [[token-market-dynamics]] — Token Market Dynamics
- [[token-position-decay]] — Token Position-Decay (TPD)
- [[token-security-economics]] — Token Security Economics
- [[token-superposition-training]] — Token Superposition Training (TST)
- [[token-wise-routing]] — 逐Token路由 (Token-Wise Routing)
- [[tool-bootstrapped-rft]] — Tool-Bootstrapped GUI RFT
- [[tool-efficient-path-reward]] — Tool-Efficient Path Reward
- [[tool-interface]] — Tool Interface & Protocol Layer (工具接口与协议层)
- [[tool-registry]] — ToolRegistry
- [[tpp-applications]] — TPP 应用场景
- [[tpp-training-methods]] — TPP 训练方法
- [[trace-native-evaluation]] — Trace-Native Evaluation (踪迹原生评估)
- [[trading-lifecycle-driven-eviction]] — Trading-Lifecycle Driven Eviction (交易生命周期驱动淘汰)
- [[trajectory-auditing]] — Trajectory Auditing
- [[trajectory-balance-objective]] — Trajectory Balance (TB) 目标
- [[trajectory-regulation-layer]] — Trajectory Regulation Layer (轨迹调控层)
- [[transfer-learning]] — Transfer Learning (迁移学习)
- [[two-phase-pretraining]] — Two-Phase Pre-Training
- [[two-time-scale-process]] — 双时间尺度过程 (Two Time-Scale Process)
- [[type-safety-in-agents]] — Agent 类型安全 (Type Safety in Agents)
- [[typeadapter]] — TypeAdapter
- [[ultradata]] — UltraData
- [[uncancelled-interaction-effects]] — 未抵消交互效应 (Uncancelled Interaction Effects)
- [[uncertainty-disparity-ratio]] — 不确定性差异比 (Uncertainty Disparity Ratio, UDR)
- [[uncertainty-equity-gap]] — 不确定性公平性差距 (Uncertainty Equity Gap, UEG)
- [[uncertainty-quantification]] — 不确定性量化 (Uncertainty Quantification)
- [[unconditional-generation-latent]] — Unconditional Generation via Latent Reasoning
- [[unified-rft]] — 统一拒绝采样微调 (Unified RFT)
- [[universal-approximation-theorem]] — 通用逼近定理 (Universal Approximation Theorem)
- [[unsupervised-rlvr]] — 无监督可验证奖励强化学习 (URLVR)
- [[update-magnitude-imbalance]] — GRPO 更新幅度不平衡
- [[upstream-downstream-learning]] — 上游-下游学习 (Upstream-Downstream Learning)
- [[userspace-kernel]] — 用户空间内核
- [[validity-decay]] — Validity Decay
- [[van-der-waerden-theorem]] — van der Waerden Theorem
- [[variational-autoencoder]] — 变分自编码器 (Variational Autoencoder, VAE)
- [[variational-linearized-laplace-approximation]] — 变分线性化 Laplace 近似 (VaLLA)
- [[verification-evaluation]] — Verification & Evaluation (验证与评估)
- [[vertical-llm-knowledge-engineering]] — 垂域 LLM 知识工程 (Vertical LLM Knowledge Engineering)
- [[vicreg]] — VICReg (Variance-Invariance-Covariance Regularization)
- [[visibility-constraint]] — Visibility Constraint (可见性约束)
- [[visual-primitives]] — 视觉原语 (Visual Primitives)
- [[vla-vision-language-action]] — VLA (Vision-Language-Action)
- [[watanabe-triple]] — Watanabe 三元组 (Watanabe's Triple)
- [[wavemask-wavemix]] — WaveMask / WaveMix
- [[weak-revealing-condition]] — 弱揭示条件 (Weak Revealing Condition)
- [[weighted-spaces]] — 加权空间 (Weighted Spaces)
- [[width-based-scaling]] — Width-Based Scaling (宽度扩展)
- [[wiener-process]] — 维纳过程 (Wiener Process)
- [[wikilinks]] — Wikilinks
- [[window-attention]] — 窗口注意力 (Window Attention)
- [[world-model-lecun]] — LeCun 世界模型理论
- [[world-models-rl]] — World Models in RL
- [[worst-case-threat-model]] — 最坏情况威胁模型
- [[x-prediction-parameterization]] — x-Prediction Parameterization
- [[zero-cost-proxies]] — Zero-Cost Proxies (ZCP)
- [[zero-data-cold-start]] — 零数据冷启动 (Zero-Data Cold Start)
## Papers
- [[advances-temporal-point-processes-2026]] — Advances in Temporal Point Processes: Bayesian, Neural, and LLM Approaches
- [[agarwal-bayesian-attention-geometry]] — The Bayesian Geometry of Transformer Attention
- [[agent-harness-engineering-survey]] — Agent Harness Engineering: A Survey
- [[bartoldson-tba-2025]] — TBA: 异步轨迹平衡 — 解耦探索与学习以实现快速可扩展的 LLM 后训练
- [[behrouz-memory-caching-rnn]] — Memory Caching: RNNs with Growing Memory
- [[bellman-taylor-score-decoding]] — Bellman-Taylor Score Decoding for MDPs with State-Dependent Feasible Action Sets
- [[chen-token-economics-llm-agents]] — Token Economics for LLM Agents
- [[claw-swe-bench]] — Claw-SWE-Bench: OpenClaw 风格 Agent Harness 的代码任务基准评测
- [[clawless-ai-agent-security]] — ClawLess: AI 代理安全模型
- [[dai-mathforge-2026]] — MathForge: Harder Is Better — 难度感知GRPO与多维度问题改写
- [[darlow-ctm-2025]] — Continuous Thought Machines (CTM)
- [[dead-directions-geometric-singular-learning]] — Dead Directions: 几何奇异学习理论
- [[deepseek-v4-million-token-context]] — DeepSeek-V4: 迈向高效百万 Token 上下文智能
- [[dou-cl-bench]] — CL-bench: 上下文学习基准——首篇定义context learning范式的论文
- [[elf-embedded-language-flows]] — ELF: Embedded Language Flows
- [[flex4dhuman]] — Flex4DHuman: 灵活多视角视频扩散用于 4D 人体重建
- [[geometric-sae-concepts]] — A Geometric View for Understanding Concept Learning and Neuron Interpretation in
- [[godel-incompleteness-tutorial]] — 哥德尔不完备定理教程
- [[goru-one-pass-to-reason-2025]] — One-Pass to Reason: 多轮推理的高效单遍微调
- [[gram-generative-recursive-reasoning-paper]] — Generative Recursive Reasoning (GRAM)
- [[he-urlvr-sharpening-2026]] — How Far Can Unsupervised RLVR Scale LLM Training?
- [[hunyuan-team-cl-bench-life]] — CL-Bench Life: 真实生活上下文学习基准
- [[kore-knowledge-injection]] — KORE: Knowledge-Oriented Controls for Knowledge Injection
- [[laban-llms-corrupt-documents-delegate]] — LLMs Corrupt Your Documents When You Delegate
- [[li-amd-human-perception]] — "Are You Sure?": Human Perception Vulnerability in LLM Agents
- [[liu-auditing-agent-harness-safety]] — Auditing Agent Harness Safety
- [[liu-koopa-2023]] — Koopa: Koopman 预测器驱动的非平稳时间序列学习
- [[llm-attention-survey-2026]] — 大语言模型注意力机制全面分析
- [[lou-autoharness-2026]] — AutoHarness: LLM Agent 的自动代码 Harness 合成
- [[ma-intragent-2026]] — IntrAgent: Content-Grounded Literature Information Retrieval
- [[maes-leworldmodel-2026]] — LeWorldModel: Stable End-to-End JEPA from Pixels
- [[minimax-policy-regret-pomg]] — Minimax-Optimal Policy Regret in Partially Observable Markov Games
- [[nikolopoulos-spurious-predictability]] — Spurious Predictability in Financial Machine Learning
- [[niu-stem-causal-sparse-attention]] — Stem: Rethinking Causal Information Flow in Sparse Attention
- [[odrzywolek-eml-single-operator]] — All elementary functions from a single binary operator
- [[onereason]] — OneReason: 生成式推荐中的推理能力解锁
- [[ortega-phd-thesis]] — Uncertainty Estimation and Generalization Bounds for Modern Deep Learning
- [[peng-tst-2026]] — Token Superposition Training: 高效 LLM 预训练的 Token 叠加方法
- [[pre-train-space-reinforcement-learning]] — Pre-train Space Reinforcement Learning (PreRL/DSRL)
- [[predictive-representations-scalable-mtrl]] — 预测表征驱动可扩展多任务深度强化学习
- [[principled-uncertainty-clinical-ai]] — Principled Uncertainty in Clinical AI: Bayesian Modelling and Equity Auditing
- [[procedural-skills-to-strategy-genes]] — From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Ev
- [[qin-prfaas-cross-datacenter]] — Prefill-as-a-Service: KVCache Goes Cross-Datacenter
- [[ramsey-numbers-survey]] — 拉姆齐数的数学综述
- [[relu-neuromanifolds-semi-algebraicity]] — ReLU 神经流形的纤维与半代数性
- [[repmt-sac]] — Learning to Adapt: Representation-Based RL for Multi-Task Skill Transfer
- [[song-agent-network-taxonomy]] — Complex networks of AI agentic systems: 拓扑-记忆-更新三层分类法
- [[streaming-llm]] — StreamingLLM: 基于注意力汇的高效流式语言模型
- [[tao-klowden-ai-mathematical-methods]] — Mathematical methods and human thought in the age of AI
- [[tarpo]] — TARPO: Token-Wise Latent-Explicit Reasoning via Action-Routing Policy Optimizati
- [[thinking-with-visual-primitives]] — Thinking with Visual Primitives — 以视觉原语思考
- [[ticks-to-flows]] — From Ticks to Flows: Dynamics of Neural RL in Continuous Environments
- [[toolcua-optimal-gui-tool-orchestration]] — ToolCUA: Optimal GUI-Tool Path Orchestration for Computer Use Agents
- [[weighted-uat-manifolds]] — Weighted Universal Approximation of Differentiable Maps on Infinite-Dimensional
- [[when-large-multimodal-models-confront-evolving-knowledge]] — When Large Multimodal Models Confront Evolving Knowledge
- [[xing-trails-2024]] — Trails: Database Native Model Selection (VLDB 2024)
- [[xu-life-harness]] — Adapting the Interface, Not the Model: Runtime Harness Adaptation for Determinis
- [[xu-why-steering-works]] — Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics
- [[yang-skillopt-2026]] — SkillOpt: Agent Skill 的文本空间优化器
- [[zeng-dynamic-model-slicing-2024]] — Powering In-Database Dynamic Model Slicing for Structured Data Analytics (VLDB 2
- [[zeng-neurida-2025]] — NeurIDA: Dynamic Modeling for Effective In-Database Analytics
- [[zhang-hyperagents]] — Hyperagents: Self-Referential Agents with Metacognitive Self-Modification
- [[zhang-reconciling-sft-interaction-2026]] — Reconciling Contradictory Views on the Effectiveness of SFT in LLMs
- [[zhao-neurdb-2025]] — NeurDB: On the Design and Implementation of an AI-powered Autonomous Database (C
- [[zhou-agent-symbolic-learning-2024]] — Agent Symbolic Learning: 用符号学习实现自进化 Agent
- [[zhu-moda-mixture-of-depths]] — Mixture-of-Depths Attention (MoDA)
## Articles
- [[caddy-reverse-proxy-auth]] — Caddy 反向代理认证方案
- [[cantor-stole-infinity]] — 窃取无穷的数学家 — 康托尔与狄德金的隐秘合作
- [[claw-eval]] — Claw-Eval: 面向自主Agent的端到端评测框架
- [[crawl4ai-open-source-web-crawler]] — Crawl4AI: 赋能AI用户的开源智能网页爬虫与数据提取工具
- [[distributed-agent-cache-sync-2026]] — 分布式Agent缓存同步: 从单机到多机的Prompt Caching架构升级
- [[gpt-image2-prompt-collection]] — GPT-Image-2 绘图 Prompt 方法论与风格合集
- [[lecun-llm-boundary-future]] — LeCun 论 LLM 的边界与未来架构
- [[lyu-model-harness-evolution-2026]] — Model与Harness的关系演进: 从AutoHarness到Heuristic Learning
- [[lyu-skillopt-deep-dive-2026]] — SkillOpt深度解读: 自进化Agent技能的'反向传播'与工程化Continued Evolve
- [[mini-agent-harness]] — 从零搭建 Mini Agent Harness
- [[oppo-multimodal-data-lake]] — OPPO 多模态数据湖架构实践
- [[prompt-caching-architecture]] — Prompt Caching 架构工程手册
- [[pydantic-three-piece-suite]] — Pydantic 三件套:从校验库到 AI 基础设施
- [[qifu-llm-finance-practice]] — 金融行业大模型落地实践:从知识工程到后训练部署
- [[ramsey-context-construction]] — 上下文构造与拉姆齐数
- [[temporal-patch-shuffle-tps]] — 时序预测增强方法综述:从频域到 TPS
- [[ultradata-l3-open-source-2026]] — UltraData: 面壁智能L3数据开源与数据分级治理体系
## Special Pages
- [[index]] —
- [[log]] —
- [[README]] —
- [[SCHEMA]] —
## Reviews
- [[advances-temporal-point-processes-review-20260616]] — Review: Advances in Temporal Point Processes
- [[agent-harness-engineering-review-20260523]] — Review: Agent Harness Engineering Survey
- [[agent-network-taxonomy-review-20260501]] — agent-network-taxonomy-review-20260501
- [[auditing-agent-harness-safety-review-20260605]] — Auditing Agent Harness Safety — Review
- [[btsd-review-20260617]] — Bellman-Taylor Score Decoding 论文集成 Review
- [[cantor-stole-infinity-2026-06-07]] — 窃取无穷的数学家 — 康托尔与狄德金的历史真相
- [[cl-bench-life-review-20260501]] — CL-Bench Life 论文集成 Review
- [[cl-bench-review-20260501]] — cl-bench-review-20260501
- [[claw-swe-bench-review-20260615]] — Claw-SWE-Bench 论文集成 Review
- [[clawless-review-20260422]] — ClawLess: AI 代理安全模型 - Review 报告
- [[ctm-review-20260515]] — Continuous Thought Machines 论文集成 Review
- [[dead-directions-20260610]] — Review: Dead Directions — Geometric Singular Learning
- [[delegate52-review-20260514]] — DELEGATE-52 Review
- [[distributed-agent-cache-sync-review]] — Review: 分布式Agent缓存同步
- [[elf-embedded-language-flows-review-20260513]] — Review: ELF — Embedded Language Flows
- [[flex4dhuman-review-20260613]] — Review: Flex4DHuman — 无几何先验的多视角视频扩散
- [[geometric-sae-review-20260617]] — Geometric SAE 论文集成 Review
- [[godel-tutorial-review-20260428]] — 哥德尔不完备定理教程 — Review 报告
- [[hyperagents-review-20260420]] — 📚 Wiki 添加 Review 报告 - Hyperagents 论文
- [[koopa-review-20260511]] — Review: Koopa — Koopman 预测器驱动的非平稳时序学习
- [[kore-review-20260521]] — KORE Review
- [[lecun-llm-20260608]] — Review: LeCun 论 LLM 的边界与未来架构
- [[leworldmodel-20260608]] — Review: LeWorldModel (arXiv:2603.19312)
- [[life-harness-review-20260611]] — Life-Harness — Runtime Harness Adaptation 论文 Review
- [[llm-attention-survey-review-20260429]] — Review: 大语言模型注意力机制全面分析
- [[lou-autoharness-review]] — Review: AutoHarness — 自动合成代码 Harness 改进 LLM Agent
- [[lyu-model-harness-review]] — Review: Model与Harness的关系演进
- [[lyu-skillopt-deep-dive-review]] — Review: SkillOpt深度解读 — 自进化Agent的'反向传播'
- [[ma-intragent-review-20260604]] — IntrAgent — Content-Grounded Literature Retrieval Review
- [[mathforge-review-20260512]] — MathForge Review — 2026-05-12
- [[minimax-policy-regret-pomg-20260610]] — Review: Minimax-Optimal Policy Regret in POMGs
- [[neurida-review-20260515]] — NeurIDA 论文集成 Review
- [[one-pass-to-reason-review-20260602]] — Review: One-Pass to Reason — 多轮推理的高效单遍微调
- [[onereason-review-20260610]] — OneReason Review — 生成式推荐的推理能力解锁
- [[ortega-phd-review-20260617]] — Ortega PhD Thesis 集成 Review
- [[peng-tst-2026-review]] — Review: Token Superposition Training
- [[predictive-representations-mtrl-20260610]] — Review: Predictive Representations for Scalable Multitask Deep RL
- [[pretrain-space-rl-review-20260518]] — Review: Pre-train Space Reinforcement Learning
- [[principled-uncertainty-clinical-ai-20260610]] — Review: Principled Uncertainty in Clinical AI
- [[prompt-caching-architecture-review-20260511]] — Review: Prompt Caching 架构工程手册
- [[pydantic-three-piece-review-20260610]] — Pydantic 三件套 Review — 从校验库到 AI 基础设施
- [[ramsey-context-construction-review-20260511]] — Review: 上下文构造与拉姆齐数
- [[ramsey-numbers-survey-review-20260511]] — Review: 拉姆齐数的数学综述
- [[relu-neuromanifolds-20260610]] — Review: ReLU Neuromanifolds — Fibers and Semi-algebraicity
- [[repmt-sac-review-20260617]] — RepMT-SAC 论文集成 Review
- [[skills-to-genes-review-20260614]] — Skills to Strategy Genes — Review 报告
- [[stem-causal-sparse-attention-review-20260605]] — Stem: Rethinking Causal Information Flow in Sparse Attention — Review
- [[streaming-llm-review-20260514]] — Review: StreamingLLM — 基于注意力汇的无限长流式语言模型
- [[tarpo-review-20260617]] — TARPO 论文集成 Review
- [[tba-review-20260512]] — TBA Review — 2026-05-12
- [[thinking-with-visual-primitives-review-20260430]] — Review — Thinking with Visual Primitives
- [[ticks-to-flows-review-20260617]] — Ticks-to-Flows 论文集成 Review
- [[token-economics-review-20260605]] — Token Economics for LLM Agents — Review
- [[toolcua-review-20260531]] — ToolCUA Review: GUI-Tool路径编排的概念网络分析
- [[ultradata-l3-review]] — Review: UltraData — 大模型数据分级治理的开源实践
- [[weighted-uat-review-20260617]] — Weighted UAT 论文集成 Review
- [[xu-why-steering-works-review-20260601]] — Review: Why Steering Works — 参数动态统一视角
- [[yang-skillopt-review]] — Review: SkillOpt — Agent Skill 的文本空间优化器
- [[zhang-sft-interaction-review-20260603]] — Review: Reconciling Contradictory Views on SFT in LLMs — 交互视角
- [[zhou-agent-symbolic-learning-review]] — Review: Agent Symbolic Learning — 符号学习驱动的自进化Agent