20260422: Update
77
raw/papers/lu-hongyi-clawless-ai-agent-security-2026.md
Normal file
@@ -0,0 +1,77 @@
# ClawLess: A Security Model of AI Agents

**arXiv ID**: 2604.06284v1
**Authors**: Hongyi Lu, Nian Liu, Shuai Wang, Fengwei Zhang
**Date**: 7 Apr 2026
**Category**: cs.CR (Cryptography and Security)
**Institutions**: Southern University of Science and Technology, Hong Kong University of Science and Technology

## Abstract

Autonomous AI agents powered by Large Language Models can reason, plan, and execute complex tasks, but their ability to autonomously retrieve information and run code introduces significant security risks. Existing approaches attempt to regulate agent behavior through training or prompting, which does not offer fundamental security guarantees. We present ClawLess, a security framework that enforces formally verified policies on AI agents under a worst-case threat model where the agent itself may be adversarial. ClawLess formalizes a fine-grained security model over system entities, trust scopes, and permissions to express dynamic policies that adapt to agents' runtime behavior. These policies are translated into concrete security rules and enforced through a user-space kernel augmented with BPF-based syscall interception. This approach bridges the formal security model with practical enforcement, ensuring security regardless of the agent's internal design.

## Key Concepts

### 1. ClawLess Framework
A comprehensive security framework for autonomous AI agents that combines formal verification with practical enforcement mechanisms.

### 2. Formal Security Model
A fine-grained model capturing entities, scopes, and permissions across multiple system domains, enabling precise security policy specification.

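To make the entity/scope/permission vocabulary concrete, here is a minimal Python sketch of a default-deny policy lookup. It is not the paper's formalism: the `Permission`, `Entity`, and `Policy` names, the scope labels, and the file paths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical permission bits; the paper's actual permission set may differ."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    EXECUTE = auto()


@dataclass(frozen=True)
class Entity:
    """A system entity (agent, file, socket, ...) tagged with a trust scope."""
    name: str
    scope: str


class Policy:
    """Maps (subject scope, object scope) pairs to granted permissions."""

    def __init__(self) -> None:
        self._grants: dict[tuple[str, str], Permission] = {}

    def grant(self, subject_scope: str, object_scope: str, perms: Permission) -> None:
        self._grants[(subject_scope, object_scope)] = perms

    def allows(self, subject: Entity, obj: Entity, requested: Permission) -> bool:
        # Default deny: a scope pair with no grant gets Permission.NONE.
        granted = self._grants.get((subject.scope, obj.scope), Permission.NONE)
        return (requested & granted) == requested


policy = Policy()
policy.grant("agent", "workspace", Permission.READ | Permission.WRITE)
policy.grant("agent", "host", Permission.READ)

agent = Entity("coding-agent", scope="agent")
notes = Entity("/workspace/notes.md", scope="workspace")
passwd = Entity("/etc/passwd", scope="host")

print(policy.allows(agent, notes, Permission.WRITE))
print(policy.allows(agent, passwd, Permission.WRITE))
```

The sketch is static; the dynamic policies described in the abstract would additionally update grants or scopes as the agent's runtime behavior changes.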
### 3. User-space Kernel
A trusted layer between potentially malicious AI agents and the vulnerable host kernel, providing isolation while maintaining usability.

### 4. BPF-based Syscall Interception
Uses Berkeley Packet Filter (BPF) programs to intercept system calls made by AI agents and enforce security policies on them.

### 5. Worst-case Threat Model
Assumes that AI agents are capable of sophisticated attacks and will eventually be lured into malicious behavior, so security must not rely on agent cooperation.

### 6. Secure Containers Comparison
Analysis of different container technologies (Docker, user-space kernels, virtualization, confidential containers) in terms of compatibility, interoperability, deployability, and security.

## Core Contributions

1. **First comprehensive security analysis** for autonomous AI agents with two fundamental assumptions about AI agent security.
2. **Formalized fine-grained security model** that prevents agents from abusing capabilities while maintaining usability.
3. **ClawLess implementation** - an isolation framework that enforces formally verified security policies on AI agents.

## Methodology

### Security Challenges Addressed
- **Ambiguous Trust Boundary**: AI agents retrieve data from diverse sources, blurring trusted/untrusted boundaries.
- **Privilege/Usability Trade-off**: Balancing agent capabilities with security risks.
- **Security for Autonomous Software**: Traditional mechanisms are inadequate for non-deterministic LLM behavior.

### Technical Approach
1. **Formal Policy Specification**: Define security policies using formal methods.
2. **Policy Compilation**: Translate high-level policies into concrete system call rules.
3. **Runtime Enforcement**: Use a user-space kernel with BPF interception to enforce policies.
4. **Isolation Architecture**: Deploy agents in secure containers with user-space kernel protection.

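The policy compilation and runtime enforcement steps above can be sketched as a lookup from high-level permissions to an allow-list of system calls. This is a plain-Python illustration only, not ClawLess code and not a real BPF filter. The permission names and the `SYSCALLS_BY_PERMISSION` table are assumptions made for the example.

```python
# Hypothetical mapping from high-level permissions to Linux syscall names.
SYSCALLS_BY_PERMISSION = {
    "fs_read": {"openat", "read", "close"},
    "fs_write": {"openat", "write", "close"},
    "net_connect": {"socket", "connect"},
}


def compile_policy(permissions):
    """Translate granted permissions into a concrete syscall allow-list."""
    allowed = set()
    for permission in permissions:
        allowed |= SYSCALLS_BY_PERMISSION[permission]
    return frozenset(allowed)


def decide(allowed_syscalls, syscall_name):
    """Default-deny decision that an interception layer would apply per syscall."""
    return "ALLOW" if syscall_name in allowed_syscalls else "DENY"


rules = compile_policy(["fs_read"])
print(decide(rules, "read"))
print(decide(rules, "connect"))
```

A real filter would also need to inspect syscall arguments (for example the flags passed to `openat`), since a name-only allow-list cannot distinguish opening a file for reading from opening it for writing.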
## Key Findings

1. **Docker Vulnerabilities**: Standard Docker has had 37 CVEs over the past ten years, including 5 high-severity vulnerabilities (CVSS > 9.0).
2. **User-space Kernel Security**: Only one CVE in the past ten years, providing better security while maintaining usability.
3. **Formal Verification Gap**: Existing approaches lack formal verification for dynamic, tool-using AI agent behavior.

## Related Work

- **ACE**: Security architecture for LLM-integrated app systems
- **IsolateGPT**: Execution isolation architecture for LLM-based agentic systems
- **NeuroFilter**: Privacy guardrails for conversational LLM agents
- **ExpGuard**: LLM content moderation in specialized domains

## Implications

ClawLess provides a principled foundation for securing increasingly capable autonomous AI agents, moving beyond training/prompting-based approaches to formal verification and runtime enforcement.

## References

1. Hongyi Lu, Nian Liu, Shuai Wang, Fengwei Zhang. "ClawLess: A Security Model of AI Agents". arXiv:2604.06284v1 [cs.CR], 2026.
2. Related papers cited in the references section.

---
*Created: 2026-04-22*
*Source: arXiv:2604.06284v1*
*Integration: Wiki knowledge base*
92
raw/papers/nikolopoulos-spurious-predictability-2026.md
Normal file
@@ -0,0 +1,92 @@
# Spurious Predictability in Financial Machine Learning

**Authors:** Sotirios D. Nikolopoulos
**arXiv ID:** 2604.15531v1
**Published:** 2026-04-16
**Categories:** q-fin.ST, stat.ME, stat.ML
**Comments:** 49 pages, 10 figures. The QuantAudit R package and full replication scripts will be made publicly available upon journal publication
**Subjects:** Statistical Finance (q-fin.ST); Methodology (stat.ME); Machine Learning (stat.ML)
**MSC classes:** 91G70, 62P20, 62M20, 68T05
**DOI:** https://doi.org/10.48550/arXiv.2604.15531

## Abstract

Adaptive specification search generates statistically significant backtests even under martingale-difference nulls. We introduce a falsification audit testing complete predictive workflows against synthetic reference classes, including zero-predictability environments and microstructure placebos. Workflows generating significant walk-forward evidence in these environments are falsified. For passing workflows, we quantify selection-induced performance inflation using an absolute magnitude gap linking optimized in-sample evidence to disjoint walk-forward realizations, adjusted for effective multiplicity. Simulations validate extreme-value scaling under correlated searches and demonstrate detection power under genuine structure. Empirical case studies confirm that many apparent findings represent methodological artifacts rather than genuine predictability.

## Key Concepts

### 1. Spurious Predictability
The phenomenon where adaptive specification search (data mining, model selection, hyperparameter tuning) can generate statistically significant backtest results even when the underlying data-generating process has no genuine predictive structure (martingale-difference nulls).

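The effect can be reproduced with a small simulation. The following is a minimal sketch, not the paper's experiment: returns are i.i.d. zero-mean Gaussian noise, each "strategy" is a random long/short signal that cannot predict anything, and the function names, sample sizes, and seed are arbitrary choices made here.

```python
import math
import random
import statistics


def t_stat(returns):
    """t-statistic of the mean return against zero."""
    n = len(returns)
    return statistics.mean(returns) / (statistics.stdev(returns) / math.sqrt(n))


def search_t_stats(n_strategies, n_obs, seed=0):
    """Backtest many random long/short signals on pure-noise returns."""
    rng = random.Random(seed)
    market = [rng.gauss(0.0, 0.01) for _ in range(n_obs)]  # no predictable structure
    stats = []
    for _ in range(n_strategies):
        signal = [rng.choice((-1, 1)) for _ in range(n_obs)]
        stats.append(t_stat([s * r for s, r in zip(signal, market)]))
    return stats


stats = search_t_stats(n_strategies=200, n_obs=250)
print(f"first strategy t = {stats[0]:.2f}, best of 200 t = {max(stats):.2f}")
```

Reporting only the best of the 200 backtests is what turns noise into an apparently significant result: the maximum of many null t-statistics is compared against a threshold designed for a single test.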
### 2. Falsification Audit
A methodological framework for testing complete predictive workflows against synthetic reference classes:
- **Zero-predictability environments**: Simulated data with no genuine predictive structure
- **Microstructure placebos**: Realistic but non-predictive market microstructure features

### 3. Selection-Induced Performance Inflation
The bias introduced by model selection and optimization, quantified as the gap between:
- Optimized in-sample performance
- Out-of-sample (walk-forward) performance on disjoint data

### 4. Effective Multiplicity
Adjustment for the multiple comparisons problem in adaptive specification search, accounting for correlated search paths and dependencies between model specifications.

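As a rough illustration of why the number of effective trials matters, the sketch below uses two textbook results rather than the paper's own adjustment, which is not reproduced in this summary. The expected maximum of M standard-normal test statistics is at most sqrt(2 ln M), and the Šidák correction gives the per-test level that keeps the family-wise error at alpha across M independent tests. Treating `m_eff` as a number smaller than the raw count of specifications tried is an assumption made here to stand in for correlated searches.

```python
import math


def expected_max_bound(m_eff):
    """Upper bound sqrt(2 ln M) on the expected maximum of M standard-normal statistics."""
    return math.sqrt(2.0 * math.log(m_eff))


def sidak_alpha(alpha, m_eff):
    """Per-test significance level under the Sidak correction for m_eff independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


for m in (1, 10, 100, 1000):
    print(m, round(expected_max_bound(m), 2), round(sidak_alpha(0.05, m), 5))
```

The bound grows only with the square root of the logarithm of the number of trials, so even heavily correlated searches (small `m_eff`) can push the best observed statistic well past conventional single-test thresholds.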
## Methodology

### Falsification Framework
1. **Reference class construction**: Create synthetic environments with known properties
2. **Workflow testing**: Apply the complete predictive workflow to reference classes
3. **Falsification criteria**: Reject workflows that show significant predictive power in zero-predictability environments

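The three steps above can be sketched as a small audit harness. This is a hedged illustration, not the QuantAudit implementation: the function names, the normal-approximation p-value, the `tolerance` factor, and the rule "falsified if the rejection rate exceeds `tolerance * alpha`" are all assumptions introduced here.

```python
import math
import random


def null_dataset(seed, n_obs=250):
    """Step 1: a zero-predictability environment (i.i.d. zero-mean returns)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 0.01) for _ in range(n_obs)]


def p_value(returns):
    """Two-sided p-value for a zero mean, using a normal approximation."""
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    z = mean / math.sqrt(var / n)
    return math.erfc(abs(z) / math.sqrt(2.0))


def single_rule_workflow(returns):
    """A workflow that tests one pre-specified rule (always long)."""
    return p_value(returns)


def searching_workflow(returns, n_candidates=50):
    """A workflow that tries many random signals and reports only the best p-value."""
    rng = random.Random(12345)
    best = 1.0
    for _ in range(n_candidates):
        signal = [rng.choice((-1, 1)) for _ in returns]
        best = min(best, p_value([s * r for s, r in zip(signal, returns)]))
    return best


def falsification_audit(workflow, make_null, n_trials=100, alpha=0.05, tolerance=2.0):
    """Steps 2 and 3: run the whole workflow on null data and count 'discoveries'."""
    rejections = sum(workflow(make_null(trial)) < alpha for trial in range(n_trials))
    rate = rejections / n_trials
    return {"rejection_rate": rate, "falsified": rate > tolerance * alpha}


print("single rule:", falsification_audit(single_rule_workflow, null_dataset))
print("search:     ", falsification_audit(searching_workflow, null_dataset))
```

The key design point is that the audit treats the workflow as a black box, so any selection step hidden inside it is exercised on the null data as well.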
### Performance Gap Quantification
For workflows that pass falsification tests:
1. **In-sample optimization**: Measure performance on training data
2. **Walk-forward validation**: Test on disjoint out-of-sample periods
3. **Gap calculation**: Compute absolute magnitude difference adjusted for effective multiplicity

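A minimal sketch of the gap calculation follows, with several simplifications that are assumptions of this example. Mean return is used as the performance measure, a single train/test split stands in for a full walk-forward scheme, and the effective-multiplicity adjustment is omitted because its exact form is not given in this summary. The strategy names and return series are made up.

```python
def mean(values):
    return sum(values) / len(values)


def selection_gap(candidate_returns, split):
    """Pick the best candidate in-sample, then measure it on the disjoint remainder."""
    in_sample = {name: mean(r[:split]) for name, r in candidate_returns.items()}
    best = max(in_sample, key=in_sample.get)
    walk_forward = mean(candidate_returns[best][split:])
    return best, in_sample[best], walk_forward, in_sample[best] - walk_forward


# Hypothetical per-period returns for two candidate strategies.
candidates = {
    "momentum": [0.02, 0.03, 0.01, -0.01, 0.00, 0.01],
    "reversal": [0.00, 0.01, -0.01, 0.01, 0.02, 0.00],
}

best, is_perf, wf_perf, gap = selection_gap(candidates, split=3)
print(best, round(is_perf, 4), round(wf_perf, 4), round(gap, 4))
```

Here the selected strategy looks strong on the first three periods and earns nothing on the last three, so the whole in-sample performance shows up as gap.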
## Empirical Findings

### Case Studies
The paper presents empirical case studies demonstrating that many apparent findings in financial machine learning represent methodological artifacts rather than genuine predictability.

### Implications
1. **Methodological rigor**: Need for robust validation frameworks
2. **Publication bias**: Tendency to publish positive results without proper falsification
3. **Replication crisis**: Challenges similar to those in other empirical sciences

## Technical Contributions

### 1. QuantAudit R Package
The authors will release an R package implementing the falsification audit framework.

### 2. Statistical Framework
- Extreme-value theory for correlated searches
- Effective multiplicity adjustments
- Walk-forward validation protocols

### 3. Simulation Studies
Validation of the framework's detection power under various data-generating processes.

## Related Concepts

- [[cramer-rao-lower-bound]] - Theoretical bounds on parameter estimation
- [[computerized-adaptive-testing]] - Adaptive testing methodologies
- [[symbolic-regression]] - Machine learning for discovering mathematical expressions
- [[formal-verification]] - Formal methods for validation

## References

- arXiv: https://arxiv.org/abs/2604.15531
- PDF: https://arxiv.org/pdf/2604.15531
- HTML: https://arxiv.org/html/2604.15531v1

## BibTeX

```bibtex
@article{nikolopoulos2026spurious,
  title={Spurious Predictability in Financial Machine Learning},
  author={Nikolopoulos, Sotirios D.},
  journal={arXiv preprint arXiv:2604.15531},
  year={2026}
}
```
89
raw/papers/zhang-hyperagents-2026.md
Normal file
@@ -0,0 +1,89 @@
# Hyperagents (arXiv:2603.19461)

## Metadata

- **Title**: Hyperagents
- **Authors**: Jenny Zhang, Bingchen Zhao, Wannan Yang, Jakob Foerster, Jeff Clune, Minqi Jiang, Sam Devlin, Tatiana Shavrina
- **arXiv ID**: 2603.19461
- **Submission Date**: 19 Mar 2026
- **Subjects**: Artificial Intelligence (cs.AI)
- **DOI**: https://doi.org/10.48550/arXiv.2603.19461
- **Code**: https://github.com/facebookresearch/Hyperagents
- **License**: Creative Commons Attribution 4.0 International

## Abstract

Self-improving AI systems aim to reduce reliance on human engineering by learning to improve their own learning and problem-solving processes. Existing approaches to self-improvement rely on fixed, handcrafted meta-level mechanisms, fundamentally limiting how fast such systems can improve. The Darwin Gödel Machine (DGM) demonstrates open-ended self-improvement in coding by repeatedly generating and evaluating self-modified variants. Because both evaluation and self-modification are coding tasks, gains in coding ability can translate into gains in self-improvement ability. However, this alignment does not generally hold beyond coding domains.

We introduce **hyperagents**, self-referential agents that integrate a task agent (which solves the target task) and a meta agent (which modifies itself and the task agent) into a single editable program. Crucially, the meta-level modification procedure is itself editable, enabling metacognitive self-modification, improving not only the task-solving behavior, but also the mechanism that generates future improvements.

We instantiate this framework by extending DGM to create DGM-Hyperagents (DGM-H), eliminating the assumption of domain-specific alignment between task performance and self-modification skill to potentially support self-accelerating progress on any computable task. Across diverse domains, the DGM-H improves performance over time and outperforms baselines without self-improvement or open-ended exploration, as well as prior self-improving systems. Furthermore, the DGM-H improves the process by which it generates new agents (e.g., persistent memory, performance tracking), and these meta-level improvements transfer across domains and accumulate across runs.

DGM-Hyperagents offer a glimpse of open-ended AI systems that do not merely search for better solutions, but continually improve their search for how to improve.

## Key Concepts

### 1. Hyperagents
Self-referential agents that integrate task-solving and self-modification capabilities into a single editable program. The meta-level modification procedure is itself editable, enabling metacognitive self-modification.

### 2. Darwin Gödel Machine (DGM)
A framework for open-ended self-improvement in coding domains, where both evaluation and self-modification are coding tasks, creating a natural alignment between task performance and self-improvement ability.

### 3. DGM-Hyperagents (DGM-H)
Extension of DGM that eliminates the domain-specific alignment assumption, potentially supporting self-accelerating progress on any computable task.

### 4. Metacognitive Self-Modification
The ability to not only improve task-solving behavior but also improve the mechanism that generates future improvements.

### 5. Self-Accelerating Progress
The property where improvements in problem-solving ability lead to improvements in self-improvement ability, creating a positive feedback loop.

## Methodology

### Framework Architecture
1. **Integrated Program**: Single editable program containing both task agent and meta agent
2. **Editable Meta-Level**: The modification procedure itself can be modified
3. **Self-Referential Loop**: Improvements in task-solving → improvements in self-modification → further improvements in task-solving

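The architecture above can be loosely illustrated with a toy numeric search in which the "agent" carries both a task parameter and a parameter of its own modification procedure, and both are open to editing. This is only an analogy under strong assumptions. DGM-H modifies agent source code, whereas here the task (`task_score`), the agent representation (a dict with `x` and `step`), and the greedy acceptance rule are all hypothetical stand-ins.

```python
import random


def task_score(agent):
    """Toy task: the closer x is to 3, the better (maximum score is 0)."""
    return -(agent["x"] - 3.0) ** 2


def meta_step(agent, rng):
    """Propose a child by editing the meta-level (step size) and the task-level (x)."""
    child = dict(agent)
    # Edit the modification procedure itself: shrink, keep, or grow the step size.
    child["step"] = agent["step"] * rng.choice((0.5, 1.0, 2.0))
    # Use the edited procedure to modify the task-solving part.
    child["x"] = agent["x"] + rng.uniform(-child["step"], child["step"])
    return child


def self_improve(agent, generations, seed=0):
    """Greedy loop: keep a child only if it scores strictly better on the task."""
    rng = random.Random(seed)
    best = agent
    for _ in range(generations):
        child = meta_step(best, rng)
        if task_score(child) > task_score(best):
            best = child
    return best


start = {"x": 0.0, "step": 0.1}
final = self_improve(start, generations=200)
print(task_score(start), task_score(final), final["step"])
```

Because accepted children inherit their step size, changes to the meta-level parameter persist only when they come with a task improvement, which mirrors the self-referential loop in item 3 at a very small scale.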
### DGM-H Implementation
- Extends the original DGM framework
- Removes the domain-specific alignment requirement
- Supports persistent memory and performance tracking
- Enables meta-level improvements to transfer across domains

## Results

### Performance Improvements
- DGM-H improves performance over time across diverse domains
- Outperforms baselines without self-improvement or open-ended exploration
- Outperforms prior self-improving systems

### Meta-Level Improvements
- Improves the process of generating new agents
- Improvements transfer across domains
- Improvements accumulate across runs

## Significance

### Theoretical Contribution
- Introduces the concept of hyperagents as a general framework for self-improving AI
- Demonstrates metacognitive self-modification as a key capability
- Provides a path toward self-accelerating progress on arbitrary computable tasks

### Practical Implications
- Potential for creating AI systems that continuously improve their own improvement processes
- Reduces reliance on human engineering for meta-level design
- Enables open-ended progress beyond fixed meta-level mechanisms

## Related Work
- Darwin Gödel Machine (DGM)
- Self-improving AI systems
- Meta-learning and meta-reinforcement learning
- Program synthesis and genetic programming

## References
- arXiv:2603.19461 [cs.AI]
- GitHub: https://github.com/facebookresearch/Hyperagents
- DOI: https://doi.org/10.48550/arXiv.2603.19461

## Tags
#hyperagents #self-improving-ai #darwin-godel-machine #metacognitive-self-modification #self-accelerating-progress #ai-research #meta-learning