title	source_url	ingested	sha256
A Geometric View for Understanding Concept Learning and Neuron Interpretation in Sparse Autoencoders	https://arxiv.org/abs/2606.07007	2026-06-17	<computed>

A Geometric View for Understanding Concept Learning and Neuron Interpretation in Sparse Autoencoders

Authors: Chenhao Zhang, Chris Lin, Su-In Lee — University of Washington, Paul G. Allen School of CSE

arXiv: 2606.07007v1 [cs.LG] (2026-06-05)

Published: Preprint, June 8, 2026

Abstract

The paper presents a unified mathematical framework for a geometric understanding of concept learning and neuron interpretation in sparse autoencoders (SAEs). It formalizes concepts as sets of data points and casts concept learning as a set-alignment problem between human-defined and model-induced concepts. It distinguishes three increasingly strong notions of learning (detection, separation, and approximation) and yields geometric conditions, error bounds, and capacity constraints. The framework gives a set-theoretic account of SAE phenomena including feature splitting, feature absorption, feature families, and hierarchical concepts. It connects concept learning and neuron interpretation through formal concept analysis, showing that the two directions need not agree and that their many-to-many structure can be organized by concept lattices.
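To make the formal concept analysis step more concrete, here is a minimal Python sketch of the standard textbook construction, not code from the paper. It assumes a toy binary context in which each data point is mapped to the set of neurons that fire on it. The names `context`, `common_attributes`, `common_objects`, and `formal_concepts` are hypothetical, and the brute-force enumeration is only practical for very small contexts.

```python
from itertools import combinations

# Hypothetical toy context (an assumption for illustration):
# each data point maps to the set of neurons that fire on it.
context = {
    "x1": {"n1", "n2"},
    "x2": {"n1"},
    "x3": {"n2", "n3"},
}


def common_attributes(objects, context, all_attributes):
    """Neurons shared by every data point in `objects` (all neurons if empty)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return shared


def common_objects(attributes, context):
    """Data points on which every neuron in `attributes` fires."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) by brute force.

    A pair is a formal concept when the extent is exactly the set of
    objects sharing the intent, and the intent is exactly the set of
    attributes shared by the extent.
    """
    all_attributes = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            # Close the subset: derive its intent, then the matching extent.
            intent = common_attributes(subset, context, all_attributes)
            extent = common_objects(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(
        formal_concepts(context), key=lambda c: (len(c[0]), sorted(c[0]))
    ):
        print(sorted(extent), sorted(intent))
```

Ordering the resulting pairs by inclusion of their extents gives the concept lattice. In this toy context, neuron `n2` fires on both `x1` and `x3`, while `x1` also activates `n1`, which is a small instance of the many-to-many structure the abstract mentions.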
