
arXiv:2512.12132v1 Announce Type: new
Abstract: This article establishes a comprehensive theoretical framework demonstrating that SiLU (Sigmoid Linear Unit) activation networks achieve exponential approximation rates for smooth functions, with explicit and improved complexity control compared to classical ReLU-based constructions. We develop a novel hierarchical construction, beginning with an approximation of the square function $x^2$ that is more compact in depth and size than comparable ReLU realizations, such as Yarotsky's. This construction yields an approximation error decaying as $\mathcal{O}(\omega^{-2k})$ using networks of constant depth $\mathcal{O}(1)$. We then extend this approach through functional composition to establish sharp approximation bounds for deep SiLU networks approximating Sobolev-class functions to accuracy $\varepsilon$, with total depth $\mathcal{O}(1)$ and size $\mathcal{O}(\varepsilon^{-d/n})$.
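
A minimal numerical sketch of the kind of construction the abstract describes, not the paper's method verbatim: because SiLU is smooth with a nonzero second derivative at the origin, a two-neuron SiLU layer can already approximate $x^2$. Taylor-expanding $\mathrm{SiLU}(u) = u\,\sigma(u)$ at $0$ gives $\mathrm{SiLU}(u) + \mathrm{SiLU}(-u) = u^2/2 - u^4/24 + \mathcal{O}(u^6)$, so the rescaled combination $2\omega^2\left(\mathrm{SiLU}(x/\omega) + \mathrm{SiLU}(-x/\omega)\right)$ equals $x^2 - x^4/(12\omega^2) + \mathcal{O}(\omega^{-4})$ on $[-1,1]$. This recovers the abstract's $\mathcal{O}(\omega^{-2k})$ rate only for $k = 1$; the function name `sq_approx` and the specific scaling below are illustrative assumptions.

```python
import numpy as np

def silu(u):
    """SiLU / swish activation: u * sigmoid(u)."""
    return u / (1.0 + np.exp(-u))

def sq_approx(x, w):
    """Two-neuron SiLU approximation of x**2 at scale w (assumed construction).

    2*w^2 * (SiLU(x/w) + SiLU(-x/w)) = x^2 - x^4/(12 w^2) + O(w^-4),
    so the uniform error on [-1, 1] decays like O(w^-2) as w grows.
    """
    return 2.0 * w**2 * (silu(x / w) + silu(-x / w))

x = np.linspace(-1.0, 1.0, 2001)
for w in [2.0, 4.0, 8.0, 16.0, 32.0]:
    err = np.max(np.abs(sq_approx(x, w) - x**2))
    # w^2 * err should approach 1/12 ~ 0.0833, confirming the O(w^-2) rate.
    print(f"w = {w:5.1f}   max error = {err:.3e}   w^2 * error = {w**2 * err:.4f}")
```

Running the loop shows the maximum error shrinking by roughly a factor of four each time $\omega$ doubles, consistent with the $\mathcal{O}(\omega^{-2})$ base rate; the paper's higher-order $\mathcal{O}(\omega^{-2k})$ variants and the composition step for Sobolev-class functions are beyond this two-neuron sketch.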