Determinant-Based Error Bounds for CUR Matrix Approximation: Oversampling and Volume Sampling
arXiv:2512.15102v1 Announce Type: new
Abstract: We derive error bounds for CUR matrix approximation using determinant-based methods that relate local projection errors to global approximation quality. For general matrices, we establish determinant identities for bordered Gramian matrices that decompose CUR approximation errors into interpretable local components. These identities connect projection errors onto submatrix column spaces directly to determinants, providing geometric insight into approximation degradation. We develop a probabilistic framework based on volume sampling that yields interpolation-type error bounds quantifying the benefits of oversampling: when $r > k$ rows are selected for $k$ columns, the expected error factor transitions linearly from $(k+1)^2$ (no oversampling) to $(k+1)$ (full oversampling). Our analysis establishes that the expected squared error is bounded by this interpolation factor times the squared error of the best rank-$k$ approximation, directly connecting CUR approximation quality to the optimal low-rank approximation. The framework applies to both CUR decomposition for general matrices and the Nystr\"om method for symmetric positive semi-definite matrices, providing a unified theoretical foundation for determinant-based low-rank approximation analysis.
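To make the abstract's bound concrete, below is a minimal numerical sketch, not the paper's algorithm: it draws $k$ columns and $r$ rows by brute-force volume sampling (subset probability proportional to the squared $k$- or $r$-volume, i.e. a Gramian determinant), builds a CUR approximation with the standard least-squares core $U = C^+ A R^+$ (the paper may use a different core), and compares the squared Frobenius error against the best rank-$k$ error and the $(k+1)$ and $(k+1)^2$ factors. The helpers volume_sample_columns and cur_error are hypothetical names introduced here for illustration; exact volume sampling by subset enumeration is feasible only at toy sizes.

import itertools
import numpy as np

rng = np.random.default_rng(0)

def volume_sample_columns(A, k):
    # Draw a k-subset of columns with probability proportional to
    # det(A_S^T A_S), the squared k-volume. Brute force: toy sizes only.
    n = A.shape[1]
    subsets = list(itertools.combinations(range(n), k))
    weights = np.array(
        [np.linalg.det(A[:, list(s)].T @ A[:, list(s)]) for s in subsets]
    )
    weights = np.maximum(weights, 0.0)  # clamp tiny negative round-off
    probs = weights / weights.sum()
    return list(subsets[rng.choice(len(subsets), p=probs)])

def cur_error(A, cols, rows):
    # Squared Frobenius error of C U R with the least-squares core
    # U = C^+ A R^+ (one standard choice, assumed here for illustration).
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return np.linalg.norm(A - C @ U @ R, "fro") ** 2

# Toy matrix: rank k plus a small full-rank perturbation.
m, n, k, r = 12, 10, 2, 4
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A += 0.05 * rng.standard_normal((m, n))

# Best rank-k squared error from the SVD tail.
sigma = np.linalg.svd(A, compute_uv=False)
best_k_err = np.sum(sigma[k:] ** 2)

cols = volume_sample_columns(A, k)        # k columns, volume-sampled
rows = volume_sample_columns(A.T, r)      # r rows via A^T (oversampling r > k)
err = cur_error(A, cols, rows)
print(f"CUR error / best rank-{k} error = {err / best_k_err:.2f}")
print(f"abstract's factors: (k+1) = {k + 1}, (k+1)^2 = {(k + 1) ** 2}")

On a single draw the ratio is random; the abstract's bound concerns the expectation over the sampled subsets, so averaging the ratio over many repetitions of the sampling step is the quantity to compare against the interpolation factor between $(k+1)^2$ (at $r = k$) and $(k+1)$ (at full oversampling).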