arXiv:2503.15770v2 Announce Type: replace-cross
Abstract: Depth foundation models offer strong learned priors for 3D perception but lack physical depth cues, which leaves metric scale ambiguous. We introduce a birefringent metalens (a planar nanophotonic lens, 700 nm thick and 3 mm in diameter, composed of subwavelength pixels for wavefront shaping) to physically prompt depth foundation models. In a single monocular shot, our metalens physically embeds depth information into two polarized optical wavefronts, which we decode through a lightweight prompting and fine-tuning framework that aligns depth foundation models with the optical signals. To scale the training data, we develop a light wave propagation simulator that synthesizes metalens responses from RGB-D datasets, incorporating key physical factors to minimize the sim-to-real gap. Simulated and physical experiments with our fabricated titanium-dioxide metalens demonstrate more accurate and consistent metric depth than state-of-the-art monocular depth estimators. This research demonstrates that nanophotonic wavefront formation offers a promising bridge for grounding depth foundation models in physical depth sensing.