321

arXiv:2512.16294v1 Announce Type: new
Abstract: Semantic overlap among land-cover categories, highly imbalanced label distributions, and complex inter-class co-occurrence patterns constitute significant challenges for multi-label remote-sensing image retrieval. In this article, Multi-Label Adaptive…
330

arXiv:2504.11900v3 Announce Type: replace
Abstract: Stories are a fundamental aspect of human experience. Engaging deeply with stories and spotting plot holes (inconsistencies in a storyline that break the internal logic or rules of a story's world) requires nuanced reasoning skills, including …
319

arXiv:2512.12793v2 Announce Type: replace
Abstract: This paper presents Vision-Language Global Localization (VLG-Loc), a novel global localization method that uses human-readable labeled footprint maps containing only names and areas of distinctive visual landmarks in an environment. While humans n…
321

arXiv:2512.16755v1 Announce Type: new
Abstract: Vision-Language Models (VLMs) have made significant progress in explicit instruction-based navigation; however, their ability to interpret implicit human needs (e.g., "I am thirsty") in dynamic urban environments remains underexplored. This paper intr…
321

arXiv:2512.16913v1 Announce Type: new
Abstract: In this work, we present a panoramic metric depth foundation model that generalizes across diverse scene distances. We explore a data-in-the-loop paradigm from the view of both data construction and framework design. We collect a large-scale dataset b…
330

arXiv:2512.15925v1 Announce Type: new
Abstract: Reading stories evokes rich interpretive, affective, and evaluative responses, such as inferences about narrative intent or judgments about characters. Yet, computational models of reader response are limited, preventing nuanced analyses. To address t…
329

arXiv:2507.05859v3 Announce Type: replace
Abstract: Free-Viewpoint Video (FVV) enables immersive 3D experiences, but efficient compression of dynamic 3D representations remains a major challenge. Existing dynamic 3D Gaussian Splatting methods couple reconstruction with optimization-dependent compres…
310

arXiv:2511.07503v3 Announce Type: replace
Abstract: The increased availability of genetic data has transformed genomics research, but has raised many privacy concerns regarding its handling due to its sensitive nature. This work explores the use of language models (LMs) for the generation of synthetic …
243

arXiv:2512.16233v1 Announce Type: cross
Abstract: We address network structure learning from zero-inflated count data by casting each node as a zero-inflated generalized linear model and optimizing a smooth, score-based objective under a directed acyclic graph constraint. Our Zero-Inflated Continuo…
222

arXiv:2512.15738v1 Announce Type: new
Abstract: Financial market prediction is a challenging application of machine learning, where even small improvements in directional accuracy can yield substantial value. Most models struggle to exceed 55-57% accuracy due to high noise, non-stationarity, and …
210

arXiv:2512.16251v1 Announce Type: cross
Abstract: We introduce the Consensus-Bottleneck Asset Pricing Model (CB-APM), a partially interpretable neural network that replicates the reasoning processes of sell-side analysts by capturing how dispersed investor beliefs are compressed into asset…
120

arXiv:2503.08453v2 Announce Type: replace
Abstract: A new class of alternating-conjugate splitting methods is presented and analyzed. They are obtained by concatenating a given composition involving complex coefficients with the same composition but with the complex conjugate coefficients. We sho…
122

arXiv:2512.16247v1 Announce Type: new
Abstract: One of many impediments to applying graph neural networks (GNNs) to large-scale real-world graph data is the challenge of centralized training, which requires aggregating data from different organizations, raising privacy concerns. Federated graph lea…
111

arXiv:2512.16908v1 Announce Type: new
Abstract: We investigate the problem of identifying objects that have been added, removed, or moved between a pair of captures (images or videos) of the same scene at different times. Detecting such changes is important for many applications, such as robotic ti…
109

arXiv:2507.15251v2 Announce Type: replace
Abstract: Large Language Models (LLMs) have shown great potential in Automated Program Repair (APR). Test inputs, being crucial for reasoning about the root cause of failures, are always included in the prompt for LLM-based APR. Unfortunately, LLMs struggle to re…