One-Step Distillation of Discrete Diffusion Image Generators via Fixed-Point Iteration

Published: 2026-05-20 17:59:10

Authors: Chaoyang Wang, Yunhai Tong

Categories: cs.CV

Abstract:
Discrete diffusion models excel at visual synthesis but rely on slow, iterative decoding. Existing single-step distillation methods attempt to bypass this bottleneck, either by training auxiliary score networks that effectively double compute, or by introducing specialized parameterizations and multi-stage pipelines that fragment optimization. In this paper, we introduce Fixed-Point Distillation (FPD), an end-to-end framework that constructs local correction targets by partially corrupting the student's one-step draft and refining it with a single teacher step. To compute the training objective in a semantically meaningful space, we lift discrete tokens into continuous features and apply a multi-bandwidth drift loss that iteratively accumulates these corrections. To backpropagate through the discrete bottleneck, we employ a straight-through estimator that feeds exact hard-sampled tokens to the teacher and decoder during the forward pass, ensuring that training and inference operate on the same codebook manifold, while routing continuous gradients back to the student logits. This fully differentiable pathway additionally accommodates an optional unconditional adversarial objective to enhance perceptual realism. Evaluations on both class- and text-conditional generation validate the effectiveness of our framework. FPD achieves competitive visual fidelity and structural alignment within a single inference step, narrowing the gap to multi-step teachers while outperforming existing discrete distillation baselines.

arXiv Page | PDF

Score: 0

Assessing the impact of tourist attractions through the integration of causal inference and demand-side economic analysis: A case study of the Sensoria experience museum in Holzminden, Germany

Published: 2026-05-20 17:51:42

Authors: Thomas Wieland

Categories: stat.AP

Abstract:
This research note investigates the impact of the experience museum Sensoria, opened in September 2024 in Holzminden, Germany, on local tourism demand and related direct and indirect effects. To this end, the study employs a novel approach by combining causal inference and demand-side economic analysis. A difference-in-differences approach is employed to quantify the number of additional guest overnight stays in the treatment city; the results are converted into industry-specific expenditures, from which the direct and indirect effects of Sensoria are determined. A positive and significant impact which corresponds to 4,691 additional overnight stays can be detected in the first year of operation of the new tourist attraction, resulting in an additional gross turnover of approximately 0.56 million EUR across the hospitality and retail industries and other services. The direct effects and indirect effects amount to approximately 0.23 and 0.21 million EUR, respectively. However, long-term effects cannot (yet) be determined. Additionally, positive effects from small and large events in the cities studied can be demonstrated. This brief study demonstrates that combining the two approaches mentioned holds promise, yet requires a more in-depth analysis, for which suggestions are also discussed regarding how it could be conducted.
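
The difference-in-differences logic mentioned in the abstract can be sketched with a minimal two-group, two-period calculation. The function name and all figures below are hypothetical illustrations, not values from the study, and the abstract does not state the study's actual model specification.

```python
def did_estimate(treat_pre, treat_post, control_pre, control_post):
    """Two-by-two difference-in-differences estimate.

    Subtracts the change observed in the control group from the change
    observed in the treated group, so that a shared time trend cancels out.
    """
    return (treat_post - treat_pre) - (control_post - control_pre)


# Hypothetical annual overnight stays (illustrative numbers only).
effect = did_estimate(
    treat_pre=50_000, treat_post=56_000,
    control_pre=80_000, control_post=81_500,
)
print(effect)
```

The estimate is only meaningful under the usual parallel-trends assumption, namely that the treated city would have followed the control trend had the attraction not opened.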

arXiv Page | PDF

Score: 0

Mem-$π$: Adaptive Memory through Learning When and What to Generate

Published: 2026-05-20 17:51:05

Authors: Xiaoqiang Wang, Chao Wang, Hadi Nekoei, Christopher Pal, Alexandre Lacoste, Spandana Gella, Bang Liu, Perouz Taslakian

Categories: cs.CL, cs.AI

Abstract:
We present Mem-$π$, a framework for adaptive memory in large language model (LLM) agents, where useful guidance is generated on demand rather than retrieved from external memory stores. Existing memory-augmented agents typically rely on similarity-based retrieval from episodic memory banks or skill libraries, returning static entries that often misalign with the current context. In contrast, Mem-$π$ uses a dedicated language or vision-language model with its own parameters, separate from the downstream agent, to generate context-specific guidance for complex tasks. Conditioned on the current agent context, the model jointly decides when to produce guidance and what guidance to produce. We train it with a decision-content decoupled reinforcement learning (RL) objective, enabling it to abstain when generation would not help and otherwise produce concise, useful guidance. Across diverse agentic benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-$π$ consistently outperforms retrieval-based and prior RL-optimized memory baselines, achieving over 30% relative improvement on web navigation tasks.

arXiv Page | PDF

Score: 0

Network evolution with self-reinforcement

Published: 2026-05-20 17:48:59

Authors: Shankar Bhamidi, Remco van der Hofstad, Frank den Hollander, Rounak Ray

Categories: math.PR

Abstract:
We study a new class of preferential attachment trees with \emph{self-reinforcement}. At each time, each vertex is assigned a weight equal to the cumulative sum over past times of an affine function of its degree. A new vertex attaches itself via a single edge to an already present vertex with a probability proportional to the current weight of that vertex. This ``integrated popularity'' rule builds long memory directly into the attachment mechanism, thereby destroying the Markov and partial-exchangeability features that underlie the classical analysis of affine preferential attachment models. More broadly, the model connects to applied-probability work on long-memory self-interacting processes (such as the elephant random walk), emphasizing how non-Markovian reinforcement reshapes asymptotic behaviour. Despite this loss of structure, we identify an explicit exponent $φ=φ(δ)$ governing both local and global growth: typical degrees at time $n$ scale as $n^{1/φ}$, and the empirical degree distribution converges to a power-law with a tail exponent $φ+1$. We further prove Benjamini--Schramm local convergence to an infinite random rooted tree characterized via an embedded continuous-time branching process. The limiting tree is a \texttt{sin}-tree, and is \emph{not} the Pólya-type limiting tree arising in the non-reinforced setting. Our results provide a tractable probabilistic description of a natural ``memoryful'' network-growth mechanism, and quantify precisely how reinforcement renormalizes the classical preferential-attachment exponents.
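
To make the "integrated popularity" rule concrete, here is a small simulation sketch. It makes several assumptions the abstract does not state: the tree starts from two vertices joined by one edge, the affine function is degree plus delta, and weights are updated once per step after the new edge is added. The function name simulate_tree is hypothetical.

```python
import random


def simulate_tree(n, delta=1.0, seed=0):
    """Grow a tree on n >= 2 vertices with self-reinforced attachment.

    Each vertex's weight is the running sum, over all past steps since its
    birth, of (degree + delta). Requires delta > -1 so weights stay positive.
    Returns (parent, degree) lists indexed by vertex.
    """
    rng = random.Random(seed)
    parent = [None, 0]               # assumed start: vertex 1 attached to vertex 0
    degree = [1, 1]
    weight = [1 + delta, 1 + delta]  # cumulative sum of (degree + delta)
    for new in range(2, n):
        # Attach to an existing vertex with probability proportional to its weight.
        target = rng.choices(range(new), weights=weight)[0]
        parent.append(target)
        degree[target] += 1
        degree.append(1)
        weight.append(0.0)
        # Every vertex accumulates the affine function of its current degree.
        for v in range(new + 1):
            weight[v] += degree[v] + delta
    return parent, degree


parent, degree = simulate_tree(500)
print(max(degree))
```

This direct version costs quadratic time in n because every weight is updated at every step. That is adequate for small experiments but not for checking the asymptotic exponents.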

arXiv Page | PDF

Score: 0

Quality and Security Signals in AI-Generated Python Refactoring Pull Requests

Published: 2026-05-20 17:43:36

Authors: Mohamed Almukhtar, Anwar Ghammam, Hua Ming

Categories: cs.SE, cs.AI

Abstract:
As AI agents increasingly contribute to code development and maintenance, there is still limited empirical evidence on the quality and risk characteristics of their changes in real-world projects, particularly for refactoring-oriented contributions. It remains unclear how agent-authored refactoring edits affect maintainability, code quality, and security once merged into GitHub repositories. To address this gap, we conduct an empirical study of Python refactoring pull requests (PRs) from the AIDev dataset. We analyze agentic refactoring PRs using PyQu, an ML-based quality assessment tool for Python, to quantify changes across five quality attributes, and we complement PyQu with domain-independent static analysis (Pylint and Bandit) to measure code quality and security issues before and after each change. Our results show that, on average, agentic commits improve a quality attribute in 22.5% of the studied changes, with usability improving most frequently (36.5%). At the same time, 24.17% of modified files introduce new Pylint issues, predominantly convention-level violations such as long lines, while 4.7% introduce new Bandit findings. From the observed diffs, we derive a taxonomy of 24 recurring change operations and map them to the lint and security findings they most commonly affect. Despite these mixed outcomes, developer acceptance is high: 73.5% of the analyzed PRs are merged, including cases that introduce new lint or security findings, often alongside the removal of existing issues. Overall, these findings highlight both the promise and current limitations of agentic refactoring, and motivate stronger tool-in-the-loop quality and security gating for AI-driven development workflows.

arXiv Page | PDF

Score: 0

A Note on EFX Inapproximability for Chores

Published: 2026-05-20 17:35:44

Authors: Vasilis Christoforidis

Categories: cs.GT

Abstract:
We study the approximability of EFX allocations for indivisible chores under complement-free cost functions. The non-existence of exact EFX allocations for general monotone functions for chores is known from \cite{CS24}, and a result of \cite{akrami2026} transfers such comparison-based non-existence results to monotone submodular, and hence subadditive, functions. We strengthen this picture by giving explicit constant-factor inapproximability results for submodular and subadditive functions. Our main construction is a three-agent, six-chore instance with monotone subadditive cost functions for which no $α$-EFX allocation exists for any $1\le α<2^{1/3}\approx 1.26$, thus narrowing the gap with the known upper bound of $2$. The construction is obtained by refining the original counterexample of \cite{CS24} and using the approach of \cite{mackenzie2026}. We also give a weighted-coverage realization of the ordinal profile, yielding an instance in which no $α$-EFX allocation exists for any $1\le α<20/19$ under submodular costs. Thus, even within well-studied complement-free classes, EFX for chores admits nontrivial constant lower bounds on approximability.
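
For readers unfamiliar with the notion, the following sketch checks the commonly used definition of an $α$-EFX allocation for chores: for every pair of agents $i, j$ and every chore $e$ in agent $i$'s bundle, the cost to $i$ of its bundle without $e$ is at most $α$ times the cost to $i$ of agent $j$'s bundle. The paper's exact definition may differ in details. The function name and the additive example instance are hypothetical and are unrelated to the three-agent, six-chore construction.

```python
def is_alpha_efx(allocation, costs, alpha=1.0):
    """Check alpha-EFX for chores.

    allocation: list of frozensets, allocation[i] is agent i's bundle.
    costs: list of set functions, costs[i](bundle) is agent i's cost.
    """
    for i, bundle in enumerate(allocation):
        for j, other in enumerate(allocation):
            if i == j:
                continue
            for chore in bundle:
                # Removing any single chore must make i's bundle cost at most
                # alpha times what i would pay for j's bundle.
                if costs[i](bundle - {chore}) > alpha * costs[i](other):
                    return False
    return True


# Hypothetical additive instance (additive costs are a special case of
# submodular and subadditive costs).
price = {"a": 3, "b": 1, "c": 1}
additive = lambda bundle: sum(price[x] for x in bundle)
costs = [additive, additive]

print(is_alpha_efx([frozenset("a"), frozenset("bc")], costs))
print(is_alpha_efx([frozenset("ab"), frozenset("c")], costs))
print(is_alpha_efx([frozenset("ab"), frozenset("c")], costs, alpha=3))
```

Because the cost functions are passed in as arbitrary set functions, the same checker applies to non-additive costs. An inapproximability proof must show that the check fails for every allocation of the instance.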

arXiv Page | PDF

Score: 0

A Compression-Directional Entropic Stress Method for Shock-Regularized Compressible Flow

Published: 2026-05-20 17:32:33

Authors: Bonan Xu, Chihyung Wen

Categories: physics.flu-dyn

Abstract:
We introduce the Compression-Directional Entropic Stress method (CoDeS), a finite-volume regularization for shock-dominated compressible flows. Inspired by information geometric regularization, CoDeS replaces scalar multidimensional entropic pressure with a tensor stress aligned with the principal directions of compression. The stress has the form $\boldsymbol{Π}_Σ=σ\boldsymbol{M}$, where $σ$ is obtained from a modified-Helmholtz equation and $\boldsymbol{M}$ is constructed from the compressive eigenspace of the symmetric velocity-gradient tensor. The source is gated by volumetric and principal-strain compression, so the regularization vanishes in smooth expansion, rigid-body rotation, and ideal contacts, while recovering the compressive one-dimensional IGR mechanism at planar shocks. The same tensor stress is used in the conservative momentum flux and the stress-work energy flux. CoDeS is tested on one-, two-, and three-dimensional problems including smooth expansion, double rarefaction, the Sod shock tube, multidimensional Riemann flow, a viscous shock tube, a two-fluid triple point, a Mach-3 slot jet, and a supersonic Taylor--Green vortex. The results show that CoDeS remains inactive in expansive and contact regions, supplies localized stress at shocks, and concentrates regularization along compressive wave structures while remaining weak in shear- and vorticity-dominated regions. At matched resolutions, the three-dimensional Taylor--Green results are comparable to or more energetic than seventh-order WENO/TENO references. These results indicate that CoDeS provides a compression-selective shock regularization compatible with high-order finite-volume resolution of contacts, interfaces, shear layers, and vortical structures.

arXiv Page | PDF

Score: 0

On the Regularity and Generalization of One-Step Wasserstein-guided Generative Models for PDE-Induced Measures

Published: 2026-05-20 16:43:55

Authors: Likun Lin, Zhongjian Wang, Jack Xin, Zhiwen Zhang

Categories: cs.LG, cs.AI, math.NA, stat.ML

Abstract:
Despite the remarkable empirical success of generative models, the available theory on their statistical accuracy in scientific computing remains largely pessimistic. This paper develops a theoretical framework for understanding the regularity of transport maps and the generalization properties of one-step Wasserstein-guided generative models for PDE-induced probability measures. We consider normalized target densities associated with linear elliptic and parabolic equations on bounded domains, as well as diffusion and Fokker--Planck equations on the torus. Under standard structural assumptions, we prove that these target measures satisfy doubling conditions. By combining this fact with regularity theory for optimal transport between doubling measures, we show that the optimal transport map from a uniform source measure to the target measure is Hölder continuous. This regularity yields an approximation-theoretic justification for one-step generative models that learn PDE-induced distributions via a single pushforward map. As a representative instance, we study DeepParticle and derive excess-risk bounds characterizing the discrepancy between the learned map and the population-optimal map. We also establish a robustness estimate under target shift and illustrate the theory with experiments which support the derived rates.

arXiv Page | PDF

Score: 0

EllipseLIO: Adaptive LiDAR Inertial Odometry with an Ellipsoid Representation

Published: 2026-05-20 13:24:58

Authors: Rowan Border, Margarita Chli

Categories: cs.RO

Abstract:
LiDAR Inertial Odometry (LIO) is a critical component for many mobile robots that need to navigate without relying on external positioning (e.g., GPS). Platforms that operate autonomously in different environments and with heterogeneous LiDAR sensors require a LIO approach that can adapt to these different scenarios without human intervention. Existing LIO approaches can typically provide reliable and accurate odometry in scenarios with similar environments and sensors when suitably tuned. However, many approaches struggle to retain robust odometry across heterogeneous environments and sensors while using a consistent configuration. This paper presents EllipseLIO, a real-time LIO approach that generalises between scenarios by using methods for LiDAR scan filtering and registration that adapt to the sensor capabilities and environment without requiring scenario-specific tuning. Experiments with EllipseLIO and state-of-the-art LIO approaches on five datasets with diverse and challenging scenarios demonstrate that EllipseLIO is the best-performing approach overall. It achieves a 38% lower odometry error on average than the second-best approach and is the only approach that does not diverge in any experiment. An open-source version of EllipseLIO will be available at github.com/v4rl-ucy/ellipselio.

arXiv Page | PDF

Score: 0

Cloud-Native Operation of Roadside Infrastructure Enabling Demand-Driven Collective Perception via V2X

Published: 2026-05-20 13:18:59

Authors: Lukas Zanger, Fabian Thomsen, Guido Linden, Jean-Pierre Busch, Lennart Reiher, Lutz Eckstein

Categories: cs.DC, cs.AR

Abstract:
Intelligent roadside infrastructure is a key enabler for cooperative intelligent transport systems (C-ITS), supporting vehicles equipped with automated driving systems (ADS), e.g., through enhanced environment perception. With a growing number and an expanding functional scope of roadside units, scalable and efficient operation becomes a challenge. This paper presents a cloud-native architecture for the operation of distributed roadside infrastructure based on a Kubernetes cluster spanning roadside units and a cloud server. Building on this architecture, a demand-driven orchestration approach is implemented to dynamically deploy resource-intensive services only when required. As a representative use case, a V2X-based collective perception application is deployed on-demand when a connected vehicle is nearby. The approach is validated in a real-world experiment in our test field in Aachen, demonstrating that the collective perception application starts in time for the vehicle to benefit from it. Without any demand, the application remains inactive, reducing energy consumption, channel congestion, and hardware wear. Beyond the primary evaluation, V2X recordings from the test field are analyzed to estimate the energy-saving potential of demand-driven operation. In summary, the results demonstrate the practical feasibility of cloud-native, demand-driven operation of roadside infrastructure and indicate its potential to improve scalability and (energy) efficiency in future C-ITS deployments.

arXiv Page | PDF

Score: 0

ROAR-3D: Routing Arbitrary Views for High-Fidelity 3D Generation

Published: 2026-05-20 12:50:52

Authors: Hanxiao Sun, Mingxin Yang, Shuhui Yang, Zebin He, Xintong Han, Hongbo Fu, Chunchao Guo, Wenhan Luo

Categories: cs.CV, cs.GR

Abstract:
Single-image-to-3D generative models can now produce high-quality geometry, yet conditioning on a single view inevitably introduces ambiguity about unseen regions. Multi-view conditioning can reduce this ambiguity, but existing methods either require fixed canonical viewpoints or rely on external reconstruction modules that impose heavy training costs and limit generation quality. We observe that pretrained single-view models already possess strong 2D-to-3D grounding that can be reused for multi-view conditioning. However, a closer analysis reveals that their conditioning mechanism entangles orientation control with geometry transfer, two functions that conflict when images from different viewpoints are naively combined. Based on this analysis, we propose ROAR-3D, a lightweight method that upgrades a pretrained single-view model to accept an arbitrary number of unposed images. A token-wise view router assigns each 3D latent token to its most relevant view, implicitly establishing 2D-to-3D correspondences without explicit pose input. A dual-stream attention design preserves the pretrained primary-view behavior while routing auxiliary views through a separate path dedicated to geometric enrichment. An orientation perturbation strategy ensures the auxiliary path learns orientation-independent geometry transfer. These components introduce minimal trainable parameters and add negligible inference overhead relative to the single-view baseline. ROAR-3D achieves state-of-the-art multi-view 3D generation quality and supports test-time view scaling from 1 to 12+ views with consistent improvements.

arXiv Page | PDF

Score: 0

Image Encryption via Data-Identified Discrete Chaotic Maps

Published: 2026-05-20 12:49:17

Authors: Wenyuan Lia, Xiao-Yun Wang, Zhigang Zhu, Xiaofeng Zhang, Li Zhang

Categories: cs.CR

Abstract:
In this work, we propose a data-driven image encryption framework that identifies chaotic maps directly from data using the SINDy-PI algorithm. Unlike conventional encryption schemes relying on predefined maps, our method learns the full explicit dynamics -- including cross-terms and higher-order nonlinearities -- from observational data. The validity of this approach is verified on three distinct chaotic systems: the Hénon map, the three-dimensional logistic map, and the piecewise-linear Lozi map, demonstrating its generality. The encryption key consists solely of initial conditions; the map structure itself becomes data-dependent, introducing an extra layer of security. Moreover, even when the initial conditions are fixed, different training data (e.g., with a tiny noise seed) lead to slightly different maps, which produce completely different ciphertexts (NPCR $\approx 99.6\%$, UACI $\approx 33.5\%$). Numerical experiments on the Hénon system show near-ideal information entropy ($\approx 8$ bits), negligible inter-pixel correlation, and extreme sensitivity to initial conditions: a perturbation of $10^{-16}$ causes total decryption failure. The scheme resists both differential and statistical attacks, with NPCR and UACI values matching theoretical ideals. Our results establish a new paradigm for chaos-based cryptography beyond fixed maps.
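
The two differential-attack metrics quoted in the abstract have standard definitions, sketched below for 8-bit grayscale images stored as flat sequences of integers. NPCR is the percentage of pixel positions whose values differ between two ciphertexts, and UACI is the mean absolute intensity difference as a percentage of the maximum intensity. The function name and the sample data are hypothetical and are not taken from the paper.

```python
def npcr_uaci(c1, c2, levels=256):
    """Return (NPCR, UACI) in percent for two equal-length pixel sequences.

    c1, c2: flat sequences of integer pixel values in [0, levels - 1].
    """
    if len(c1) != len(c2) or len(c1) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(c1)
    changed = sum(1 for a, b in zip(c1, c2) if a != b)
    total_diff = sum(abs(a - b) for a, b in zip(c1, c2))
    npcr = 100.0 * changed / n
    uaci = 100.0 * total_diff / ((levels - 1) * n)
    return npcr, uaci


# Tiny hand-made example: two of the four pixels differ.
npcr, uaci = npcr_uaci([0, 255, 10, 10], [255, 255, 10, 20])
print(npcr, uaci)
```

For two independent, uniformly random 8-bit images the expected values are roughly 99.61% and 33.46%, which are the theoretical ideals that reported NPCR and UACI figures are usually compared against.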

arXiv Page | PDF

Score: 0

Automated Byzantine-Resilient Clustered Decentralized Federated Learning for Battery Intelligence in Connected EVs

Published: 2026-05-20 12:47:14

Authors: Mouhamed Amine Bouchiha, Abdelaziz Amara Korba, Yacine Ghamri-Doudane

Categories: cs.DC, cs.LG

Abstract:
Federated learning (FL) has emerged as a promising paradigm for managing electric vehicle (EV) battery data in intelligent transportation systems (ITS), enabling privacy-preserving tasks such as anomaly detection and capacity estimation. However, most existing frameworks rely on centralized aggregation schemes, which pose critical limitations in terms of security and trust. To address these challenges, we propose ABC-DFL, an automated Byzantine-resilient clustered decentralized federated learning (C-DFL) framework for connected EVs. The proposed incentive-driven C-DFL system replaces the central server with an open-permissioned blockchain, featuring a new dynamic Quorum Byzantine Fault Tolerance (QBFT) protocol and an oracle-based aggregation layer, to enhance trust, security, and automation. At the core of ABC-DFL lies FLECA (Filtered Layered Enhanced Clustering Aggregation), a robust hierarchical aggregation protocol that mitigates Byzantine attacks by having each EV filter malicious updates using an adaptive threshold based on deviations from its reference model update. Oracle nodes, responsible for inter-group aggregation, employ robust clustering to isolate and aggregate model updates from trustworthy EV groups. Comprehensive experimental evaluations demonstrate that FLECA matches FedProx convergence under benign conditions and significantly outperforms existing defenses with attack impact scores below 0.10 in adaptive adversarial scenarios. Furthermore, several learning experiments with multitask models confirm the effectiveness and fairness of the incentive mechanism. Finally, on-chain and off-chain benchmarks validate the practicality of ABC-DFL.

arXiv Page | PDF

Score: 0

UOTIP: Unbalanced Optimal Transport Map for Unpaired Inverse Problems

Published: 2026-05-20 12:25:26

Authors: Donggyu Lee, Taekyung Lee, Jaewoong Choi

Categories: cs.LG

Abstract:
We investigate unpaired image inverse problems, a challenging setting where only independent, non-paired sets of noisy measurements and clean target signals are available for training. We propose a novel inverse problem solver based on Unbalanced Optimal Transport, called Unbalanced Optimal Transport Map for Inverse Problems (UOTIP). Our method formulates the reconstruction task, predicting clean target signals from noisy measurements, as learning a UOT Map from noisy measurement distribution to clean signal distribution by incorporating a likelihood-based cost function. By relaxing the exact marginal constraint, the UOT framework provides key advantages to our model: robustness to multi-level observation noise, adaptability to class imbalance between noisy and clean datasets, and generalizability to diverse noise-type scenarios. Furthermore, we theoretically demonstrate that incorporating a quadratic cost term ensures the existence and uniqueness of the transport map by satisfying the twist condition, even for ill-posed inverse problems. Our experiments demonstrate that UOTIP achieves state-of-the-art performance on unpaired image inverse problem benchmarks, across linear and nonlinear inverse problems.

arXiv Page | PDF

Score: 0

Large-space and Large-time Asymptotics for the Focusing Nonlinear Schrödinger Soliton Gas

Published: 2026-05-20 12:22:55

Authors: Dedi Yan, Xianguo Geng, Wei Jiao

Categories: nlin.SI

Abstract:
We investigate the large-space and large-time asymptotic behavior of a soliton gas for the focusing nonlinear Schrödinger equation. The soliton gas is constructed as the continuum limit of pure $N$-soliton solutions as $N\to\infty$, with the discrete spectrum confined to two segments $Σ_1$ and $Σ_2$. In particular, our framework does not require the discrete spectrum to be confined to the imaginary axis. By combining the nonlinear steepest descent method with an appropriate $g$-function mechanism, we show that, as $x\to-\infty$, the soliton gas is asymptotically described by a finite-gap elliptic solution with constant coefficients. In the large-time regime $t\to+\infty$, we assume that the endpoint $F$ lies on the trajectory of $H(ξ)$ with $ξ=\frac{x}{2t}\in(-E_1-\sqrt{2}E_2,-E_1)$, namely, $F=H(\hat{ξ})$, $\hat{ξ}\in (-E_1-\sqrt{2}E_2,-E_1)$. Under this assumption, we prove that the solution exhibits distinct asymptotic behaviors in different regions of the variable $ξ=\frac{x}{2t}$. More precisely, there exist an exponentially decaying region $ξ\in(-E_1,+\infty)$, a modulated elliptic-wave region $ξ\in(\hat{ξ},-E_1)$, and an unmodulated elliptic-wave region $ξ\in(-\infty,\hat{ξ})$.

arXiv Page | PDF

Score: 0

AIMBio-Mat: An AI-Native FAIR Platform for Closed-Loop Materials Discovery and Biomedical Translation

Published: 2026-05-20 12:18:49

Authors: D. -M. Mei, K. Acharya, C. M. Adhikari, M. Adhikari, S. Aryal, B. V. Benson, K. Bhatta, S. Bhattarai, N. Budhathoki, A. M. Castillo, D. Chakraborty, S. Chhetri, S. Choudhury, T. A. Chowdhury, R. D. Cruz, B. Cui, S. Dhital, K. -M. Dong, R. Gapuz, A. Ghasemi, E. Z. Gnimpieba, B. D. S. Gurung, H. A. Hashim, R. I. Harry, K. -E. Hasin, M. K. Hassanzadeh, M. K. Jha, D. Kim, K. -C. Kong, B. Lama, A. Mahat, N. Maharjan, A. Majeed, J. Mammo, M. M. Masud, K. S. Moore, A. Nawaz, H. Oli, S. A. Panamaldeniya, L. Pandey, R. Pandey, Z. Peng, A. Prem, M. M. Rana, K. Rana Magar, R. Rizk, C. S. Tadi, L. -W. Wang, Y. Yang, G. -L. Yin, C. -X. Yu, D. Zeng, M. Zhou, Q. Zhou

Categories: physics.app-ph, cs.LG, physics.bio-ph, physics.med-ph

Abstract:
Materials discovery and biomedical translation increasingly require models that can reason across composition, processing, structure, biological response, manufacturability, safety, and governance constraints. Existing materials and biomedical data ecosystems are powerful but remain poorly coupled for AI-guided discovery. Here we present AIMBio, a conceptual framework for an AI-native, FAIR, and governance-aware decision layer that links materials provenance, biomedical context, knowledge graphs, uncertainty-aware machine learning, and human-in-the-loop active learning. The framework formulates biomedical-materials discovery as constrained multi-objective optimization under uncertainty and introduces practical requirements for metadata, model documentation, risk-tiered governance, evaluation metrics, and phased implementation. To make the roadmap testable, we add a minimum viable prototype specification and a worked pilot for AI-guided nanomaterials for drug delivery. AIMBio is positioned as exploratory and preclinical discovery infrastructure, not as clinical decision-support software; any clinical or regulated-device use would require separate validation, change control, and regulatory review. The central contribution is a publishable platform blueprint for converting fragmented materials and biomedical records into auditable, experimentally actionable, and translationally responsible discovery workflows.

arXiv Page | PDF

Score: 0

Musical Attention Transformer: Music Generation Using a Music-Specific Attention Model

Published: 2026-05-20 12:16:28

Authors: Shinnosuke Taksuka, Hideo Mukai

Categories: cs.SD, cs.LG

Abstract:
This study aims to enhance the quality of music generation using Transformers by incorporating meta-information. While Transformer-based approaches are effective at capturing long-term dependencies in musical compositions, the music they generate often suffers from issues such as excessive repetition or duplication of notes, leading to unnatural melodies. To address these limitations, we propose Musical Attention, a mechanism that incorporates meta-information such as bar numbers, key, signatures, and tempos into the attention process. Musical Attention explicitly leverages both the structural properties of music and its associated metadata, enabling the Transformer's attention mechanism to operate more effectively and thereby improving the quality of the generated output. In our framework, each musical note is represented as a combination of five events (pitch, bar number, onset, duration, and velocity) in addition to the three metadata elements. The attention mechanism is then modified to reflect the correlations among these eight features, allowing the model to better capture the inherent characteristics of musical composition. Experimental results demonstrate that the model incorporating Musical Attention outperforms prior methods, such as Full Attention and Strided Attention, in terms of musical coherence, variation, and overall quality. Notably, it significantly reduces repetition and enhances the model's ability to generate diverse, harmonically consistent melodies. Musical Attention thus represents a meaningful advancement in AI-driven music generation, facilitating the creation of more natural and expressive compositions.

arXiv Page | PDF

Score: 0

Towards Understanding Self-Pretraining for Sequence Classification

Published: 2026-05-20 11:56:15

Authors: Omar Coser, Loredana Zollo, Paolo Soda, Antonio Orvieto

Categories: cs.LG

Abstract:
Amos et al. (2024) showed that the accuracy of Transformer models in sequence classification can be significantly improved by first pretraining with a masked token prediction objective without external data or augmentation, a procedure referred to as self-pretraining (SPT). While the primary objective of Amos et al. (2024) was to showcase that Transformers can achieve strong performance on the Long-Range Arena (LRA), their pipeline raises more fundamental questions: How does SPT drive optimization to better solutions? Why can standard supervised training fail in Transformers? To better understand this, we replicate and systematically ablate the findings of Amos et al. (2024). Our ablations suggest that a central bottleneck in the studied settings is not depth or generalization alone, but the ability of label supervision to learn useful query-key Attention patterns from random initialization. With a minimal setup, we identify learning proximity interactions (turning absolute positional encodings into proximity-biased Attention scores) as a key source of the improvements brought by SPT. Finally, in a simplified theoretical setup, we show that label supervision can be locally blind to certain Attention-score directions that are instead detectable through masked reconstruction.

arXiv Page | PDF

Score: 0

A Dialogue between Causal and Traditional Representation Learning: Toward Mutual Benefits in a Unified Formulation

Published: 2026-05-20 11:43:46

Authors: Yan Li, Yuewen Sun, Shaoan Xie, Gongxu Luo, Yunlong Deng, Kun Zhang, Guangyi Chen

Categories: cs.LG

Abstract:
Causal representation learning (CRL) and traditional representation learning have largely developed along different trajectories. Traditional representation learning has been driven mainly by applications and empirical objectives, whereas CRL has focused more on theoretical questions, particularly identifiability. This difference in emphasis has created a gap between the two fields in terminology, problem formulation, and evaluation, limiting communication and sometimes leading to disconnected or redundant efforts. In this paper, we argue that these two fields should be brought into dialogue rather than treated as separate paradigms. To this end, we introduce a unified formulation in which the representation learning is characterized by two components: a task component, which specifies what information the learned representation is required to preserve, and a constraint component, which specifies what structure is imposed on the latent space. Under this formulation, the benefits run in both directions. CRL provides theoretical tools for understanding when structured latent constraints are useful or necessary, while traditional representation learning offers practical insights on task design and objective choice that can improve the development of CRL methods. To illustrate this interaction, we experimentally study how different task components affect the behavior of CRL methods under different structured constraints. Results on CausalVerse show that the effectiveness of causal constraints depends strongly on the tasks with which they are paired.

arXiv Page | PDF

Score: 0

On Unified and Sharpened CMI Bounds for Generalization Errors

Published: 2026-05-20 11:42:35

Authors: Yang Lu, Matthias Frey, Margreta Kuijper, Jingge Zhu

Categories: cs.IT

Abstract:
We present a new family of information-theoretic generalization bounds within the framework of conditional mutual information (CMI). Most of our results are based on the leave-$m$-out (L$m$O) cross-validation error, with $m$ denoting the number of hold-out supersamples. Under this setting, we propose a unified CMI-based bound that recovers many known CMI-based bounds and bridges the gap between the MI- and CMI-based bounds as $m$ tends to infinity. The proposed framework not only provides a unified description of the existing bounds but also yields new, sharper bounds. We show the benefits of the proposed bounds through several simple examples in which the existing results are either inapplicable or looser. Moreover, under the premise that the loss function is bounded, we tighten the CMI quantities involved in the proposed bounds by reducing the number of conditioning terms, thereby strengthening the proposed framework. We show empirically that the resulting new bounds improve upon the previously known ones.

arXiv Page | PDF

Score: 0

Study of the thermodynamic properties of hot QCD matter with the CMS experiment

Published: 2026-05-20 11:37:08

Authors: Cesar A. Bernardes

Categories: nucl-ex, hep-ex

Abstract:
These proceedings summarize recent CMS measurements at the LHC that extract the squared speed of sound, $c_s^2$, of strongly interacting matter at extreme temperatures from the multiplicity dependence of the mean transverse momentum in ultra-central lead-lead (PbPb) collisions at $\sqrt{s_{\mathrm{NN}}} = 5.02\ \mathrm{TeV}$. The analysis yields $c_s^2 = 0.241 \pm 0.002\, (\mathrm{stat}) \pm 0.016\, (\mathrm{syst})$ at an effective temperature of $T_{\mathrm{eff}} = 219 \pm 8\, (\mathrm{syst})\,\mathrm{MeV}$, in good agreement with lattice-QCD calculations. Complementary studies in proton-lead (pPb) collisions are also presented to investigate possible quark-gluon plasma signatures in smaller systems.

arXiv Page | PDF

Score: 0

Cross-lingual robustness of LLM-brain alignment and its computational roots

Published: 2026-05-20 11:34:05

Authors: Ni Yang, Rui He, Philipp Homan, Iris Sommer, Davide Staub, Wolfram Hinzen

Categories: cs.CL

Abstract:
Large language models (LLMs) reliably predict neural activity during language comprehension, and transformer depth has been interpreted as mirroring hierarchical cortical organization. However, it remains unclear whether such alignment extends to subcortical regions, whether it overlaps spatially across languages, and what its computational roots are. Here, we used a multilingual, whole-brain encoding framework to examine brain-LLM alignment across three typologically distinct languages (Mandarin, English, and French) during naturalistic story listening. Our results show that, across languages, transformer-based models predicted activity across widely distributed cortical functional networks, such as the limbic, ventral attention, and default mode networks, as well as in subcortical structures. Spatial alignment patterns showed substantial cross-linguistic overlap and remained largely stable across model layers, with limited layer progression consistent with functional cortical hierarchies. Contrary to previous evidence, contextual embeddings did not outperform static embeddings. To test candidate computational explanations, we examined whether layer-wise brain scores reflect surprisal and intrinsic dimensionality, and thereby predictive processing and information compression. Neither of these two computational metrics mirrored neural alignment profiles. Our findings suggest that brain-LLM alignment is spatially robust and cross-linguistically stable but not explainable from predictive uncertainty or representational geometry. Rather than directly reflecting shared hierarchical computation, neural predictivity may primarily arise from distributed lexical-semantic correspondences that generalize across languages.

arXiv Page | PDF

Score: 0

Ergodic measures of intermediate entropies for $\mathbb{Z}^{d}$-action

Published: 2026-05-20 11:31:35

Authors: Yage Liu, Ercai Chen, Xiaoyao Zhou

Categories: math.DS

Abstract:
For dynamical systems satisfying the approximate $\mathbb{Z}^{d}$- or $\mathbb{Z}_+^{d}$-product property and asymptotic entropy expansiveness, we establish a precise description of the structure of their space of invariant measures. In particular, we prove that the set of ergodic measures with any given intermediate entropy is generic in certain natural subspaces. As a consequence, this result confirms Katok's conjecture on the existence of ergodic measures with arbitrary intermediate entropy for such systems.

arXiv Page | PDF

Score: 0

Towards transistor-based quantum computing

Published: 2026-05-20 11:25:17

Authors: Y. -D. Liu, X. Xu, Q. -R. Wang, D. -S. Wang

Categories: quant-ph, cond-mat.mtrl-sci, cond-mat.str-el, cs.AR

Abstract:
In this work, we propose and study in depth a universal quantum computing architecture based on a quantum construction of transistors. Our teleportation-based quantum transistors, called "telesistors", are ground states of systems with symmetry-protected topological order, and hence suppress certain types of noise and provide high-fidelity Clifford gates without the need for active error correction. This physical protection, quantified by the string order parameters, serves as a low-overhead foundation upon which conventional fault-tolerant encoding (e.g., with stabilizer codes) can be built to achieve universal quantum computation. This architecture shows rich connections with currently known architectures and offers some desirable merits, especially compared with qubit-based circuits, regarding modularity, integration, and program storage. Our study shows that it is plausible to realize it with current technology in the near future.

arXiv Page | PDF

Score: 0

Dynamic Video Generation: Shaping Video Generation Across Time and Space

Published: 2026-05-20 11:24:02

Authors: Shikang Zheng, Jingkai Huang, Jiacheng Liu, Guantao Chen, Lixuan, Yuqi Lin, Peiliang Cai, Linfeng Zhang

Categories: cs.CV

Abstract:
Diffusion models have achieved impressive performance in video generation, but their iterative denoising process remains computationally expensive due to the large number of tokens processed at each timestep. Recently, progressive resolution sampling has emerged as a promising acceleration approach by reducing latent resolution in early stages. However, scaling this idea to video generation remains challenging, as the additional temporal dimension introduces diverse spatio-temporal demands across different videos, and compressing only a single dimension often leads to limited acceleration or degraded quality. Therefore, we propose DVG, a Dynamic Video Generation framework that jointly allocates computation across time and space, automatically selecting content-aware acceleration strategies without manual tuning or retraining. DVG achieves near-lossless acceleration across models and tasks, reaching speedups of up to 7 times on HunyuanVideo and HunyuanVideo-1.5, and 18 times when combined with distillation, demonstrating its potential as a key component in today's large-scale efficient video generation systems. Our code is in the supplementary material and will be released on GitHub.

arXiv Page | PDF

Score: 0

The Quiet Path from Seemingly Minor Design Errors to Workplace AI Incidents

Published: 2026-05-20 11:13:45

Authors: Julia De Miguel Velázquez, Sanja Šćepanović, Andrés Gvirtz, Daniele Quercia

Categories: cs.HC

Abstract:
Recent human-computer interaction (HCI) research has revealed a widespread misalignment between how developers design workplace artificial intelligence (AI) systems and what workers actually need from them. Yet little research has examined the effects of this gap, or how it may cause harm. We analyzed 1,524 reports of incidents in which AI systems were used to perform 171 occupational tasks across 12 industry sectors. Using a Large Language Model (LLM)-as-an-expert approach, we extracted the main traits of the AI systems involved in those incidents using an established framework of twelve traits. We then compared them with the traits that 202 workers highly familiar with those tasks would have preferred. We found that as many as 83% of workplace incidents stem from worker-AI misalignments. In most cases, workers wanted systems that are precise, insightful, or personal, but instead received systems that are basic, simple, or general. Over the years, fast AI caused a considerable number of incidents, yet these declined, and imaginative AI, with the mass introduction of generative AI, started to cause incidents. We also compared the traits causing the incidents with the traits that 197 developers building AI systems for those tasks would have preferred. If the traits causing the incidents were the same as those designed by developers, then developers may be responsible for those incidents. We found that 74% of task misalignments could be attributed to developers who tended to overfocus on efficiency and speed, especially for systems performing tasks in people-facing occupations such as those in the human resources sector. Our results call for design interventions that better align AI development with workers' needs, as without such corrections, workplace AI incidents are likely to persist, causing the invisible erosion of worker agency and organizational productivity.

arXiv Page | PDF

Score: 0

Modeling and Control of a Pneumatic Morphing Soft Quadrotor based on the SOFA Framework for Dynamic Soft Robotic Simulation

Published: 2026-05-20 11:06:56

Authors: F. Labra Caso, V. Sumathy, P. Ferrentino, V. Vanderborght, J. Haluska, G. Nikolakopoulos

Categories: cs.RO

Abstract:
This article presents a novel SOFA-based finite element method for soft-body modeling and the corresponding dynamic simulation and control of a pneumatic morphing soft quadrotor. The proposed modeling preserves the physical interpretability and control structure of traditional quadrotor dynamics, while capturing the complex, time-varying behavior of pneumatically actuated soft arms. In SOFA, the soft pneumatically actuated arms are discretized as a tetrahedral mesh following an elastic material law that produces internal forces consistent with the real dynamic behavior of the body. Pneumatic actuation governed by both periodic and error-based control signals is applied within the internal cavities to analyze the morphing capability. Finally, a proportional-integral controller is proposed to study the controlled dynamic behavior and morphing capabilities of the pneumatic arm, wherein the pneumatic actuation of the soft arm is controlled to achieve the desired target position. The simulation results show the effectiveness of the proposed modeling framework and the related controller design.

arXiv Page | PDF

Score: 0

Beyond Text-to-SQL: An Agentic LLM System for Governed Enterprise Analytics APIs

Published: 2026-05-20 11:00:56

Authors: Gundeep Singh, Parsa Kavehzadeh, Jing Xia, Xue-Yong Fu, Julien Bouvier Tremblay, Md Tahmid Rahman Laskar, Vincent Lum, Shashi Bhushan TN

Categories: cs.CL, cs.AI

Abstract:
Enterprise analytics aims to make organizational data accessible for decision-making, yet non-technical users still face barriers when using traditional business intelligence tools or Text-to-SQL systems. While recent Text-to-SQL approaches based on Large Language Models (LLMs) promise natural language access to structured data, they fall short in enterprise settings where analytics pipelines rely on governed APIs rather than raw databases. In practice, these APIs encapsulate complex business logic to ensure consistency, auditability, and security. However, delegating mathematical or aggregation logic to an LLM introduces reliability and compliance risks. To this end, we present Analytic Agent, an LLM-based agentic system that translates natural language intents into secure interactions with enterprise analytics APIs. Evaluated on 90 real enterprise use cases constructed by domain experts, it reliably interprets user goals, validates permissions, executes governed queries, and generates compliant visualizations through multi-step reasoning and policy-aware orchestration.

arXiv Page | PDF

Score: 0

Microwave Linear Analog Computer (MiLAC)-Aided MIMO Radar Sensing: Transmit Beamforming Design and DoA Estimation

Published: 2026-05-20 10:53:37

Authors: Ziang Liu, Zheyu Wu, Bruno Clerckx

Categories: eess.SP, cs.IT

Abstract:
Multiple-input multiple-output (MIMO) radar has waveform diversity and large spatial degrees of freedom (DoFs), making it attractive for high-resolution sensing. Scaling MIMO radar to massive arrays can further improve sensing performance, but it also increases hardware cost, power consumption, and digital processing complexity. The microwave linear analog computer (MiLAC) can tackle these challenges by moving linear operations from the digital domain to the analog domain. MiLAC has shown promising benefits for communications in recent studies, and this paper identifies its potential for radar sensing. Specifically, we consider both MiLAC-aided transmit beamforming and receiver-side two-dimensional discrete Fourier transform (2D-DFT)-based direction-of-arrival (DoA) estimation. For transmit beamforming, we formulate a weighted Cramér-Rao bound (CRB) minimization problem under lossless and reciprocal MiLAC constraints and propose a penalty dual decomposition (PDD)-based iterative algorithm to address the non-convex problem. We further prove that MiLAC-aided and fully-digital beamforming achieve the same CRB. For receiver processing, we show that the 2D-DFT can be implemented by a lossless reciprocal MiLAC, which enables analog-domain DoA estimation without digital optimization. Numerical results confirm the theoretical finding and show that the MiLAC-aided approach achieves the same CRB and DoA estimation performance as the fully-digital benchmark. Meanwhile, hardware cost and power consumption are reduced because only low-resolution DACs are required at the transmitter, while RF chains and ADCs are eliminated at the receiver. Moreover, performing the 2D-DFT in the analog domain eliminates all digital DFT operations for DoA estimation.

arXiv Page | PDF

Score: 0

Reconstruction of Reionization Histories from 21 cm Power-Spectrum Evolution with Artificial Neural Networks

Published: 2026-05-20 10:46:27

Authors: Yu-Le Wang, Hayato Shimabukuro

Categories: astro-ph.CO

Abstract:
We investigate whether the redshift evolution of the fixed-$k$ dimensionless 21 cm power spectrum, $Δ^2_{21}(k, z)$, contains sufficient information to reconstruct reionization histories $x_{\mathrm{HI}}(z)$ with artificial neural networks. Using semi-numerical realizations generated within a restricted three-parameter 21cmFAST model family, we train a compact feed-forward network to learn the inverse mapping from power-spectrum trajectories to the neutral-fraction history over $6 \le z \le 15$. For $k = 0.1$, $0.5$, and $1.0\ h\ \mathrm{Mpc}^{-1}$, representative tests on an independent test set show that the midpoint redshift $z_{50}$ is recovered more accurately than the duration $Δz = z_{75} - z_{25}$: $z_{50}$ is reconstructed with MAE = 0.0046 and RMSE = 0.0100, whereas $Δz$ yields MAE = 0.0302 and RMSE = 0.0378. This result indicates that fixed-$k$ power-spectrum evolution carries stronger information about the timing of reionization than about the detailed width of the transition within the adopted prior. We further test an idealized foreground-free SKA1-Low-like thermal-plus-sample-variance noise model and find that the reconstruction remains stable in the favorable signal-to-noise regime considered here. These results demonstrate that neural networks can serve as a prior-dependent inverse mapping for reconstructing reionization histories from 21 cm power-spectrum evolution.
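
The two summary statistics and the two error metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes that $z_{25}$, $z_{50}$, and $z_{75}$ are the redshifts at which $x_{\mathrm{HI}}$ equals 0.25, 0.5, and 0.75, and that crossings are found by linear interpolation on a tabulated history; the abstract states neither convention, and the function names are hypothetical.

```python
import math


def crossing_redshift(z, x_hi, level):
    """Linearly interpolate the redshift at which x_HI crosses `level`.

    Assumes z is increasing and x_hi is non-decreasing in z
    (the gas is more neutral at higher redshift).
    """
    for i in range(len(z) - 1):
        lo, hi = x_hi[i], x_hi[i + 1]
        if hi > lo and lo <= level <= hi:
            frac = (level - lo) / (hi - lo)
            return z[i] + frac * (z[i + 1] - z[i])
    raise ValueError("history never crosses the requested level")


def midpoint_and_duration(z, x_hi):
    """Return (z50, delta_z) with delta_z = z75 - z25 (assumed convention)."""
    z25 = crossing_redshift(z, x_hi, 0.25)
    z50 = crossing_redshift(z, x_hi, 0.50)
    z75 = crossing_redshift(z, x_hi, 0.75)
    return z50, z75 - z25


def mae(pred, true):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def rmse(pred, true):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))
```

In such a setup, `midpoint_and_duration` would be applied to both the true and the reconstructed history of each test realization, and `mae` and `rmse` would then be taken over the test set.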

arXiv Page | PDF

Score: 0

Wartime Controls, Political Connections, and the Pricing of Zaibatsu Rents in Japan, 1930-1943

Published: 2026-05-20 10:45:19

Authors: Keiichi Morimoto, Akihiko Noda, Takenobu Yuki

Categories: econ.GN, q-fin.PR, q-fin.ST

Abstract:
This paper examines how wartime economic controls shaped stock-price formation in Japan from 1930 to 1943. We develop a four-portfolio asset-pricing model in which zaibatsu affiliation affects expected payoffs and the translation of valuations into economic scale through lower financing wedges. We then construct daily capitalization-weighted indices and four benchmark portfolios based on a two-by-two sort by zaibatsu affiliation and military orientation. Using a CAPM-AR(p)-SV event-study framework that allows for serial correlation and stochastic volatility, we show that the model rationalizes capitalization concentration, segmented abnormal returns, delayed cumulative adjustment, regime-risk insulation of zaibatsu portfolios, and zaibatsu-concentrated responses to embedded-rent or group-continuation shocks. The evidence is consistent not with a collapse of semi-strong efficiency, but with institutionally contingent efficiency: stock prices continued to respond to news while capitalizing uneven access to credit, materials, and procurement.

arXiv Page | PDF

Score: 0

Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy

Published: 2026-05-20 10:43:17

Authors: Ishaan Kelkar, Nebras Alam, Vikram Kakaria, Madhur Panwar, Vasu Sharma, Maheep Chaudhary

Categories: cs.AI, cs.CL, cs.LG

Abstract:
We study the effect of different personas on sycophancy: a model's agreement with users even when the user is incorrect. The standard mitigation, Contrastive Activation Addition (CAA), derives a steering direction from labelled pairs of sycophantic and honest responses. This study evaluates whether off-the-shelf persona steering vectors, originally developed for general role-playing and not trained on sycophancy data, can serve as an alternative. In two instruction-tuned models, steering toward personas characterised by doubt or scrutiny reduces sycophancy to approximately $68\%$ and $98\%$ of CAA's effect, and, unlike CAA, maintains accuracy when the user is correct. The effect is also asymmetric: steering toward agreeable personas does not produce a mirror increase in sycophancy. Geometrically, the persona vector is largely independent of the direction of sycophancy in activation space. Collectively, these findings suggest that sycophancy is better understood as a persona-level property rather than a single steerable direction. We release our code here: https://anonymous.4open.science/r/Sycophancy-Steering-9DF0/.
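
As a rough illustration of the mechanics behind this abstract, the sketch below computes a contrastive steering direction as the mean difference of paired activations, adds a scaled direction to a hidden state, and compares two directions by cosine similarity. Activations are plain Python lists here; the function names, the scaling factor `alpha`, and the use of cosine similarity as the geometric check are assumptions for illustration, not details taken from the paper.

```python
import math


def steering_vector(pos_acts, neg_acts):
    """Mean of (positive - negative) over paired activation vectors."""
    n = len(pos_acts)
    dim = len(pos_acts[0])
    return [
        sum(p[d] - q[d] for p, q in zip(pos_acts, neg_acts)) / n
        for d in range(dim)
    ]


def steer(hidden, direction, alpha):
    """Add a scaled steering direction to one hidden-state vector."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


def cosine(u, v):
    """Cosine similarity; values near 0 indicate nearly orthogonal directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

A cosine close to zero between a persona vector and a sycophancy direction would correspond to the "largely independent" geometry described above.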

arXiv Page | PDF

Score: 0

Convergence Analysis of Evolution Strategies for Mixed-Integer Optimization

Published: 2026-05-20 10:38:12

Authors: Ryoki Hamano, Kento Uchida, Shinichi Shirakawa

Categories: cs.NE

Abstract:
Mixed-integer extensions of evolution strategies (ES) that discretize selected coordinates of sampled continuous vectors often impose a lower bound on the standard deviation of integer variables to prevent premature convergence. While these methods show promising empirical results, this handling can slow the convergence of continuous variables, and its impact has lacked a clear theoretical account. In this paper, we provide a convergence analysis of evolution strategies for mixed-integer optimization, inspired by the drift analysis of the (1+1)-ES in the continuous domain. Specifically, we consider two (1+1)-ES variants for mixed-integer domains: (1+1)-LB-ES, which introduces a lower bound on the standard deviation for integer variables, and (1+1)-LUB-ES, which combines both lower and upper bounds to enhance the convergence of the continuous variables. Focusing on the optimization phase after the integer variables have been optimized, we rigorously analyze their convergence behavior on a benchmark function designed for mixed-integer domains. Our results show that (1+1)-LB-ES can suffer from premature convergence when the number of integer variables is large, while (1+1)-LUB-ES achieves linear convergence under suitable parameter settings. These findings provide theoretical insights into the impact of integer handling on convergence performance and guidance for the design of mixed-integer ES.
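
The kind of integer handling discussed in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the (1+1)-LB-ES analyzed in the paper: it uses a sphere objective, rounds the first `n_int` coordinates before evaluation, adapts a single step size with a 1/5-success-style rule, and clamps the step size used for integer coordinates from below by `sigma_int_min`. All names and constants are hypothetical.

```python
import math
import random


def encode(x, n_int):
    """Round the first n_int coordinates to integers; keep the rest continuous."""
    return [float(round(v)) for v in x[:n_int]] + list(x[n_int:])


def sphere(x):
    """Simple test objective (assumption; not the paper's benchmark function)."""
    return sum(v * v for v in x)


def one_plus_one_lb_es(x0, n_int, sigma0=1.0, sigma_int_min=0.5, iters=500, seed=0):
    """Elitist (1+1)-ES with a lower-bounded step size on integer coordinates."""
    rng = random.Random(seed)
    n = len(x0)
    parent = list(x0)
    f_parent = sphere(encode(parent, n_int))
    sigma = sigma0
    for _ in range(iters):
        child = []
        for i, v in enumerate(parent):
            # Integer coordinates never sample with a step size below the bound.
            step = max(sigma, sigma_int_min) if i < n_int else sigma
            child.append(v + step * rng.gauss(0.0, 1.0))
        f_child = sphere(encode(child, n_int))
        if f_child <= f_parent:
            parent, f_parent = child, f_child
            sigma *= math.exp(0.8 / n)   # grow the step size on success
        else:
            sigma *= math.exp(-0.2 / n)  # shrink on failure (balanced at 1/5 success)
    return encode(parent, n_int), f_parent, sigma
```

Because the integer coordinates keep a minimum step size, a mutation can still flip an already-optimal integer, which is one intuitive way in which such a bound could slow progress on the continuous coordinates.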

arXiv Page | PDF

Score: 0

Weighted Uniform Endpoint Majorants for Integrals Involving Modified Bessel Functions

Published: 2026-05-20 10:14:17

Authors: Yaoran Yang, Yutong Zhang

Categories: math.CA

Abstract:
We give an affirmative full-range solution to Gaunt's 2019 Open Problem 2.10. The problem asks whether, for every \(ν>-1/2\) and \(0<γ<1\), the reciprocal-power integral \(\int_0^x e^{-γt}I_ν(t)t^{-ν}\,\mathrm{d}t\) is bounded by a constant multiple of \(e^{-γx}I_{ν+1}(x)x^{-ν}\), uniformly for all \(x>0\). Earlier exponential-tilt estimates proved such endpoint majorants only under an additional smallness condition on \(γ\). We prove the estimate throughout the natural range \(0<γ<1\), with an explicit admissible constant. More generally, if \(μ>-1\), \(q>-1\), \(0<γ<1\), and \(w(x)x^{-q}\) is nondecreasing on \((0,\infty)\), then for every \(θ\in(γ,1)\), \(\int_0^x e^{-γt}w(t)t^{-μ}I_μ(t)\,\mathrm{d}t\) is controlled by an explicit multiple of \(e^{-γx}w(x)x^{-μ}I_{μ+1}(x)\). The case \(w\equiv1\), \(q=0\), and \(μ=ν\) resolves Gaunt's problem. The weighted theorem also yields shifted-order and moment estimates, applies to approximate power weights and monotone regularly varying amplitudes, and provides two-sided estimates under a reversed comparison. We further analyze the sharp power-weighted quotient via endpoint expansions, a stationary equation, and parameter monotonicity.

arXiv Page | PDF

Score: 0

Diagnosing Overhead in Dispatch Operations: Cross-architecture Observatory

Published: 2026-05-20 10:14:00

Authors: Bole Ma, Jan Eitzinger, Harald Koestler, Gerhard Wellein

Categories: cs.DC, cs.AI, cs.LG

Abstract:
AlltoAll dispatch is the dominant bottleneck of MoE expert parallelism, and the interconnect community has responded with four families of mitigations: predictive sample placement, adaptive expert relayout, hierarchical collectives, and EP-aware topology. All four rest on two assumptions about the workload. The first is that routing imbalance is correctable by the system layer. The second is that the mock-token benchmarks evaluating them faithfully represent production routing. We introduce DODOCO to test both assumptions. We instrument five MoE checkpoints spanning five sequence-mixer designs (DeepSeek-V2-Lite MLA, DeepSeek-MoE-16B MHA, Qwen3-30B GQA, Nemotron-30B Mamba-2, Qwen3.5-35B GDN) under a 5 by 6 grid of data conditions plus a matched EP scan from 4 to 32 ranks on H100s; both assumptions fail. Scaling EP changes the per-expert max/mean token ratio by at most 5% within every architecture's measurable range: the straggler is intrinsic to the routing decision the model makes, not to how its experts land on ranks. Mock tokens overestimate routing Gini by up to a factor of 2.35 and fabricate a batch-size scaling trend that vanishes the moment real text replaces random IDs. A third pattern, unexpected, emerges from the same matrix: the five architectures cleave into two stable bands. MHA and Mamba-2 (data-resilient) drop to Gini 0.105 and 0.150 on wikitext. MLA and GDN (persistently concentrated) stay above 0.24 on every real-text condition and reach 0.29 to 0.38 on mock. GQA is the intermediate case. These bands, not the EP degree or the mock-data profile, are the right workload input to AlltoAll-aware interconnect and dispatch design.
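
The two load-imbalance statistics quoted in this abstract can be computed from a vector of per-expert token counts. The sketch below assumes the standard Gini coefficient over those counts and the plain ratio of the busiest expert's load to the mean load; the abstract does not spell out its exact definitions, and the function names are hypothetical.

```python
def gini(counts):
    """Gini coefficient of non-negative per-expert token counts (0 = balanced)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Closed form over the sorted values, using 1-based ranks.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def max_mean_ratio(counts):
    """Load of the busiest expert relative to the mean load (1.0 = no straggler)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean
```

For example, `gini([1, 1, 1, 1])` is 0 for a perfectly balanced router, while routing every token to one of four experts gives the maximum value for that size, 0.75, together with a max/mean ratio of 4.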

arXiv Page | PDF

Score: 0

Equilibrium and dynamics of a three-state opinion model on a network of networks

Published: 2026-05-20 10:09:05

Authors: Irene Ferri, Albert Díaz-Guilera, Hiroki Sayama

Categories: physics.soc-ph, cond-mat.stat-mech

Abstract:
Opinion formation models typically represent each individual as a single variable. However, in practice each individual holds interconnected beliefs whose internal organization may influence collective outcomes. To explore this dependence, we study a three-state opinion model on a network of networks in which each agent has an internal belief graph and interacts with other agents through an external social graph. Each belief can take two opposite polarized states or a neutral one, and a neutrality parameter tunes the relative conviction of the neutral stance. We incorporate temperature into the model to account for external social agitation and for the tolerance of internal cognitive dissonance. We explore the stationary state and dynamics of the model using analytical approaches and Monte Carlo simulations on a fully connected external social graph, with internal belief topologies given by one-dimensional chains, cliques, and star-like structures in which all other beliefs are connected to a central core belief. We find that the critical temperature at which the polarized consensus destabilizes increases with the addition of more beliefs to star-like agents but saturates in the case of ring- and clique-like internal topologies. We also consider binary mixtures of agents with different internal topologies in equal proportions, showing that the interplay between agents is regime-dependent, with the dominant topology depending on the value of the neutrality parameter.

arXiv Page | PDF

Score: 0

Parallel Context Modeling for Sliding Window Attention in Neural Video Coding

Published: 2026-05-20 10:06:23

Authors: Alexander Kopte, André Kaup

Categories: eess.IV

Abstract:
Most neural video codecs rely on temporal conditioning, which makes them susceptible to error propagation over long sequences. While Transformer-based architectures like the VCT offer a drift-free alternative, they suffer from high computational complexity and inferior RD performance. The recent SWA addresses these shortcomings by reducing complexity and enhancing RD performance, yet it restricts decoding to a strictly sequential raster-scan order, creating a critical bottleneck in decoding latency. To resolve this, we propose P-SWA, utilizing diagonal wavefronts to enable parallel decoding. By embedding a hyperprior and introducing an accumulator to fuse side information and local spatial context, our method increases decoding speed by 36% over the parallel VCT. Simultaneously, it achieves Bjøntegaard Delta-rate savings of up to 10.0% for I-frames and 7.1% for P-frames over the SWA baseline.
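
The idea of replacing a raster scan with diagonal wavefronts can be sketched as a scheduling function. This is an illustration only, assuming that each latent position depends solely on its left and top neighbours, so that all positions on one anti-diagonal can be decoded together; the actual dependency pattern and schedule of P-SWA may differ, and the function name is hypothetical.

```python
def diagonal_wavefronts(height, width):
    """Group (row, col) positions by anti-diagonal; each group is one decoding step."""
    steps = [[] for _ in range(height + width - 1)]
    for row in range(height):
        for col in range(width):
            # Left (row, col-1) and top (row-1, col) neighbours lie on diagonal
            # row + col - 1, so they are already decoded when this step runs.
            steps[row + col].append((row, col))
    return steps
```

Under this assumption, a grid of `height * width` positions needs `height + width - 1` sequential steps instead of `height * width`, e.g. 6 steps rather than 12 for a 3 by 4 grid.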

arXiv Page | PDF

Score: 0

Dawn of the Milky Way disk: Determination of when a rotationally supported disk appears and dating the spin-up of the disk

Published: 2026-05-20 09:44:23

Authors: Sofia Feltzing, Diane Feuillet, Thomas Bensby

Categories: astro-ph.GA

Abstract:
Spiral galaxies, like the Milky Way, transform at some point in time into a rotationally supported system. Using an extant data set consisting of 319 835 sub-giants from LAMOST with precise ages from the literature, we determine, for the first time, the age when the Milky Way disk spins up, i.e. when the mean circular velocity changes from halo-like to disk-like. We find, in concordance with previous studies, that the spin-up takes place for -1.25 < [Fe/H] < -0.9, and we can date this transition to a mean age of 12.1 +/- 2.8 Gyr (median age 12.4 Gyr). We further study when the disk became rotationally supported, i.e. when the ordered, disky motion dominates over the random motions. We find that this happens for -1.25 < [Fe/H] < -1. The transition is very rapid in age. This supports the view that the spin-up seen in this and other works genuinely traces the transition to a rotationally supported disk, which has not previously been shown. These transitions are traced by the high-alpha stars, while the low-alpha stars do not spin up but start directly at approximately the circular velocity seen for the Sun today. The low-alpha disk is rotationally supported with no transition period in [Fe/H] or in age.

arXiv Page | PDF

Score: 0

Boundaries of Siegel Disks for Conservative Systems

Published: 2026-05-20 09:41:50

Authors: F. M. Tangerman

Categories: math.DS, math.CV

Abstract:
In this paper, we study a particular conservative standard map in complex dimension 2. In this example, Siegel disks can be visualized and analyzed numerically as to the smoothness of their boundaries. We formulate and numerically support some conjectures.

arXiv Page | PDF

Score: 0

Bridging Structure and Language: Graph-Based Visual Reasoning for Autonomous Road Understanding

Published: 2026-05-20 09:28:06

Authors: Lena Wild, Katie Z Luo, Marco Pavone

Categories: cs.CV

Abstract:
Structured road understanding of lane geometry, topology, and traffic element relationships is foundational to safe autonomous driving. While vision-language models (VLMs) offer promising semantic flexibility, they lack the geometric and relational grounding required for precise road reasoning. Conversely, traditional modular systems, e.g., HD maps and topological road graphs, provide structural precision but remain semantically rigid. To bridge this gap, we introduce the Combined Road Substrate (CRS), a graph-grounded framework that makes geometric road structure and open-vocabulary semantics jointly executable in a single representation. CRS enables the automatic generation of compositionally complex and linguistically varied question-answer pairs via recursive graph queries, augmented with a "grounding for free" mechanism that ensures logical traceability to specific map elements, and procedurally extracted chain-of-thought supervision traces. We demonstrate that state-of-the-art VLMs - including large, closed-source models - struggle significantly with structured road reasoning, yet training a small 2- or 4-billion-parameter model with as few as 20 to 80 CRS-enriched scenes yields stable gains in compositional reasoning tasks of varying depth. Analysis of model behavior via verifiable reasoning traces reveals a systematic shift in failure modes: whereas baseline models fail at relational scene understanding, CRS-trained models reduce failures to attribute recognition, suggesting that the primary bottleneck in road understanding is not model scale, but the absence of structured supervision.

arXiv Page | PDF

Score: 0

WiXus: A Wheeled-Legged Robot with Wire-Driven Environmental Utilizing to Integrate Mobility and Manipulation

Published: 2026-05-20 09:19:48

Authors: Shintaro Inoue, Kento Kawaharazuka, Temma Suzuki, Sota Yuzaki, Kei Okada

Categories: cs.RO

Abstract:
Wheeled-legged robots, which have wheels at their feet and achieve high mobility by coordinating wheel drive and leg drive, have been developed. These robots have been developed purely as platforms specialized for locomotion. Therefore, they have no means of repurposing their legs for roles other than locomotion, such as object manipulation or tool utilization. In this paper, we address the problem of how to draw out the potential task-execution capability of the legs by freeing them from the role of locomotion through external body support. To this end, we propose and develop a new robot, WiXus, which fuses a wheeled-legged mechanism with a wire-driven mechanism that utilizes the external environment. The developed WiXus demonstrates not only planar locomotion with wheeled-legged drive, but also three-dimensional mobility such as cliff climbing by coordinating wire-driven and wheeled-legged actuation. Furthermore, by suspending the body with wire-driven actuation, WiXus successfully repurposes its legs as arms to perform object manipulation (e.g., rescuing a dog (stuffed animal)) and tool utilization (e.g., harvesting an apple (mockup) with loppers). This study demonstrates that the approach of utilizing the environment with wire-driven actuation is a new design principle that extends the operational domain of wheeled-legged robots.

arXiv Page | PDF

Score: 0

ParaCell: Paravirtualized Secure Containers with Lightweight Intra-Container Isolation and Intent-Driven Memory Management

Published: 2026-05-20 08:53:35

Authors: Yiyang Wu, Xunjie Wang, Jinyu Gu, Haibo Chen

Categories: cs.OS

Abstract:
Secure containers isolate each container with its own kernel, mitigating shared-kernel attacks prevalent in traditional container systems. However, existing designs still face a fundamental isolation-performance trade-off. Nested-cloud deployments amplify the cost of VM exits and page-table management, while emerging agentic workloads expose bursty memory demand that requires fine-grained elasticity. We attribute this trade-off to two root causes. First, existing designs lack lightweight intra-container isolation primitives for frequent container user-kernel transitions. Second, the host treats container memory management as opaque, forcing reactive secondary faults and coarse-grained huge page mappings to amortize their cost. This paper presents ParaCell, a paravirtualized secure container runtime built on two insights. First, intra-address-space hardware protection primitives can provide lightweight intra-container isolation. ParaCell uses MPK-based XGates to isolate the container user and container kernel within a single address space, turning frequent user-kernel transitions into direct domain switches. Second, container kernel allocators already encode memory-management intent. ParaCell introduces Pager to interpose on allocation and free events, batch proactive GPA-to-HPA bindings and unbindings, and avoid reactive shadow page-table faults while preserving fine-grained memory elasticity. ParaCell is implemented as a drop-in replacement for RunV. Our experiments demonstrate that, across traditional cloud and emerging agent applications, ParaCell reduces latency by up to 57% and 79% over PVM, and by up to 33% and 88% over RunV, in bare-metal and nested setups, respectively. On agent workloads, ParaCell saves up to 35.6% memory compared with the state-of-the-art VM memory reclamation technique, HyperAlloc.

arXiv Page | PDF

Score: 0

HDMoE: A Hierarchical Decoupling-Fusion Mixture-of-Experts Framework for Multimodal Cancer Survival Prediction

Published: 2026-05-20 08:31:09

Authors: Huayi Wang, Haochao Ying, Yuyang Xu, Qiyao Zheng, Jun Wang, Cheng Zhang, Ying Sun, Jian Wu

Categories: cs.CV

Abstract:
Multimodal survival prediction, a crucial yet challenging task, demands the integration of multimodal medical data (e.g., Whole Slide Images (WSIs) and genomic profiles) to achieve accurate prognostic modeling. Given the inherent heterogeneity across modalities, the feature decoupling-fusion paradigm has emerged as a dominant approach. However, these methods have the following shortcomings: (1) they fail to reduce the redundant information in modality features before decoupling, which degrades both feature decoupling and fusion; (2) they lack the ability to model fine-grained relationships among features and to capture local information interactions between intra- and inter-modality features. To address these issues, we propose a Hierarchical Decoupling-Fusion Mixture-of-Experts (HDMoE) framework with two levels of MoE and Random Feature Reorganization (RFR) modules. In the first-level MoE, shared experts and routed experts are employed to remove redundant information and extract fine-grained specific features within each modality, while the second-level MoE facilitates fine-grained inter-modality feature decoupling. In addition, we design two RFR modules, one following each level of MoE, to finely fuse intra- and inter-modality features, which helps the model capture more fine-grained relationships between modalities. Extensive experimental results on our private Liver Cancer (LC) dataset and three public TCGA datasets confirm the effectiveness of our proposed method. Code is available at https://github.com/ZJUMAI/HDMoE.

arXiv Page | PDF

Score: 0

The TNG50-SKIRT Atlas: Multi-wavelength nonparametric galaxy morphology

Published: 2026-05-20 08:25:46

Authors: Sena Bokona Tulu, Maarten Baes, Angelos Nersesian, Tolu Biressa, Vicente Rodriguez-Gomez, Andrea Gebek, Marco Martorano, Abdissa Tassama Emana

Categories: astro-ph.GA

Abstract:
Context: Galaxy morphology is a fundamental property to describe galaxy evolution. However, the observed morphology of a particular galaxy may depend on the observed wavelength. Aims: Our aim is to investigate the wavelength dependence and the effect of dust attenuation on nonparametric morphology indicators. Methods: We use the TNG50-SKIRT Atlas, an atlas of synthetic UV to near-infrared (NIR) broadband images for a complete stellar-mass-selected sample of 1154 galaxies extracted from the TNG50 cosmological simulation at $z = 0$. For each image, we calculate four nonparametric morphology indicators using the StatMorph code. Results: We find that the known correlations between the stellar mass and the morphological parameters measured in the optical, together with the Gini-$M_{20}$, concentration-Gini, and concentration-$M_{20}$ planes, are fully consistent with observational data. However, we also find that nonparametric morphological indicators change significantly with wavelength and that this wavelength dependence is stronger for disc-dominated than for bulge-dominated galaxies. The wavelength dependence of the morphology of our simulated TNG50 galaxies is consistent with measurements of local galaxies from the SINGS survey. We demonstrate that the effect of dust attenuation on nonparametric morphology indicators is modest across the full galaxy population but can be significant for individual galaxies.

arXiv Page | PDF

Score: 0

Relativistic Scattering in the Funnel of Cygnus X-3

Published: 2026-05-20 08:19:52

Authors: Suraj K. Chaurasia, Ranjeev Misra, Amit Pathak

Categories: astro-ph.HE

Abstract:
Cygnus X-3 presents significant challenges to standard accretion models. Recent polarimetric observations by IXPE reveal high polarization degrees (PD) in the hard state ($\sim 23\%$) and unexpectedly significant polarization in the soft state ($\sim 12\%$), which are difficult to reconcile with static scattering models at low inclination ($i \approx 30^\circ$). We present a relativistic scattering model within a funnel-shaped geometry that resolves this discrepancy. We show that a single funnel-outflow configuration with variable bulk velocity $\beta$ can reproduce both polarization states, with lower velocities ($\beta \approx 0$) yielding $\sim 12\%$ polarization (soft state) and mildly relativistic velocities ($\beta \lesssim 0.4$) producing $\sim 23\%$ polarization (hard state) at $i \approx 30^\circ$ for half funnel opening angles of $\sim 13^\circ$-$16^\circ$. Relativistic aberration modifies the effective scattering angle in the comoving frame, enhancing polarization in the hard state while recovering the static limit in the soft state. The model also yields a consistent estimate of the intrinsic luminosity, of order $\sim 10^{40}$ erg s$^{-1}$, supporting a super-Eddington interpretation. This framework provides a unified explanation of the observed polarization properties of Cygnus X-3.

arXiv Page | PDF

Score: 0

Down going muon rate monitoring in the ANTARES detector

Published: 2026-05-20 08:19:10

Authors: K. Gracheva, M. Anghinolfi, V. Kulikovskiy, E. Shirokov, Y. Yakovenko

Categories: astro-ph.IM, astro-ph.HE

Abstract:
Large underwater telescopes have been proposed as a challenging method to measure high-energy neutrinos from astrophysical objects. In recent years, the ANTARES collaboration has designed and realized the first detector of this type in the Mediterranean Sea. Muon tracks produced by neutrino interactions in the surrounding medium are reconstructed from the arrival times and the numbers of photo-electrons of the Cherenkov light measured by the detector's photomultiplier tube (PMT) array. To provide sufficient statistics, the events from various periods in the year must be summed together, taking into account the various environmental conditions and detector configurations. In this note we describe effective criteria to group compatible runs based on the effective number of active PMTs in each run.

arXiv Page | PDF

Score: 0

NeighborDiv: Training-free Zero-shot Generalist Graph Anomaly Detection via Neighbor Diversity

Published: 2026-05-20 08:16:13

Authors: Kaifeng Wei, Teng Liu, Liang Dong, Xiubo Liang, Yuke Li

Categories: cs.LG

Abstract:
Graph Anomaly Detection (GAD) is increasingly shifting to Generalist GAD (GGAD) for cross-domain "one-for-all" detection, but existing GGAD methods predominantly rely on the neighbor consistency principle, falling into the Node-to-Neighbor Consistency Paradigm for anomaly quantification. These methods suffer from complex training pipelines, heavy training data dependency, high computational costs, and unstable cross-domain generalization. To address these limitations, we propose NeighborDiv, a training-free generalist graph anomaly detection framework based on neighbor diversity. Departing from the dominant Node-to-Neighbor Consistency Paradigm, we shift the focus to the Neighbor-to-Neighbor Diversity Paradigm, and uncover that the internal structural dispersion of a node's neighbor set is a powerful, independently discriminative anomaly signal. We quantify neighbor diversity via the variance of inter-neighbor feature similarities, which captures how a node organizes its local graph environment, and operates independently of conventional node-to-neighbor consistency frameworks. Extensive experiments under two standard GGAD evaluation paradigms show NeighborDiv achieves state-of-the-art performance, with relative gains of 10.25% in average AUC and 17.78% in average AP over the second-best baseline under Single-Domain Independent Training (SDIT), and 6.89%/9.58% in AUC/AP under Unified Multi-Domain Training (UMDT), respectively. Notably, NeighborDiv yields zero performance volatility across all datasets, eliminating training-set dependency and establishing a lightweight and highly practical GGAD framework.
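The quantity named in the abstract, the variance of inter-neighbor feature similarities, can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the function name neighbor_diversity, the adjacency-list input, and the convention of returning 0.0 for nodes with fewer than two neighbors are all assumptions made for illustration. The abstract also does not say how this value is turned into an anomaly score, so the sketch stops at the raw variance.

```python
import numpy as np

def neighbor_diversity(features, neighbors):
    """Variance of pairwise similarities among one node's neighbors.

    features:  (num_nodes, dim) array of node features.
    neighbors: list of neighbor indices for the node of interest.
    Cosine similarity is an assumption; the abstract only says "feature similarities".
    """
    if len(neighbors) < 2:
        # No neighbor pair exists, so the variance is undefined; 0.0 is an assumed convention.
        return 0.0
    x = np.asarray(features, dtype=float)[list(neighbors)]
    # Normalize rows so that the dot product equals cosine similarity.
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)
    sim = x @ x.T
    # Keep each unordered neighbor pair once (strict upper triangle).
    rows, cols = np.triu_indices(len(neighbors), k=1)
    return float(np.var(sim[rows, cols]))

# Toy graph: node 0 has neighbors 1, 2, 3 with hypothetical 2-D features.
toy_features = np.array([[0.5, 0.5], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(neighbor_diversity(toy_features, [1, 2, 3]))
```

Because the computation uses only the features and the adjacency structure, it needs no training, which matches the training-free framing in the abstract.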

arXiv Page | PDF

Score: 0

Robustness Analysis of USmorph: II. Optimizing Feature Extraction, Dimensionality Reduction, and Clustering for Unsupervised Galaxy Morphology Classification

Published: 2026-05-20 08:08:17

Authors: Guanwen Fang, Xiaolei Yin, Yirui Zheng, Zesen Lin, Shiwei Zhu, Jie Song, Chichun Zhou, Xu Kong

Categories: astro-ph.GA

Abstract:
We conduct a systematic robustness analysis of the unsupervised machine learning module within the hybrid framework USmorph. This module automatically discovers morphological structures from large-scale galaxy images, forming the foundation of the complete classification workflow. We evaluate five pre-trained models for feature extraction and identify an ImageNet-pretrained AlexNet as the most effective for capturing discriminative morphological features. UMAP is chosen for dimensionality reduction due to its optimal balance between preserving high-dimensional structure and computational efficiency. To enhance clustering stability, we propose a Bagging-based multi-cluster voting scheme, which significantly improves label consistency and cluster purity. We compare the convergence, scalability, and quality of five clustering algorithms, finding that the Bagging voting scheme has the best performance with the combination of K-means, Birch, and Agg. A bagging clustering number of $K=16$ is used to achieve the optimal balance between classification granularity and manual validation efficiency. Our tests show that: (1) the t-distributed stochastic neighbor embedding (t-SNE) reveals clear, compact cluster boundaries in low-dimensional space with strong feature separability; (2) the morphology classification results align with galaxy evolution theory, showing physically plausible distributions of different types in parameter space. These results demonstrate the technical robustness and scientific credibility of USmorph, establishing it as a reliable method for automated morphological classification in future large-scale surveys such as the China Space Station Telescope (CSST) mission.

arXiv Page | PDF

Score: 0

Impact of matter effects on the unitarity test of lepton mixing

Published: 2026-05-20 07:48:29

Authors: Ryuichiro Kitano, Joe Sato, Sho Sugama

Categories: hep-ph, hep-ex

Abstract:
Testing the unitarity of the lepton mixing matrix, in a manner analogous to the unitarity tests of the CKM matrix in the quark sector, is an important step toward probing physics beyond the standard three-generation framework. In long baseline neutrino oscillation experiments, the formula of the oscillation probabilities can be written as a sum of terms with various combinations of the mixing-matrix elements, and their coefficients depend differently on energy. By observing the spectral information of long baseline experiments such as T2HK and a future neutrino factory at J-PARC with a $\nu_e$ beam, the elements of the mixing matrix can be extracted without assuming a specific parametrization of the mixing matrix. We investigate how such an extraction method can be applied to neutrino oscillations by taking into account matter effects, and discuss how one can test unitarity of the mixing matrix in future long baseline experiments. As a concrete example, we examine the unitarity test by using a four-generation model, where we look at a quantity which should be vanishing in a unitary model. Among possible combinations of measurements, the most powerful test can be provided from the energy spectra of the CP-conjugate appearance channels $\nu_\mu \to \nu_e$ and $\bar{\nu}_\mu \to \bar{\nu}_e$ at T2HK, as well as from the T-conjugate pair $\nu_\mu \to \nu_e$ and $\nu_e \to \nu_\mu$ available at neutrino factories.

arXiv Page | PDF

Score: 0

SEABAD: A Tropical Bird Activity Detection Dataset for Passive Acoustic Monitoring

Published: 2026-05-20 07:44:39

Authors: Muhammad Mun'im Ahmad Zabidi, Mohd Yamani Idna Idris, Norisma Idris

Categories: cs.SD, eess.AS

Abstract:
Passive acoustic monitoring (PAM) enables large-scale biodiversity assessment, but continuous recording generates large amounts of non-informative audio, creating challenges for storage, power consumption, and long-term edge deployment. Bird audio detection (BAD), which identifies bird vocalizations, can reduce this burden by filtering irrelevant recordings before downstream analysis. However, most BAD systems are trained on temperate datasets despite tropical soundscapes being denser, more species-rich, and acoustically unpredictable. To address this gap, we introduce SEABAD (Southeast Asian Bird Activity Detection), a dataset of 50,000 curated three-second clips from Southeast Asian soundscapes, evenly balanced between bird-present and bird-absent samples. The dataset spans 1,677 bird species and is standardized to 16 kHz mono audio for embedded and low-power inference. We developed a dual-branch curation pipeline: a six-stage positive-label workflow applied to Xeno-Canto recordings, alongside six source-specific negative-label extractions from environmental datasets. These procedures reduced class imbalance by 13.7% (Gini coefficient: 0.601 to 0.519). A manual audit of 1,000 positive clips confirmed 97.8% +/- 0.9% labeling accuracy. Baseline experiments using MobileNetV3-Small achieved 99.57% +/- 0.25% accuracy and 0.9985 +/- 0.0002 AUC across three random seeds. SEABAD and the full curation pipeline are publicly released to support tropical BAD research and energy-efficient acoustic monitoring.
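The Gini coefficient quoted in the abstract is a standard inequality measure, and a small sketch shows how such a figure can be computed from class counts. The abstract does not state which distribution the coefficient was computed over, so applying it to per-species clip counts is an assumption, and the helper names gini and relative_reduction are hypothetical. The sketch uses the usual sorted-rank formula for non-negative values.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (0 = perfectly balanced)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank form: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

def relative_reduction(before, after):
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before

# Hypothetical clip counts per class, before and after rebalancing.
print(gini([1, 2, 3, 50]), gini([10, 12, 14, 20]))
# Relative change between the two rounded coefficients quoted in the abstract.
print(relative_reduction(0.601, 0.519))
```

With the rounded coefficients from the abstract, the relative reduction evaluates to roughly 13.6%, close to the reported 13.7%; the small difference would be explained if the reported figure was computed from unrounded coefficients, which the abstract does not confirm.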

arXiv Page | PDF

Score: 0