arXiv:2512.11829v1 Announce Type: new
Abstract: Adaptive behavior in volatile environments requires agents to switch among value-control regimes across latent contexts, but maintaining separate preferences, policy biases, and action-confidence parameters for every situation is intractable. We introduce value profiles: a small set of reusable bundles of value-related parameters (outcome preferences, policy priors, and policy precision) assigned to hidden states in a generative model. As posterior beliefs over states evolve trial by trial, effective control parameters arise via belief-weighted mixing, enabling state-conditional strategy recruitment without requiring independent parameters for each context. We evaluate this framework in probabilistic reversal learning, comparing static-precision, entropy-coupled dynamic-precision, and profile-based models using cross-validated log-likelihood and information criteria. Model comparison favors the profile-based model over simpler alternatives (about 100-point AIC differences), and parameter-recovery analyses support structural identifiability even when context must be inferred from noisy observations. Model-based inference further suggests that adaptive control in this task is driven primarily by modulation of policy priors rather than policy precision, with gradual belief-dependent profile recruitment consistent with state-conditional (not purely uncertainty-driven) control. Overall, reusable value profiles provide a tractable computational account of belief-conditioned value control in volatile environments and yield testable signatures of belief-dependent control and behavioral flexibility.
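To make the core mechanism concrete, below is a minimal sketch of belief-weighted profile mixing under the setup the abstract describes: each hidden state is assigned one of a few reusable value profiles (outcome preferences C, policy prior E, policy precision gamma), and effective control parameters are the posterior-belief-weighted average of the assigned profiles. All names (ValueProfile, mix_profiles, policy_posterior) are illustrative rather than from the paper, and the softmax policy rule follows the standard active-inference convention q(pi) ∝ exp(ln E − gamma·G), which is an assumption about the paper's exact formulation.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class ValueProfile:
    C: np.ndarray   # log-preferences over outcomes (would enter the policy cost G)
    E: np.ndarray   # prior bias over policies (unnormalized)
    gamma: float    # policy precision (action confidence)

def mix_profiles(q_s, assignment, profiles):
    """Belief-weighted mixing: average profile parameters under q(s).

    q_s        : posterior beliefs over hidden states, shape (n_states,)
    assignment : profile index assigned to each hidden state, shape (n_states,)
    profiles   : list of ValueProfile, typically far shorter than n_states
    """
    # Weight each profile by the total belief mass on the states assigned to it.
    w = np.zeros(len(profiles))
    for s, p in enumerate(assignment):
        w[p] += q_s[s]
    C_eff = sum(w_p * prof.C for w_p, prof in zip(w, profiles))
    E_eff = sum(w_p * prof.E for w_p, prof in zip(w, profiles))
    gamma_eff = float(np.dot(w, [prof.gamma for prof in profiles]))
    return C_eff, E_eff, gamma_eff

def policy_posterior(G, E_eff, gamma_eff):
    """Softmax policy selection: q(pi) ∝ exp(log E_eff - gamma_eff * G)."""
    logits = np.log(E_eff + 1e-16) - gamma_eff * G
    logits -= logits.max()           # numerical stability
    q_pi = np.exp(logits)
    return q_pi / q_pi.sum()

# Hypothetical example: two profiles reused across four hidden states,
# e.g. pre- and post-reversal contexts in a reversal-learning task.
profiles = [ValueProfile(C=np.array([2.0, 0.0]), E=np.array([0.7, 0.3]), gamma=4.0),
            ValueProfile(C=np.array([0.0, 2.0]), E=np.array([0.3, 0.7]), gamma=1.0)]
assignment = np.array([0, 0, 1, 1])      # states 0-1 use profile 0; states 2-3 use profile 1
q_s = np.array([0.1, 0.1, 0.4, 0.4])     # beliefs currently favor the second context
C_eff, E_eff, gamma_eff = mix_profiles(q_s, assignment, profiles)
q_pi = policy_posterior(G=np.array([1.0, 0.5]), E_eff=E_eff, gamma_eff=gamma_eff)
```

As beliefs q(s) shift toward post-reversal states, the effective policy prior and precision drift smoothly toward the second profile, which is one way to realize the gradual, belief-dependent profile recruitment the abstract reports, without fitting independent parameters per context.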