220
Adaptive Risk Mitigation in Demand Learning
arXiv:2011.10690v2 Announce Type: replace
Abstract: We study dynamic pricing of a product with an unknown demand distribution over a finite horizon. Departing from the standard no-regret learning environment in which prices can be adjusted at any time, we restrict price changes to predetermined points in time to reflect common retail practice. This constraint, coupled with demand model ambiguity and an unknown customer arrival pattern, imposes a high risk of revenue loss, as a price based on a misestimated demand model may be applied to many customers before it can be revised. We develop an adaptive risk learning (ARL) framework that embeds a data-driven ambiguity set (DAS) to quantify demand model ambiguity by adapting to the unknown arrival pattern. Initially, when arrivals are few, the DAS includes a broad set of plausible demand models, reflecting high ambiguity and revenue risk. As new data is collected through pricing, the DAS progressively shrinks, capturing the reduction in model ambiguity and associated risk. We establish the probabilistic convergence of the DAS to the true demand model and derive a regret bound for the ARL policy that explicitly links revenue loss to the data required for the DAS to identify the true model with high probability. The dependence of our regret bound on the arrival pattern is unique to our constrained dynamic pricing problem and contrasts with no-regret learning environments, where regret is constant and arrival-pattern independent. Relaxing the constraint on infrequent price changes, we show that ARL attains the known constant regret bound. Numerical experiments further demonstrate that ARL outperforms benchmarks that prioritize either regret or risk alone by adaptively balancing both without knowledge of the arrival pattern. This adaptive risk adjustment is crucial for achieving high revenues and low downside risk when prices are sticky and both demand and arrival patterns are unknown.
Abstract: We study dynamic pricing of a product with an unknown demand distribution over a finite horizon. Departing from the standard no-regret learning environment in which prices can be adjusted at any time, we restrict price changes to predetermined points in time to reflect common retail practice. This constraint, coupled with demand model ambiguity and an unknown customer arrival pattern, imposes a high risk of revenue loss, as a price based on a misestimated demand model may be applied to many customers before it can be revised. We develop an adaptive risk learning (ARL) framework that embeds a data-driven ambiguity set (DAS) to quantify demand model ambiguity by adapting to the unknown arrival pattern. Initially, when arrivals are few, the DAS includes a broad set of plausible demand models, reflecting high ambiguity and revenue risk. As new data is collected through pricing, the DAS progressively shrinks, capturing the reduction in model ambiguity and associated risk. We establish the probabilistic convergence of the DAS to the true demand model and derive a regret bound for the ARL policy that explicitly links revenue loss to the data required for the DAS to identify the true model with high probability. The dependence of our regret bound on the arrival pattern is unique to our constrained dynamic pricing problem and contrasts with no-regret learning environments, where regret is constant and arrival-pattern independent. Relaxing the constraint on infrequent price changes, we show that ARL attains the known constant regret bound. Numerical experiments further demonstrate that ARL outperforms benchmarks that prioritize either regret or risk alone by adaptively balancing both without knowledge of the arrival pattern. This adaptive risk adjustment is crucial for achieving high revenues and low downside risk when prices are sticky and both demand and arrival patterns are unknown.
No comments yet.