Conference
Bandit Learning
2026
Ranking-and-Selection with Multiple Correct Answers and Non-Answerable Estimates
Qiaoqiao Wang and Wei You
Winter Simulation Conference (WSC 2026)
Abstract
We study fixed-precision ranking-and-selection in structured settings where the answer may be non-unique and where noisy estimates may temporarily admit no valid answer at all. This phenomenon arises naturally in problems such as multi-fidelity ranking-and-selection and identifying a Condorcet winner from pairwise comparisons. To address this, we propose a unified framework based on answer-wise acceptance sets, restricted generalized likelihood ratio stopping, and an answer–pitfall decomposition that yields a max-max-min characteristic value and a common sampling principle. We introduce ENDS, a general procedure that combines estimation, nomination, pitfall detection, and cost-aware information-directed sampling. We instantiate ENDS for various problems by deriving explicit formulas. Extensive numerical experiments show that this unified recipe performs well across a broad range of pure-exploration problems and offers a practical proof of concept for a common methodology.
Journal
Stochastic Optimization
2026
Stochastic Compositional Optimization with Compositional Constraints
Shuoguang Yang, Wei You, Zhe Zhang, and Ethan X. Fang
INFORMS Journal on Optimization, 8(1), 1–21
Abstract
Stochastic compositional optimization (SCO) has attracted considerable attention due to its broad applicability to important real-world problems. However, existing work on SCO typically assumes that the projection within a solution update is straightforward, which is not the case for problem instances where constraints are in the form of expectations, such as empirical conditional value-at-risk constraints. In this paper, we introduce a novel model that integrates single-level expected value and two-level compositional constraints into the existing SCO framework. Our model has wide applicability to data-driven optimization, fairness optimization, and risk management, including risk-averse optimization and high-moment portfolio selection, and is capable of handling multiple constraints. Additionally, we propose a class of primal-dual algorithms that generate sequences converging to the optimal solution at a rate of $\mathcal{O}(1/\sqrt{N})$ under both single-level and two-level compositional expected-value constraints, where $N$ is the iteration counter, thus establishing benchmarks in expected value constrained SCO. Numerical experiments show the efficiency of our algorithm over real-world applications.
Journal
Bandit Learning
2026
Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models
Xueping Gong, Wei You, and Jiheng Zhang
Operations Research, 74(2), 879–897
Abstract
We study contextual dynamic pricing, where a decision maker posts personalized prices based on observable contexts and receives binary purchase feedback indicating whether the customer's valuation exceeds the price. Each valuation is modeled as an unknown latent function of the context, corrupted by independent and identically distributed market noise from an unknown distribution. Relying only on Lipschitz continuity of the noise distribution and bounded valuations, we propose a minimax-optimal algorithm. To accommodate the unknown distribution, our method discretizes the relevant noise range to form a finite set of candidate prices, then applies layered data partitioning to obtain confidence bounds substantially tighter than those derived via the elliptical-potential lemma. A key advantage is that estimation bias in the valuation function cancels when comparing upper confidence bounds, eliminating the need to know the Lipschitz constant. The framework extends beyond linear models to general function classes through offline regression oracles. Our regret analysis depends solely on the oracle's estimation error, typically governed by the statistical complexity of the class. These techniques yield a regret upper bound matching the minimax lower bound up to logarithmic factors. Furthermore, we refine these guarantees under additional structures—e.g., linear valuation models, second-order smoothness, sparsity, and known noise distribution or observable valuations—and compare our bounds and assumptions with prior dynamic-pricing methods. Finally, numerical experiments corroborate the theory and show clear improvements over benchmark methods.
Journal
Bandit Learning
2026
Dual-Directed Algorithm Design for Efficient Pure Exploration
Chao Qin and Wei You
Operations Research, 74(2), 1104–1125
Abstract
While experimental design often focuses on selecting the single best alternative from a finite set (e.g., in ranking and selection or best-arm identification), many pure-exploration problems pursue richer goals. Given a specific goal, adaptive experimentation aims to achieve it by strategically allocating sampling effort, with the underlying sample complexity characterized by a maximin optimization problem. By introducing dual variables, we derive necessary and sufficient conditions for an optimal allocation, yielding a unified algorithm design principle that extends the top-two approach beyond best-arm identification. This principle gives rise to Information-Directed Selection, a hyperparameter-free rule that dynamically evaluates and chooses among candidates based on their current informational value. We prove that, when combined with Information-Directed Selection, top-two Thompson sampling attains asymptotic optimality for Gaussian best-arm identification, resolving a notable open question in the pure-exploration literature. Furthermore, our framework produces asymptotically optimal algorithms for pure-exploration thresholding bandits and $\varepsilon$-best-arm identification (i.e., ranking and selection with probability-of-good-selection guarantees), and more generally establishes a recipe for adapting Thompson sampling across a broad class of pure-exploration problems. Extensive numerical experiments highlight the efficiency of our proposed algorithms compared to existing methods.
Conference
Bandit Learning
2023
Information-Directed Selection for Top-Two Algorithms
Wei You, Chao Qin, Zihao Wang, and Shuoguang Yang
Conference on Learning Theory (COLT 2023), 2850–2851
Abstract
We consider the best-$k$-arm identification problem for multi-armed bandits, where the objective is to select the exact set of $k$ arms with the highest mean rewards by sequentially allocating measurement effort. We characterize the necessary and sufficient conditions for the optimal allocation using dual variables. Remarkably these optimality conditions lead to the extension of top-two algorithm design principle (Russo, 2020), initially proposed for best-arm identification. Furthermore, our optimality conditions induce a simple and effective selection rule dubbed information-directed selection (IDS) that selects one of the top-two candidates based on a measure of information gain. As a theoretical guarantee, we prove that integrated with IDS, top-two Thompson sampling is (asymptotically) optimal for Gaussian best-arm identification, solving a glaring open problem in the pure exploration literature (Russo, 2020). As a by-product, we show that for $k > 1$, top-two algorithms cannot achieve optimality even when the algorithm has access to the unknown optimal tuning parameter. Numerical experiments show the superior performance of the proposed top-two algorithms with IDS and considerable improvement compared with algorithms without adaptive selection.
Journal
Queueing Theory
2022
New decomposition approximations for queueing network
Ward Whitt and Wei You
Queueing Systems, 100(3), 365–367
Journal
Queueing Theory
2022
A Robust Queueing Network Analyzer Based on Indices of Dispersion
Ward Whitt and Wei You
Naval Research Logistics, 69(1), 36–56
Abstract
We develop a robust queueing network analyzer algorithm to approximate the steady-state performance of a single-class open queueing network of single-server queues with Markovian routing. The algorithm allows nonrenewal external arrival processes, general service-time distributions and customer feedback. The algorithm is based on a decomposition approximation, where each flow is partially characterized by its rate and a continuous function that measures the stochastic variability over time. This function is a scaled version of the variance-time curve, called the index of dispersion for counts (IDC). The required IDC functions for the external arrival processes can be calculated from the model primitives or estimated from data. Approximations for the IDC functions of the internal flows are calculated by solving a set of linear equations. The theoretical basis is provided by heavy-traffic limits for the flows established in our previous papers. A robust queueing technique is used to generate approximations of the mean steady-state performance at each queue from the IDC of the total arrival flow and the service specification at that queue. The algorithm's effectiveness is supported by extensive simulation studies.
Journal
Queueing Theory
2020
Heavy-Traffic Limits for Stationary Network Flows
Ward Whitt and Wei You
Queueing Systems, 95(1), 53–68
Abstract
This paper studies stationary customer flows in an open queueing network. The flows are the processes counting customers flowing from one queue to another or out of the network. We establish the existence of unique stationary flows in generalized Jackson networks and convergence to the stationary flows as time increases. We establish heavy-traffic limits for the stationary flows, allowing an arbitrary subset of the queues to be critically loaded. The heavy-traffic limit with a single bottleneck queue is especially tractable because it yields limit processes involving one-dimensional reflected Brownian motion. That limit plays an important role in our new nonparametric decomposition approximation of the steady-state performance using indices of dispersion and robust optimization.
Journal
Queueing Theory
2019
Time-Varying Robust Queueing
Ward Whitt and Wei You
Operations Research, 67(6), 1766–1782
Abstract
We develop a time-varying robust-queueing (TVRQ) algorithm for the continuous-time workload in a single-server queue with a time-varying arrival-rate function. We apply this TVRQ to develop approximations for the periodic steady-state expected workload in models with a periodic arrival-rate function. We apply simulation and asymptotic methods to examine the performance of periodic TVRQ (PRQ). We find that PRQ predicts the mean of the periodic distribution and even the full distribution (specified by the quantiles) remarkably well. We show that the PRQ converges to a proper limit in appropriate long-cycle and heavy-traffic regimes and coincides with long-cycle fluid limits and heavy-traffic diffusion limits for long cycles.
Journal
Queueing Theory
2019
The Advantage of Indices of Dispersion in Queueing Approximations
Ward Whitt and Wei You
Operations Research Letters, 47(2), 99–104
Abstract
A recent robust queueing approximation for open queueing networks exploits partial characterizations of each arrival process by its rate and index of dispersion for counts (IDC), which is a scaled version of the variance–time curve. Even though only means and variances (as functions of time) are involved, we show that the IDC provides a basis for more accurate approximations than traditional two-moment partial characterizations. For the GI/GI/1 queue, this approach applied to the arrival and service processes fully characterizes the model.
Thesis
Queueing Theory
2019
A Robust Queueing Network Analyzer Based on Indices of Dispersion
Wei You
Ph.D. Thesis, Columbia University.
Abstract
In post-industrial economies, modern service systems are dramatically changing the daily lives of many people. Such systems are often complicated by uncertainty: service providers usually cannot predict when a customer will arrive and how long the service will be. Fortunately, useful guidance can often be provided by exploiting stochastic models such as queueing networks. In iterating the design of service systems, decision makers usually favor analytical analysis of the models over simulation methods, due to the prohibitive computation time required to obtain optimal solutions for service operation problems involving multidimensional stochastic networks. However, queueing networks that can be solved analytically require strong assumptions that are rarely satisfied, whereas realistic models that exhibit complicated dependence structure are prohibitively hard to analyze exactly. In this thesis, we continue the effort to develop useful analytical performance approximations for the single-class open queueing network with Markovian routing, unlimited waiting space and the first-come first-served service discipline. We focus on open queueing networks where the external arrival processes are not Poisson and the service times are not exponential. We develop a new non-parametric robust queueing algorithm for the performance approximation in single-server queues. With robust optimization techniques, the underlying stochastic processes are replaced by samples from suitably defined uncertainty sets and the worst-case scenario is analyzed. We show that this worst-case characterization of the performance measure is asymptotically exact for approximating the mean steady-state workload in G/G/1 models in both the light-traffic and heavy-traffic limits, under mild regularity conditions. In our non-parametric Robust Queueing formulation, we focus on the customer flows, defined as the continuous-time processes counting customers in or out of the network, or flowing from one queue to another. Each flow is partially characterized by a continuous function that measures the change of stochastic variability over time. This function is called the index of dispersion for counts. The Robust Queueing algorithm converts the index of dispersion for counts into approximations of the performance measures. We show the advantage of using index of dispersion for counts in queueing approximation by a renewal process characterization theorem and the ordering of the mean steady-state workload in GI/M/1 models. To develop generalized algorithm for open queueing networks, we first establish the heavy-traffic limit theorem for the stationary departure flows from a GI/GI/1 model. We show that the index of dispersion for counts function of the stationary departure flow can be approximately characterized as the convex combination of the arrival index of dispersion for counts and service index of dispersion for counts with a time-dependent weight function, revealing the non-trivial impact of the traffic intensity on the departure processes. This heavy-traffic limit theorem is further generalized into a joint heavy-traffic limit for the stationary customer flows in generalized Jackson networks, where the external arrival are characterized by independent renewal processes and the service times are independent and identically distributed random variables, independent of the external arrival processes. We show how these limiting theorems can be exploited to establish a set of linear equations, whose solution serves as approximations of the index of dispersion for counts of the flows in an open queueing network. We prove that this set of equations is asymptotically exact in approximating the index of dispersion for counts of the stationary flows. With the index of dispersion for counts available, the network is decomposed into single-server queues and the Robust Queueing algorithm can be applied to obtain performance approximation. This algorithm is referred to as the Robust Queueing Network Analyzer. We perform extensive simulation study to validate the effectiveness of our algorithm. We show that our algorithm can be applied not only to models with non-exponential distirbutions but also to models with more complex arrival processes than renewal processes, including those with Markovian arrival processes.
Journal
Queueing Theory
2018
Heavy-Traffic Limit of the GI/GI/1 Stationary Departure Process and its Variance Function
Ward Whitt and Wei You
Stochastic Systems, 8(2), 143–165
Abstract
Heavy-traffic limits are established for the stationary departure process from a GI/GI/1 queue and its variance function. The limit process is a function of the Brownian motion limits of the arrival and service processes plus the stationary reflected Brownian motion (RBM) limit of the queue-length process. An explicit expression is given for the variance function, which depends only on the first two moments of the interarrival times and service times plus the previously determined correlation function of canonical (drift −1, diffusion coefficient 1) RBM. The limit for the variance function here is used to show that the approximation for the index of dispersion for counts of the departure process used in our new robust queueing network analyzer is asymptotically correct in the heavy-traffic limit.
Journal
Queueing Theory
2018
Using Robust Queueing to Expose the Impact of Dependence in Single-Server Queues
Ward Whitt and Wei You
Operations Research, 66(1), 184–199
Abstract
Queueing applications are often complicated by dependence among interarrival times and service times. Such dependence is common in networks of queues, where arrivals are departures from other queues or superpositions of such complicated processes, especially when there are multiple customer classes with class-dependent service-time distributions. We show that the robust queueing approach for single-server queues proposed in the literature can be extended to yield improved steady-state performance approximations in the standard stochastic setting that includes dependence among interarrival times and service times. We propose a new functional robust queueing formulation for the steady-state workload that is exact for the steady-state mean in the M/GI/1 model and is asymptotically correct in both heavy traffic and light traffic. Simulation experiments show that it is effective more generally.