Portrait of Wei You

Operations Research · Applied Probability

Wei You 尤为

Assistant Professor
Department of Industrial Engineering and Decision Analytics
The Hong Kong University of Science and Technology

Research Focus: stochastic models, queueing theory, applied probability, sequential decision-making, and their applications to the performance analysis and design of service systems and digital experimentation.

Publications

Research

Working Papers

Working Queueing Theory 2026

Service Rate Differentiation for Homogeneous Impatient Customers

Chenguang (Allen) Wu and Wei You

Major revision at Production and Operations Management

Abstract
We study joint service rate and waiting time differentiation for homogeneous impatient customers in many-server quality-based queueing systems, where longer services generate higher values but customers may abandon if they wait too long. Customers are homogeneous upon arrival, but the system manager can differentiate them in two dimensions: by assigning each customer to one of many service grades corresponding to different service rates, and by using customers' elapsed waiting times as a scheduling signal to further differentiate those who are waiting in queue. The manager jointly chooses the service grades, an allocation rule that assigns arriving customers to grades, and a scheduling policy that prioritizes customers across and within each grade. Because an exact stochastic analysis of this joint optimization problem is intractable, we adopt a fluid approach and translate the fluid solution into implementable policies for the underlying stochastic system. Our main structural result is that there exists an optimal fluid policy with at most two active service rate and offered wait pairs, regardless of the welfare function or patience time distribution. This reduces the complicated design problem to operationally simple policies. Motivated by our fluid analysis, we propose using a randomized grade assignment at customer arrival and a priority rule based on grade labels and elapsed waiting times, and establish a formal performance guarantee for this policy in Markovian systems. Our framework can easily extend to heterogeneous customers, where we show at most one customer class needs to be further differentiated, thereby suggesting a robust principle of implementing joint service rate and waiting time differentiation.
Working Bandit Learning 2026

Learning to Optimize Service Matching with Unknown Rewards under Operational Constraints

Zhihao Li, Shining Wu, Wei You, Jiheng Zhang, and Rachel Q. Zhang

Under review at Operations Research

Abstract
Online learning methodologies have been extensively applied to optimize sequential decisions under uncertainty to maximize long-run rewards. However, traditional online learning methods often assume decisions are isolated and have instantaneous outcomes. When applied to environments with underlying system dynamics—such as service systems where customers wait to be matched to limited service capacity—these methods overlook crucial operational constraints like server availability, queue length, customer waiting costs, and abandonment. To capture these operational realities, we study a multi-class, multi-server pool queueing system where each customer class is characterized by a known feature vector, waiting cost, and abandonment rate, while each server pool has an unknown expertise parameter and distinct service rate. The reward for matching a customer to a server depends on their compatibility, modeled as the inner product of the customer's feature vector and the server's expertise parameter. To tackle these operational intricacies, we first derive the $sv$-rule, a scheduling rule with strategic reservations that uses fluid approximation to evaluate the long-term effect of each matching decision and guide matching selection when only a subset of servers is available. The $sv$-rule achieves asymptotic optimality when server parameters are known and provides the foundation for our learning algorithms. Building on this foundation, we design two learning algorithms, res-UCB and res-IDS-UCB, which integrate parameter estimation with the $sv$-rule framework. Both achieve polylogarithmic regret $O(\ln^3 T)$; res-IDS-UCB further reduces the regret constant by prioritizing matches that yield the maximum information gain about the unknown server expertise parameters, rather than exploring uniformly at random, improving from $O(IJ \ln^3 T)$ to $O((I+J) \ln^3 T)$. Numerical experiments confirm our theoretical findings, demonstrating that traditional exploration methods suffer linear regret when neglecting operational dynamics, whereas our proposed algorithms successfully achieve sublinear regret.
Working LLM Inference 2026

Multi Heterogeneous Clusters PD-Disaggregated LLM Inference

Ruihan Lin, Wei You, and Jiheng Zhang

Preparing submission to Operations Research

Working Transfer Learning and Causal Inference 2026

Principal Fairness with Optimal Transport for Heterogeneous Population

Huichen Zhu, Wei You, and Annie Qu

Under review at Journal of the American Statistical Association

Abstract
Algorithmic fairness has attracted growing attention as data-driven decision systems are increasingly deployed in high-stakes domains such as healthcare. Most existing work focuses on associational fairness or counterfactual fairness grounded in structural causal models. Yet in clinical decision-making, fairness should be judged by actions: who receives which treatment and how those choices causally shape benefits and harms across groups. Associational criteria can hold while treatment allocation remains inequitable, and counterfactual notions often rely on strong, untestable modeling assumptions. We propose a two-stage framework that first predicts individual potential outcomes and then enforces a causal fairness constraint on treatment recommendations by aligning decision-relevant representations across sensitive groups within strata defined by predicted potential outcomes. Simulations show near-perfect principal fairness with predictive performance close to the first-stage estimators despite information loss from transport-based alignment. We further apply the method to the Medical Information Mart for Intensive Care IV (MIMIC-IV) database and obtain clinically plausible predicted strata proportions. Because true strata are unobserved, we assess fairness using surrogate criteria, including statistical parity and a decomposition-based estimate of the direct causal effect of the sensitive attribute on treatment assignment, and observe improved parity under these measures. Overall, our results support causal fairness as a practical objective for equitable clinical decision policies under complex time-series data.
Working Queueing Theory 2026

Robust Queueing for Single-Server Queues with Abandonment

Wei You

Under review at Queueing Systems

Abstract
Single-server queues with customer abandonment arise in call centers and many service systems, but steady-state performance measures remain analytically intractable beyond Markovian assumptions. This paper develops Robust Queueing (RQ) approximations for the mean steady-state virtual waiting time (offered waiting time) in the $GI/GI/1{+}GI$ model. The approach starts from a reverse-time supremum representation of the virtual waiting time as the reflection of an effective net-input process that accounts for abandonments. We approximate effective net-input increments by their mean plus a robustness parameter times their standard deviation. For the drift, we introduce a Poisson-surrogate compensator and show that the associated correction term is asymptotically negligible in the long-patience regime. For variability, we propose two implementable surrogates: (i) a deterministic time-change approximation that yields a first RQ algorithm, and (ii) a refined algorithm based on a heavy-traffic limit that produces a scale-dependent variance function capturing the variance-reduction effect of abandonment. The resulting steady-state approximation reduces to a one-dimensional fixed point solvable by bisection and takes as input the arrival index of dispersion for counts (IDC), the service-time squared coefficient of variation, and the patience-time distribution. We further show how to extend the method to queues in series by feeding an approximation of the upstream departure IDC into the downstream RQ algorithm. Extensive numerical experiments demonstrate that the refined RQ approximation is accurate across underload, critical loading, and overload, and remains robust relative to existing heavy-traffic and hazard-rate-scaling benchmarks.
Working Bandit Learning 2026

Cost-Aware Exploration for Structured Bandits

Xinyu Liu, Wei You, and Chao Qin

Preparing submission to Operations Research

Abstract
We study pure exploration with heterogeneous per-measurement costs. An agent sequentially selects among $K$ arms whose observation laws belong to (possibly structured) canonical exponential families, and each pull incurs an arm- and instance-dependent cost. The goal is to identify an instance-dependent answer (e.g., best arm, thresholding bandits, or Pareto set identification) while maximizing the rate at which the posterior probability of error decays per unit spent budget. We characterize the optimal cost-normalized posterior error exponent as the value of a maximin program that trades off statistical discrimination against the average cost, and show that no adaptive sampling rule can exceed this exponent. By working with the cost normalization, the exponent is characterized by a concave maximization problem. Motivated by this characterization, we develop a cost-aware pure exploration algorithm. The resulting dynamics is inherently nonsmooth and set-valued due to argmin operations, boundary effects, and normalization. We analyze the stochastic iterates through a continuous-time approximation based on differential inclusions and prove that, under mild regularity conditions, the algorithm attains the optimal cost-normalized posterior error exponent almost surely. Our results provide a general asymptotic optimality guarantee for cost-aware pure exploration beyond best-arm identification, covering broad exploration queries, exponential-family rewards, and structured bandit models, with substantially lower per-iteration computational overhead than optimize-then-track baselines such as Track-and-Stop.
Working Bandit Learning 2026

A Multi-Armed Bandit Approach to Online Resource Allocation for Service Platforms

Zihao Wang and Wei You

Major revision at Naval Research Logistics

Working Bandit Learning 2025

Pure Exploration via Frank–Wolfe Self-Play

Xinyu Liu, Chao Qin, and Wei You

Preparing submission to Journal of Machine Learning Research

Short version appeared at NeurIPS 2025 MLxOR Workshop.

Abstract
We study pure exploration in structured stochastic multi-armed bandits, aiming to efficiently identify the correct hypothesis from a finite set of alternatives. For a broad class of tasks, asymptotic analyses reduce to a maximin optimization that admits a two-player zero-sum game interpretation between an experimenter and a skeptic: the experimenter allocates measurements to rule out alternatives while the skeptic proposes alternatives. We reformulate the game by allowing the skeptic to adopt a mixed strategy, yielding a concave–convex saddle-point problem. This viewpoint leads to Frank–Wolfe Self-Play (FWSP): a projection-free, regularization-free, tuning-free method whose one-hot updates on both sides match the bandit sampling paradigm. However, structural constraints introduce sharp pathologies that complicate algorithm design and analysis: our linear-bandit case study exhibits nonunique optima, optimal designs with zero mass on the best arm, bilinear objectives, and nonsmoothness at the boundary. We address these challenges via a differential-inclusion argument, proving convergence of the game value for best-arm identification in linear bandits. Our analysis proceeds through a continuous-time limit: a differential inclusion with a Lyapunov function that decays exponentially, implying a vanishing duality gap and convergence to the optimal value. Although Lyapunov analysis requires differentiability of the objective, which is not guaranteed on the boundary, we show that along continuous trajectories the algorithm steers away from pathological nonsmooth points and achieves uniform global convergence to the optimal game value. We then embed the discrete-time updates into a perturbed flow and show that the discrete game value also converges. Building on FWSP, we further propose a learning algorithm based on posterior sampling. Numerical experiments demonstrate game-value convergence with a vanishing duality gap.
Working Transfer Learning and Causal Inference 2025

Efficient Transfer Learning via Causal Bounds

Xueping Gong, Wei You, and Jiheng Zhang

Major revision at Operations Research

Abstract
Transfer learning seeks to accelerate sequential decision-making by leveraging offline data from related agents. However, data from heterogeneous sources that differ in observed features, logging policies, or unobserved confounders often render causal effects non-identifiable and bias naive estimators. We address this by forming ambiguity sets of structural causal models defined via integral constraints on their joint densities. Optimizing any causal effect over these sets leads to generally non-convex programs whose solutions tightly bound the range of possible effects under heterogeneity or confounding. To solve these programs efficiently, we develop a sample-then-optimize framework to produce causal bound estimates that converge to the true limits and quantify the optimality gaps by outer relaxation. We further accommodate estimation error by relaxing the ambiguity set and exploit the Lipschitz continuity of causal effects to establish precise error propagation guarantees. These causal bounds are then embedded into bandit algorithms via arm elimination and truncated UCB indices, yielding optimal gap-dependent and minimax regret bounds. To handle estimation error, we also develop a safe algorithm for incorporating noisy causal bounds. In the contextual-bandit setting with function approximation, our method uses causal bounds to prune both the function class and the per-context action set, achieving matching upper and lower regret bounds with only logarithmic dependence on function-class complexity. Our analysis precisely characterizes when and how causal side-information accelerates online learning, and experiments on synthetic benchmarks confirm substantial regret reductions in confounded regimes.

Publications

Conference Bandit Learning 2026

Ranking-and-Selection with Multiple Correct Answers and Non-Answerable Estimates

Qiaoqiao Wang and Wei You

Winter Simulation Conference (WSC 2026)

Abstract
We study fixed-precision ranking-and-selection in structured settings where the answer may be non-unique and where noisy estimates may temporarily admit no valid answer at all. This phenomenon arises naturally in problems such as multi-fidelity ranking-and-selection and identifying a Condorcet winner from pairwise comparisons. To address this, we propose a unified framework based on answer-wise acceptance sets, restricted generalized likelihood ratio stopping, and an answer–pitfall decomposition that yields a max-max-min characteristic value and a common sampling principle. We introduce ENDS, a general procedure that combines estimation, nomination, pitfall detection, and cost-aware information-directed sampling. We instantiate ENDS for various problems by deriving explicit formulas. Extensive numerical experiments show that this unified recipe performs well across a broad range of pure-exploration problems and offers a practical proof of concept for a common methodology.
Journal Stochastic Optimization 2026

Stochastic Compositional Optimization with Compositional Constraints

Shuoguang Yang, Wei You, Zhe Zhang, and Ethan X. Fang

INFORMS Journal on Optimization, 8(1), 1–21

Abstract
Stochastic compositional optimization (SCO) has attracted considerable attention due to its broad applicability to important real-world problems. However, existing work on SCO typically assumes that the projection within a solution update is straightforward, which is not the case for problem instances where constraints are in the form of expectations, such as empirical conditional value-at-risk constraints. In this paper, we introduce a novel model that integrates single-level expected value and two-level compositional constraints into the existing SCO framework. Our model has wide applicability to data-driven optimization, fairness optimization, and risk management, including risk-averse optimization and high-moment portfolio selection, and is capable of handling multiple constraints. Additionally, we propose a class of primal-dual algorithms that generate sequences converging to the optimal solution at a rate of $\mathcal{O}(1/\sqrt{N})$ under both single-level and two-level compositional expected-value constraints, where $N$ is the iteration counter, thus establishing benchmarks in expected value constrained SCO. Numerical experiments show the efficiency of our algorithm over real-world applications.
Journal Bandit Learning 2026

Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models

Xueping Gong, Wei You, and Jiheng Zhang

Operations Research, 74(2), 879–897

Abstract
We study contextual dynamic pricing, where a decision maker posts personalized prices based on observable contexts and receives binary purchase feedback indicating whether the customer's valuation exceeds the price. Each valuation is modeled as an unknown latent function of the context, corrupted by independent and identically distributed market noise from an unknown distribution. Relying only on Lipschitz continuity of the noise distribution and bounded valuations, we propose a minimax-optimal algorithm. To accommodate the unknown distribution, our method discretizes the relevant noise range to form a finite set of candidate prices, then applies layered data partitioning to obtain confidence bounds substantially tighter than those derived via the elliptical-potential lemma. A key advantage is that estimation bias in the valuation function cancels when comparing upper confidence bounds, eliminating the need to know the Lipschitz constant. The framework extends beyond linear models to general function classes through offline regression oracles. Our regret analysis depends solely on the oracle's estimation error, typically governed by the statistical complexity of the class. These techniques yield a regret upper bound matching the minimax lower bound up to logarithmic factors. Furthermore, we refine these guarantees under additional structures—e.g., linear valuation models, second-order smoothness, sparsity, and known noise distribution or observable valuations—and compare our bounds and assumptions with prior dynamic-pricing methods. Finally, numerical experiments corroborate the theory and show clear improvements over benchmark methods.
Journal Bandit Learning 2026

Dual-Directed Algorithm Design for Efficient Pure Exploration

Chao Qin and Wei You

Operations Research, 74(2), 1104–1125

Abstract
While experimental design often focuses on selecting the single best alternative from a finite set (e.g., in ranking and selection or best-arm identification), many pure-exploration problems pursue richer goals. Given a specific goal, adaptive experimentation aims to achieve it by strategically allocating sampling effort, with the underlying sample complexity characterized by a maximin optimization problem. By introducing dual variables, we derive necessary and sufficient conditions for an optimal allocation, yielding a unified algorithm design principle that extends the top-two approach beyond best-arm identification. This principle gives rise to Information-Directed Selection, a hyperparameter-free rule that dynamically evaluates and chooses among candidates based on their current informational value. We prove that, when combined with Information-Directed Selection, top-two Thompson sampling attains asymptotic optimality for Gaussian best-arm identification, resolving a notable open question in the pure-exploration literature. Furthermore, our framework produces asymptotically optimal algorithms for pure-exploration thresholding bandits and $\varepsilon$-best-arm identification (i.e., ranking and selection with probability-of-good-selection guarantees), and more generally establishes a recipe for adapting Thompson sampling across a broad class of pure-exploration problems. Extensive numerical experiments highlight the efficiency of our proposed algorithms compared to existing methods.
Conference Bandit Learning 2023

Information-Directed Selection for Top-Two Algorithms

Wei You, Chao Qin, Zihao Wang, and Shuoguang Yang

Conference on Learning Theory (COLT 2023), 2850–2851

Abstract
We consider the best-$k$-arm identification problem for multi-armed bandits, where the objective is to select the exact set of $k$ arms with the highest mean rewards by sequentially allocating measurement effort. We characterize the necessary and sufficient conditions for the optimal allocation using dual variables. Remarkably these optimality conditions lead to the extension of top-two algorithm design principle (Russo, 2020), initially proposed for best-arm identification. Furthermore, our optimality conditions induce a simple and effective selection rule dubbed information-directed selection (IDS) that selects one of the top-two candidates based on a measure of information gain. As a theoretical guarantee, we prove that integrated with IDS, top-two Thompson sampling is (asymptotically) optimal for Gaussian best-arm identification, solving a glaring open problem in the pure exploration literature (Russo, 2020). As a by-product, we show that for $k > 1$, top-two algorithms cannot achieve optimality even when the algorithm has access to the unknown optimal tuning parameter. Numerical experiments show the superior performance of the proposed top-two algorithms with IDS and considerable improvement compared with algorithms without adaptive selection.
Journal Queueing Theory 2022

New decomposition approximations for queueing network

Ward Whitt and Wei You

Queueing Systems, 100(3), 365–367

Journal Queueing Theory 2022

A Robust Queueing Network Analyzer Based on Indices of Dispersion

Ward Whitt and Wei You

Naval Research Logistics, 69(1), 36–56

Abstract
We develop a robust queueing network analyzer algorithm to approximate the steady-state performance of a single-class open queueing network of single-server queues with Markovian routing. The algorithm allows nonrenewal external arrival processes, general service-time distributions and customer feedback. The algorithm is based on a decomposition approximation, where each flow is partially characterized by its rate and a continuous function that measures the stochastic variability over time. This function is a scaled version of the variance-time curve, called the index of dispersion for counts (IDC). The required IDC functions for the external arrival processes can be calculated from the model primitives or estimated from data. Approximations for the IDC functions of the internal flows are calculated by solving a set of linear equations. The theoretical basis is provided by heavy-traffic limits for the flows established in our previous papers. A robust queueing technique is used to generate approximations of the mean steady-state performance at each queue from the IDC of the total arrival flow and the service specification at that queue. The algorithm's effectiveness is supported by extensive simulation studies.
Journal Queueing Theory 2020

Heavy-Traffic Limits for Stationary Network Flows

Ward Whitt and Wei You

Queueing Systems, 95(1), 53–68

Abstract
This paper studies stationary customer flows in an open queueing network. The flows are the processes counting customers flowing from one queue to another or out of the network. We establish the existence of unique stationary flows in generalized Jackson networks and convergence to the stationary flows as time increases. We establish heavy-traffic limits for the stationary flows, allowing an arbitrary subset of the queues to be critically loaded. The heavy-traffic limit with a single bottleneck queue is especially tractable because it yields limit processes involving one-dimensional reflected Brownian motion. That limit plays an important role in our new nonparametric decomposition approximation of the steady-state performance using indices of dispersion and robust optimization.
Journal Queueing Theory 2019

Time-Varying Robust Queueing

Ward Whitt and Wei You

Operations Research, 67(6), 1766–1782

Abstract
We develop a time-varying robust-queueing (TVRQ) algorithm for the continuous-time workload in a single-server queue with a time-varying arrival-rate function. We apply this TVRQ to develop approximations for the periodic steady-state expected workload in models with a periodic arrival-rate function. We apply simulation and asymptotic methods to examine the performance of periodic TVRQ (PRQ). We find that PRQ predicts the mean of the periodic distribution and even the full distribution (specified by the quantiles) remarkably well. We show that the PRQ converges to a proper limit in appropriate long-cycle and heavy-traffic regimes and coincides with long-cycle fluid limits and heavy-traffic diffusion limits for long cycles.
Journal Queueing Theory 2019

The Advantage of Indices of Dispersion in Queueing Approximations

Ward Whitt and Wei You

Operations Research Letters, 47(2), 99–104

Abstract
A recent robust queueing approximation for open queueing networks exploits partial characterizations of each arrival process by its rate and index of dispersion for counts (IDC), which is a scaled version of the variance–time curve. Even though only means and variances (as functions of time) are involved, we show that the IDC provides a basis for more accurate approximations than traditional two-moment partial characterizations. For the GI/GI/1 queue, this approach applied to the arrival and service processes fully characterizes the model.
Thesis Queueing Theory 2019

A Robust Queueing Network Analyzer Based on Indices of Dispersion

Wei You

Ph.D. Thesis, Columbia University.

Abstract
In post-industrial economies, modern service systems are dramatically changing the daily lives of many people. Such systems are often complicated by uncertainty: service providers usually cannot predict when a customer will arrive and how long the service will be. Fortunately, useful guidance can often be provided by exploiting stochastic models such as queueing networks. In iterating the design of service systems, decision makers usually favor analytical analysis of the models over simulation methods, due to the prohibitive computation time required to obtain optimal solutions for service operation problems involving multidimensional stochastic networks. However, queueing networks that can be solved analytically require strong assumptions that are rarely satisfied, whereas realistic models that exhibit complicated dependence structure are prohibitively hard to analyze exactly. In this thesis, we continue the effort to develop useful analytical performance approximations for the single-class open queueing network with Markovian routing, unlimited waiting space and the first-come first-served service discipline. We focus on open queueing networks where the external arrival processes are not Poisson and the service times are not exponential. We develop a new non-parametric robust queueing algorithm for the performance approximation in single-server queues. With robust optimization techniques, the underlying stochastic processes are replaced by samples from suitably defined uncertainty sets and the worst-case scenario is analyzed. We show that this worst-case characterization of the performance measure is asymptotically exact for approximating the mean steady-state workload in G/G/1 models in both the light-traffic and heavy-traffic limits, under mild regularity conditions. In our non-parametric Robust Queueing formulation, we focus on the customer flows, defined as the continuous-time processes counting customers in or out of the network, or flowing from one queue to another. Each flow is partially characterized by a continuous function that measures the change of stochastic variability over time. This function is called the index of dispersion for counts. The Robust Queueing algorithm converts the index of dispersion for counts into approximations of the performance measures. We show the advantage of using index of dispersion for counts in queueing approximation by a renewal process characterization theorem and the ordering of the mean steady-state workload in GI/M/1 models. To develop generalized algorithm for open queueing networks, we first establish the heavy-traffic limit theorem for the stationary departure flows from a GI/GI/1 model. We show that the index of dispersion for counts function of the stationary departure flow can be approximately characterized as the convex combination of the arrival index of dispersion for counts and service index of dispersion for counts with a time-dependent weight function, revealing the non-trivial impact of the traffic intensity on the departure processes. This heavy-traffic limit theorem is further generalized into a joint heavy-traffic limit for the stationary customer flows in generalized Jackson networks, where the external arrival are characterized by independent renewal processes and the service times are independent and identically distributed random variables, independent of the external arrival processes. We show how these limiting theorems can be exploited to establish a set of linear equations, whose solution serves as approximations of the index of dispersion for counts of the flows in an open queueing network. We prove that this set of equations is asymptotically exact in approximating the index of dispersion for counts of the stationary flows. With the index of dispersion for counts available, the network is decomposed into single-server queues and the Robust Queueing algorithm can be applied to obtain performance approximation. This algorithm is referred to as the Robust Queueing Network Analyzer. We perform extensive simulation study to validate the effectiveness of our algorithm. We show that our algorithm can be applied not only to models with non-exponential distirbutions but also to models with more complex arrival processes than renewal processes, including those with Markovian arrival processes.
Journal Queueing Theory 2018

Heavy-Traffic Limit of the GI/GI/1 Stationary Departure Process and its Variance Function

Ward Whitt and Wei You

Stochastic Systems, 8(2), 143–165

Abstract
Heavy-traffic limits are established for the stationary departure process from a GI/GI/1 queue and its variance function. The limit process is a function of the Brownian motion limits of the arrival and service processes plus the stationary reflected Brownian motion (RBM) limit of the queue-length process. An explicit expression is given for the variance function, which depends only on the first two moments of the interarrival times and service times plus the previously determined correlation function of canonical (drift −1, diffusion coefficient 1) RBM. The limit for the variance function here is used to show that the approximation for the index of dispersion for counts of the departure process used in our new robust queueing network analyzer is asymptotically correct in the heavy-traffic limit.
Journal Queueing Theory 2018

Using Robust Queueing to Expose the Impact of Dependence in Single-Server Queues

Ward Whitt and Wei You

Operations Research, 66(1), 184–199

Abstract
Queueing applications are often complicated by dependence among interarrival times and service times. Such dependence is common in networks of queues, where arrivals are departures from other queues or superpositions of such complicated processes, especially when there are multiple customer classes with class-dependent service-time distributions. We show that the robust queueing approach for single-server queues proposed in the literature can be extended to yield improved steady-state performance approximations in the standard stochastic setting that includes dependence among interarrival times and service times. We propose a new functional robust queueing formulation for the steady-state workload that is exact for the steady-state mean in the M/GI/1 model and is asymptotically correct in both heavy traffic and light traffic. Simulation experiments show that it is effective more generally.

Presentations

Talks

Conference Talks

(* Presented by co-authors)

  • Pure Exploration via Frank–Wolfe Self-Play
    POMS-HK 2026, Shenzhen*
    NeurIPS 2025 MLxOR Workshop, San Diego, CA*
    APORS Youth Forum 2025, Hong Kong*
  • Transfer Learning with Partially Observable Offline Data via Causal Bounds
    POMS-HK 2026, Shenzhen*
    EcoSta 2025, Tokyo, Japan
  • Cost-Aware Exploration for Structured Bandits
    2025 INFORMS Applied Probability Conference, Atlanta, GA
    2025 INFORMS International Meeting, Singapore
  • Dual-Directed Algorithm Design for Efficient Pure Exploration
    2024 INFORMS Annual Meeting, Seattle, WA
    IASC-ARS Interim Conference 2024, Taipei City
    JCSDS 2023, Kunming, Yunnan
  • Information-Directed Selection for Pure Exploration
    2023 INFORMS Annual Meeting, Phoenix, AZ
  • Information-Directed Selection for Top-Two Algorithms
    Conference on Learning Theory (COLT) 2023, Bangalore, India
    2023 INFORMS APS Meeting, Nancy, France
    POMS-HK 2023, Hong Kong
  • Approximation for Single-Server Queues with Abandonment via Robust Queueing
    2024 INFORMS MSOM Conference, Minneapolis, MN
    2021 INFORMS Annual Meeting, Anaheim, CA
  • Online Resource Allocation for Service Platforms
    Virtual 2020 INFORMS Annual Meeting
  • Heavy-Traffic Limit for Stationary Network Flows
    2019 INFORMS APS Meeting, Brisbane, Australia
  • Time-Varying Robust Queueing for Quantile Approximation
    POMS-HK 2020, Hong Kong
    2019 INFORMS APS Meeting, Brisbane, Australia
  • A Robust Queueing Network Analyzer Based on Indices of Dispersion
    2018 INFORMS Annual Meeting, Phoenix, AZ
  • A Robust Queueing Network Analyzer for a Series of Single-Server Queues
    2017 INFORMS Annual Meeting, Houston, TX
  • Time-Varying Robust Queueing
    2017 INFORMS APS Meeting, Evanston, IL
  • Using Robust Queueing to Expose the Impact of Dependence in Single Server Queues
    2016 INFORMS Annual Meeting, Nashville, TN

Invited Talks

  • Miami Herbert Business School, University of Miami, 2025
  • Antai College of Economics and Management, Shanghai Jiao Tong University, 2025
  • Institute of Statistics and Big Data, Renmin University of China, 2024
  • School of Statistics and Data Science, Nankai University, 2024
  • School of Management, Fudan University, 2024
  • School of Digital Economics, Shanghai University of Finance and Economics, 2023
  • School of Management and Engineering, Nanjing University, 2017, 2023
  • Institute of Applied Mathematics, Chinese Academy of Sciences, 2019
  • Institute of Mathematical Science, ShanghaiTech University, 2019
  • Industrial Engineering and Decision Analytics, HKUST, 2019
  • Industrial and System Engineering, Virginia Tech, 2019
  • System and Industrial Engineering, University of Arizona, 2019
  • Business School, University of Miami, 2019
  • School of Management, University of Science and Technology of China, 2019
  • Institute for Data and Decision Analytics, Chinese University of Hong Kong, Shenzhen, 2019
  • System Engineering and Engineering Management, Chinese University of Hong Kong, 2019

Courses

Teaching

Coming Soon 2026–present

IEDA3250 · Stochastic Models

HKUST, Department of Industrial Engineering and Decision Analytics

Stochastic-process modeling methods for engineering systems.

Course page coming soon.

Published 2019–present

IEDA5270 · Engineering Statistics

HKUST, Department of Industrial Engineering and Decision Analytics

Graduate-level statistical methodology for data-driven engineering.

Open course page
Published 2020–2025

IEDA2540 · Statistics for Engineers

HKUST, Department of Industrial Engineering and Decision Analytics

Core statistics for engineers: uncertainty modeling, estimation, inference, and regression.

Open course page
Coming Soon 2022

EEMT5500 · Applied Probability, Statistics and Data Analytics

HKUST, Engineering Enterprise Management Program

Applied probability and statistical methods for engineering management contexts.

Course page coming soon.

Published 2011

NJU · Real Analysis Notes

Nanjing University, School of Mathematics

Real analysis notes written at Nanjing University, with an English translation and a companion blog post.

Open course page

Advising

Team

Current Students

Xinyu Liu

Ph.D. (2024–present)

Yueying Hu

Ph.D. (2024–present)

Qiaoqiao Wang

Ph.D. (2023–present)

Ke Lu

MPhil (2023–present)

Former Students

Zihao Wang

Ph.D. (2019–2023)

First placement: Equity Derivatives Quant, Bank of America.

Yingxian Wu

MPhil at HKUST(GZ) (2020–2022), co-advised with Sisi Jian

First placement: Investment Risk Management Assistant, E Fund Management Co., LTD.

Biography

About

Bio

Wei You is an assistant professor of Industrial Engineering and Decision Analytics at the Hong Kong University of Science and Technology. His research focuses on stochastic models, queueing theory, applied probability, sequential decision-making, and their applications to the performance analysis and design of service systems and digital experimentation.

He received his Ph.D. in Operations Research (2019) from Columbia University, advised by Professor Ward Whitt, and his B.S. in Mathematics (2014) from Nanjing University.

Research Interests

Stochastic Models Queueing Theory Applied Probability Service Systems Sequential Decision-Making Digital Experimentation

Writing

Blog

A Real Analysis Note

A real analysis note I wrote in Chinese in 2011, now refreshed with a complete English translation produced with ChatGPT.

math notes Read post