This project considers a constrained optimization problem applied to a stochastic system with dynamic system states that have a stationary state distribution. At each state, an operation is implemented and a corresponding system cost occurs depending on the chosen action; the objective is to minimize the expected cost given service/demand constraints. Solving this problem is challenging and the main difficulty comes from the fact that the state distribution of the system is often unknown a priori and may change over time in practice. Known algorithms that handle this challenge, in particular, Backpressure algorithms, suffer from a slow convergence speed and poor short-term performance. This project instead investigates the value of online learning in optimal stochastic system control. Preliminary results have produced an online-learning technique called dual learning, along with two corresponding online-learning-aided control strategies. Lagrange multipliers form key quantities in solving a constrained optimization problem; the queue vector turns out to play the role of Lagrange multiplier, and thus one can utilize the information of the system dynamics for accelerating the learning of the control algorithm. Based on this insight, we systematically study the value of online learning in optimal stochastic system control in the following thrusts: 1) Fundamental Limits: We investigate the performance limits of the proposed online-learning-based-strategies, in particular, how fast is it possible for any control scheme to converge to the optimal and what is the corresponding utility-delay tradeoff? 2) Control with partially observable states: We study the practical scenario where one has uncertainty in the cost function of an action. We plan to not only develop efficient algorithms, but also conduct utility-delay tradeoff analysis and regret analysis for the proposed schemes. 3) Performance Evaluation: We evaluate the performance of the proposed algorithms under various settings and through trace-driven evaluations to compare their pros and cons.
Because the generality and importance of such stochastic system optimization problems, the proposed approaches have the potential to be applied in different areas, such as display-advertisement allocation, wireless network control, and Smart Grids. We leverage on-going collaborations with industry to disseminate the research results in real applications. We continue the effort in recruiting and training undergraduate researchers and under-represented groups through this project.