18th INFORMS Computing Society (ICS) Conference
Toronto, Canada, 14 — 16 March 2025
18th INFORMS Computing Society (ICS) Conference
Toronto, Canada, 14 — 16 March 2025

Approximate Dynamic Programming and Applications
Mar 14, 2025 01:00 PM – 02:30 PM
Location: Burwash
Chaired by Berk Gorgulu
4 Presentations
-
01:00 PM - 01:22 PM
Integrated Dynamic Pricing and Dispatching for Crowd-Shipping via Neural Approximate Dynamic Programming
Crowd-shipping, which utilizes non-professional couriers for deliveries, presents a scalable and sustainable alternative to traditional logistics models. This work introduces a Neural Approximate Dynamic Programming (NeurADP) framework, which combines ADP and deep reinforcement learning (DRL), to optimize crowd-shipping operations. More specifically, we integrate dynamic pricing and dispatching to enhance efficiency and profitability. Extending prior work, our model incorporates features such as batching and courier queues, addressing the complexities of large-scale operations. Our experiments comparing NeurADP policy with myopic and DRL baselines demonstrate significant improvements in operational efficiency, cost reduction, and sustainability. Sensitivity analyses highlight the impact of pricing strategies, courier availability, and delivery constraints, offering actionable insights for real-world applications.
-
01:22 PM - 01:44 PM
Neural Approximate Dynamic Programming for the Ultra-fast Order Dispatching Problem
Same-Day Delivery (SDD) services aim to optimize the speed of online order fulfillment while taking into account operational challenges such as fluctuations in order volumes and uncertainties in courier schedules. Our work enhances the effectiveness of SDD by concentrating on the ultra-fast Order Dispatching Problem (ODP), which entails the rapid assignment and dispatching of orders to couriers within a central warehouse environment, aiming to complete deliveries within tight deadlines (e.g., within minutes). We introduce improvements such as order grouping and specific courier allocations, offering a more accurate depiction of dispatch operations and boosting delivery performance. Our approach primarily employs NeurADP, a novel technique that merges Approximate Dynamic Programming with Deep Reinforcement Learning (DRL), marking its first application beyond ride-pool matching. NeurADP effectively handles the complex matching and routing challenges inherent in ultra-fast ODP through a neural network-based Value Function Approximation that adeptly manages high-dimensional problem complexities without the need for traditional manual feature engineering. We evaluate our method on four specialized datasets designed for ODP, assessing NeurADP’s performance against standard myopic and DRL benchmarks and utilizing derived bounds to examine policy effectiveness. The numerical results show that incorporating order batching and courier queues significantly boosts delivery efficiency, with NeurADP outperforming alternative approaches.
-
01:44 PM - 02:06 PM
Fleet Size Planning in Crowdsourced Delivery: Balancing Service Level and Driver Utilization
We address the fleet size planning problem for crowdsourced delivery platforms, focusing on optimizing the number of crowdsourced drivers to balance the platform's service level and driver utilization. This is motivated by recent regulatory measures that limit the number of vehicle licenses a platform can hold, which caps fleet size as a means to enhance driver working conditions and mitigate congestion and emissions caused by an excessive number of idle vehicles on the road. We propose a two-stage optimization model where the first stage involves tactical decisions for determining fleet sizes, while the second stage captures the operational dynamics of the platform by a Markov decision process (MDP) in which uncertain crowd driver arrival in the MDP depends on fleet size decisions in the first stage, introducing decision-dependent uncertainty. To efficiently solve the model, we employ a value function approximation (VFA) algorithm that iteratively determines fleet sizes using Boltzmann exploration then approximates the second-stage MDP using a rolling-horizon parametric cost function approximation. Extensive computational experiments confirm the effectiveness of the VFA algorithm, demonstrating its ability to meet both service level and driver utilization targets under various conditions. The results show that a platform can achieve high driver utilization and meet its service level target while only marginally reducing its profit, especially when driver behavior is more predictable and fleet sizes are temporally adjusted.
-
02:06 PM - 02:28 PM
Approximate Dynamic Programming for Multiclass Scheduling Under Slowdown
In many service systems, service times of customers can be correlated with waiting times. Scheduling under such dependency is challenging as a Markovian state description requires keeping track of all customers' waiting history. We propose an approximate dynamic programming algorithm for multi-class scheduling with wait-dependent service times. Our algorithm can generate policies with simple structures and achieve strong performance which we illustrate in a healthcare setting using real data.