Solving the Traveling Salesman Problem with Unknown Edge Costs via Expert Demonstrations

Published:

Shubhankar Agarwal, Jianhan Song, Ashutosh Shukla, Josiah Coad, and Guni Sharon

We address the traveling salesman problem (TSP) with unknown edge costs. We assume that a set of demonstrative routes is provided. The set of demonstrative routes is assumed to be optimal with respect to the unknown edge costs. The goal is to optimally solve new TSP problems with respect to the unknown edge costs. We refer to this problem as the mimic traveling salesman problem (MTSP). We propose a novel algorithm for solving MTSP. Our proposed algorithm approximates edge costs through the provided route demonstrations. The approximated edge costs are then provided to an off-the-shelf TSP solver, which computes the least-cost route. We train the edge cost function such that the demonstrated routes result in the least-cost solution. To this end, we present a novel loss function inspired by inverse reinforcement learning. Importantly, the complexity of our proposed method is invariant to route length, scalable to large routing problems, and highly parallelizable in both its training and prediction phases. We show that our proposed algorithm’s learning parameters scale linearly with the number of features per edge. Additionally, we show that our algorithm converges within a bounded number of iterations in the case of a linear function approximator. Finally, we demonstrate state-of-the-art performance among other learning-based methods on the real-world Amazon last-mile delivery dataset.