Centralized Multi-Agent Reinforcement Learning (MARL) presents itself as an ideal framework for aggregation companies (e.g., Uber, Lyft, Deliveroo) that must make a sequence of centralized decisions assigning individual agents (typically resources such as taxis or food delivery personnel) to customer requests online, in the presence of demand uncertainty. However, centralized learning is especially challenging in such very large-scale environments, with thousands of agents/resources and hundreds of thousands of requests arriving each day. In this paper, we provide a novel value decomposition mechanism that is able to tackle this scale and provide high-quality (matching) decisions at each time step. We show that our value decomposition approach, Conditional Expectation based Value Decomposition (CEVD), is more sustainable (requires 9.9% fewer vehicles to serve an equal number of requests) and more efficient (serves 9.76% more requests while traveling 13.32% less distance) than the current best approach on two city-scale benchmarks (New York and Chicago) for ride pooling with taxis.
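The per-time-step matching decision described above can be illustrated with a minimal sketch: once a value decomposition assigns each agent its own value for serving each request, the centralized step reduces to picking the assignment that maximizes the summed per-agent values. The function name, the tiny value table, and the brute-force solver below are all illustrative assumptions, not the paper's actual CEVD mechanism (which additionally conditions the decomposed values on global state).

```python
from itertools import permutations

def best_matching(values):
    """Pick the agent-to-request assignment maximizing total decomposed value.

    values[i][j] is a (hypothetical) per-agent decomposed value of
    assigning agent i to request j. Brute force is used only for
    clarity; a real system at city scale would use a polynomial-time
    matching solver (e.g., the Hungarian algorithm).
    """
    n = len(values)
    best_perm, best_score = None, float("-inf")
    for perm in permutations(range(n)):  # perm[i] = request given to agent i
        score = sum(values[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return list(enumerate(best_perm)), best_score

# Two agents, two requests: agent 0 prefers request 0, agent 1 prefers request 1.
pairs, total = best_matching([[3.0, 1.0], [2.0, 4.0]])
print(pairs, total)  # [(0, 0), (1, 1)] 7.0
```

Because the joint value is decomposed into per-agent terms, the combinatorial assignment problem stays tractable: each candidate matching is scored by a simple sum rather than by evaluating a monolithic joint value function.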