arxiv:2403.00222

Efficient Reinforcement Learning for Global Decision Making in the Presence of Local Agents at Scale

Published on Mar 1, 2024

Authors:

Abstract

We study reinforcement learning for global decision-making in the presence of many local agents, where the global decision-maker makes decisions affecting all local agents, and the objective is to learn a policy that maximizes the rewards of both the global and the local agents. Such problems find many applications, e.g. demand response, EV charging, queueing, etc. In this setting, scalability has been a long-standing challenge due to the size of the state/action space which can be exponential in the number of agents. This work proposes the SUB-SAMPLE-Q algorithm where the global agent subsamples kleq n local agents to compute an optimal <PRE_TAG>policy</POST_TAG> in time that is only exponential in k, providing an exponential speedup from standard methods that are exponential in n. We show that the learned policy converges to the optimal policy in the order of O(1/k+epsilon_{k,m}) as the number of sub-sampled agents k increases, where epsilon_{k,m} is the Bellman noise. We also conduct numerical simulations in a demand-response setting and a queueing setting.

View arXiv page View PDF Add to collection

Community

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2403.00222 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2403.00222 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2403.00222 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.