Multi-ROM support for the ALE
This post describes some new features I’m adding to the Arcade Learning Environment (ALE):
- Multiple ROMs: Run several different ROMs in one vector environment and delegate execution to C++.
- Torch Interface: Coming soon.
Background
ALE-Py has gained a lot of new features, but not enough people know about the native C++ vector implementation (PR 599).
ALE-Py has had the AtariVectorEnv interface since April 2025. It adds support for standard preprocessing, asynchronous send/recv between the agent and the environment, and multiple instances of the same ROM. See the ALE-Py vector environment docs for more information.
Here’s how you create the Python interface:
from ale_py.vector_env import AtariVectorEnv
# Create a vector environment with 4 parallel instances of Breakout
envs = AtariVectorEnv(
game="breakout", # The ROM id, not the Gymnasium environment id
num_envs=4,
)
Support for multiple ROMs (PR)
My PR extends AtariVectorEnv to accept a list of ROMs through a new games parameter in the constructor. num_envs can still be used, in which case each entry in the list is duplicated num_envs times.
Each ROM runs independently: it has its own episode state, autoresets on its own, and terminates or truncates without affecting the others.
Here is how you spawn multiple ROMs:
# with num_envs
envs = AtariVectorEnv(games=["pong", "breakout", "space_invaders"], num_envs=2)
# ["pong", "pong", "breakout", "breakout", "space_invaders", "space_invaders"]
# manually
envs = AtariVectorEnv(games=["pong", "pong", "breakout", "space_invaders", "space_invaders", "space_invaders"])
# ["pong", "pong", "breakout", "space_invaders", "space_invaders", "space_invaders"]
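The duplication rule shown in the comments above can be sketched in plain Python. This is only an illustration of the resulting layout, not the code from the PR; expand_games is a hypothetical helper name that does not exist in ALE-Py.

```python
def expand_games(games, num_envs=1):
    """Repeat each ROM id num_envs times, keeping copies of a ROM adjacent.

    Hypothetical helper for illustration only; it mirrors the layout shown
    in the comments above and is not part of ALE-Py.
    """
    if num_envs < 1:
        raise ValueError("num_envs must be at least 1")
    return [game for game in games for _ in range(num_envs)]


layout = expand_games(["pong", "breakout", "space_invaders"], num_envs=2)
print(layout)
```

The index of each entry in this list is the index of the corresponding sub-environment, so anything reported per environment lines up with it.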
Because full_action_space is False by default, each ROM keeps its minimal action set, so the number of valid actions can differ between ROMs.
When ROMs have different action counts, single_action_space is None (there is no shared single space) and action_space is a MultiDiscrete with one count per ROM.
Here’s an example with four ROMs where the first supports 4 actions and the last three support 6:
import gymnasium as gym
assert envs.single_action_space is None
assert isinstance(envs.action_space, gym.spaces.MultiDiscrete)
print(envs.num_actions) # [4, 6, 6, 6] - preferable to use num_actions
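Since every ROM has its own action count, a random policy has to draw each action from that ROM's own range. Below is a minimal sketch using only the standard library, assuming num_actions is a plain list of per-ROM counts like the one printed above; sample_actions is a hypothetical helper, not an ALE-Py function. In practice, sampling from the MultiDiscrete action_space should give the same kind of result.

```python
import random


def sample_actions(num_actions, rng):
    """Draw one valid action index per sub-environment.

    num_actions[i] is the size of the action set of sub-environment i,
    so the action for it must lie in [0, num_actions[i]).
    """
    return [rng.randrange(n) for n in num_actions]


num_actions = [4, 6, 6, 6]  # per-ROM counts, as in the example above
rng = random.Random(0)      # seeded so repeated runs give the same draws
actions = sample_actions(num_actions, rng)

# Every action is valid for its own ROM, even though the ranges differ.
assert all(0 <= a < n for a, n in zip(actions, num_actions))
```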
The chart below shows throughput versus latency using the multi-ROM feature on a 16-core AMD Ryzen 9. The test is run with a unique ROM per worker because the goal is to speed up experiments that require a full ALE suite of ROMs. Slower games like Seaquest and Pitfall stall each step and pull throughput down relative to running only faster ROMs like Pong. Packing all runs into a single job reduces the sequential speed of any single ROM but can massively increase the overall rate of completing an experiment. The line is the mean and the shaded band is the 95% confidence interval over 10 independent runs.
Latency vs throughput as we increase the number of ROMs (CPU Only, All ROMs)
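For readers who want to build a similar band from their own runs, here is one common way to get a mean and 95% confidence interval from 10 samples. It assumes a Student t interval, which may not be the exact method used for the chart, and the numbers in runs are placeholders, not measurements from this benchmark.

```python
import math
import statistics

# Two-sided 95% Student t critical value for 9 degrees of freedom (10 runs).
T_CRIT_95_DF9 = 2.262


def mean_ci95(samples, t_crit=T_CRIT_95_DF9):
    """Return (mean, lower, upper) for a t-based 95% confidence interval.

    The default t_crit is only correct for exactly 10 samples.
    """
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))  # standard error
    half_width = t_crit * sem
    return mean, mean - half_width, mean + half_width


# Placeholder throughput values for 10 runs; NOT real benchmark results.
runs = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 100.0, 100.0]
mean, lower, upper = mean_ci95(runs)
print(f"mean={mean:.1f}, 95% CI=({lower:.2f}, {upper:.2f})")
```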
If you’ve been using EnvPool for its vectorised environments, I’d recommend giving ALE-Py’s vector interface a shot, as it is actively maintained.