Wouldn't make much sense. We generally train with 288 environments simultaneously. I've been thinking about ways to nicely stream all 288 environments though.
There are a ton of applications for AI. Back when I was at Spotify, I co-authored Basic Pitch (https://basicpitch.spotify.com/), an audio-to-MIDI library. Plenty of uses exist outside of what's heavily publicized.
Ok sorry, I thought there might be a chance that the decomp project could be edited to create a ROM that made RL easier, but it seems like it just came in handy for looking up values, along with the GB ASM tutorial. The alternative I had in mind is re-creating Pokemon Red in a modern language, which you also mentioned.
Wrote about this in the results section. I think there is a way to mix the two and simplify the rewards in the process. A lot of the magic behind getting the agent to teach and use cut probably could have been handled by an LLM.
The environments wouldn't concentrate enough in the Rocket Hideout beneath Celadon Game Corner. The agent would have the player wander the world, reward hacking. With wild battles enabled, the environments would end up in Lavender Tower fighting Gastly.
> (and how on earth did you port Pokémon red to a RL environment? O.o)
Thanks haha, I kept reading =D I see, so it's not just that you have to visit the key areas; they also need to show up in the episodes often enough to provide a signal for training.
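To make "show up enough" concrete, here is a minimal sketch of how one might measure where a batch of parallel environments is sitting at a given moment. Everything in it is an assumption for illustration: the `map_occupancy` helper, the map labels, and the snapshot numbers are made up and are not taken from the project.

```python
from collections import Counter


def map_occupancy(env_map_ids):
    """Return the fraction of environments currently on each map."""
    counts = Counter(env_map_ids)
    total = len(env_map_ids)
    # An empty snapshot yields an empty dict, so there is no division by zero.
    return {map_id: n / total for map_id, n in counts.items()}


# Hypothetical snapshot of 288 environments with made-up map labels.
snapshot = (
    ["lavender_tower"] * 250
    + ["celadon_city"] * 30
    + ["rocket_hideout"] * 8
)

occupancy = map_occupancy(snapshot)
print(f"{occupancy['rocket_hideout']:.1%} of environments are in the hideout")
```

If only a small slice of the environments ever reaches the area you care about, only a small slice of each training batch carries any signal about it, which would match the problem described above.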