Original Blog Entry: A rough map of the new phase 2

hypersoniq — Mon, 08 Sep 2025 10:39:09 GMT

The Markov Decision Process (MDP), this will be the framework for Phase 2, it will define the features, the goals and the constraints, the rewards and penalties.

The actual navigation of the framework will require an AI agent driven by the reinforcement learning (RL) framework described by the MDP.

So this is actually a 2 part mission. It may include giving the agent the ability to run the phase 1 script with changing variables. This is one of those things where the agent can output what settings were used so it is not a mystery.

Confusing to be sure, but the analogy is that the RL agent is the maze runner while the MDP describes the maze.... [ More ]