<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/">
	<channel>
		<title>A rough map of the new phase 2</title>
		<link>/blogentry/194478</link>
		<atom:link href="https://www.lotterypost.com/rss/blogcomments/194478" rel="self" type="application/rss+xml" />
		<description>hypersoniq's Blog: A rough map of the new phase 2</description>
		<dc:language>en-us</dc:language>
		<generator>Lottery Post RSS Generator</generator>
		<item>
			<title>Original Blog Entry: A rough map of the new phase 2</title>
			<link>/blogentry/194478</link>
			<guid isPermaLink="true">/blogentry/194478</guid>
			<pubDate>Mon, 08 Sep 2025 10:39:09 GMT</pubDate>
			<dc:creator>hypersoniq</dc:creator>
			<description><![CDATA[<p>The Markov Decision Process (MDP), this will be the framework for Phase 2, it will define the features, the goals and the constraints, the rewards and penalties.<br /><br />The actual navigation of the framework will require an AI agent driven by the reinforcement learning (RL) framework described by the MDP.<br /><br />So this is actually a 2 part mission. It may include giving the agent the ability to run the phase 1 script with changing variables. This is one of those things where the agent can output what settings were used so it is not a mystery.<br /><br />Confusing to be sure, but the analogy is that the RL agent is the maze runner while the MDP describes the maze.... &#x5b;&#xa0;<a href="/blogentry/194478">More</a>&#xa0;&#x5d;</p>]]></description>
			<category>Blog Entry</category>
			<category>hypersoniq</category>
			<wfw:comment>https://www.lotterypost.com/blogentry/194478</wfw:comment>
		</item>
	</channel>
</rss>

