Interaction between environment and agent
- How is this implemented in Berkeley Pacman? There is a base game state (`GameState`) that encodes everything relevant about the environment: a food grid, the positions of all agents, the maze layout, and the current game score. In the search problems, there is a `startState` that includes only Pacman's position.
- After some more digging I questioned whether agents use problems internally to filter from global to local state. The main game loop passes the whole state to the agent's `getAction()` function, so it's a little odd. It looks like this may have only applied to SearchAgents, and even then they use the problem in a different way (`getAction` just indexes into the pre-planned path of actions).
- In other cases, `getAction` filters from the `GameState` object directly. For example, in the GhostAgents seen in project 2, the game state object is passed to and used by the ghost's `getAction`, which makes direct use of built-in methods on `GameState`, like `getLegalPacmanActions` among others.
- In most problems, there needs to be some process that shaves the full world state down to the details relevant to the problem at hand. In the Berkeley Pacman implementation, this is handled by "Problems": classes that define a refined state space for a specific problem. In their constructors they do whatever processing is necessary to bring the main game state into a local state containing all the information relevant to the problem (see the sketch after this list).
- But if I were to implement "problems" as the means of global-to-local conversion, does that mean I need to define a problem for each agent class? Different agents likely need different information from the game state, so this gets inconvenient quickly.
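To make the global-to-local filtering concrete, here is a minimal sketch in the style of the Berkeley Pacman position search problem: the constructor pulls only the details it needs (Pacman's position and the walls) out of the full game state. The `GameState` accessor names follow the Pacman codebase; the rest is illustrative rather than a copy of the actual project code.

```python
class PositionSearchProblem:
    """Local search state: just Pacman's (x, y) position, not the full GameState."""

    def __init__(self, gameState, goal=(1, 1)):
        # Filter the global state down to what this problem cares about.
        self.walls = gameState.getWalls()                 # static maze layout
        self.startState = gameState.getPacmanPosition()   # Pacman's (x, y) only
        self.goal = goal

    def getStartState(self):
        return self.startState

    def isGoalState(self, state):
        return state == self.goal

    def getSuccessors(self, state):
        # Expand to adjacent non-wall cells; actions are compass directions.
        successors = []
        x, y = state
        for action, (dx, dy) in [('North', (0, 1)), ('South', (0, -1)),
                                 ('East', (1, 0)), ('West', (-1, 0))]:
            nx, ny = x + dx, y + dy
            if not self.walls[nx][ny]:
                successors.append(((nx, ny), action, 1))  # (state, action, cost)
        return successors
```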
Interpretations and Observations
There is an interesting dilemma between environment and agents. When an agent makes an "observation" of an environment, what really happens? Does the environment broadcast information to the agent, in a form it can understand? Does the agent "query" the environment for information in a specific form? Or does the information simply exist, and agents must have the proper sensory channels to pick up on it as it occurs naturally? One could argue that the last is most like reality: as humans, we have evolved sensory mechanisms to pick up on the information provided by our otherwise indifferent surroundings.

Nevertheless, this global-to-local "transfer" of information is interesting to think about, at least from a design perspective. In reality it doesn't seem much more complicated than this: as beings physically living inside the environment itself, the information around us reaches us naturally, and our brains are responsible for interpretation from that point onward. But how should this be facilitated when the agents aren't actually inside the environment? The information projected by the environment can no longer flow to them naturally; instead, a mechanism of some kind must be designed to carry information from the environment into a form the agent can understand.

While in practice this is not that complicated, designing a semantically correct framework that appropriately separates duties for this interaction is harder than it first seems. Is it the environment's responsibility to convert the global state down to a form appropriate for the agent? Or is the agent responsible for the transformation, getting access to the global state and then having to filter/compute the view it's allowed to see? This question has driven much of the recent thinking behind the problib design.
Implementation Solution
For now, I've more or less resolved this by introducing an intermediary View object that transforms global state to local state, or "makes" agent observations according to some internal logic. The transformation is facilitated by the surrounding Gym, and the result is delivered to the agent for an action query. Views can be registered for specific Agent classes and are extendable like any other component of the library.
This approach lets the Env remain concerned only with updating and maintaining the global state in whatever way necessary, and lets Agents remain responsible only for acting on the information immediately available to them. It prevents Agents from accessing the entire world state (not semantically appropriate), while also placing the transformation itself outside the agent's control, which is where it semantically belongs. In theory, Views could also define rendering schemes for generating visual depictions of an underlying environment. As of right now, this is one of the better solutions I've encountered: it stays general and flexible while remaining semantically appropriate to my liking. I'm happy with it.
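Below is a minimal sketch of how this View mechanism could be wired up. The class names (`View`, `Gym`, `Env`) and the registration API are illustrative assumptions for this note, not necessarily problib's actual interface.

```python
class View:
    """Transforms global environment state into an agent-local observation."""
    def observe(self, global_state):
        raise NotImplementedError

class PositionOnlyView(View):
    """Example view: the agent sees only its own position and the walls."""
    def __init__(self, agent_id):
        self.agent_id = agent_id

    def observe(self, global_state):
        return {
            'position': global_state['positions'][self.agent_id],
            'walls': global_state['walls'],
        }

class Env:
    """Owns and updates the global state; knows nothing about agent views."""
    def __init__(self, state):
        self.state = state

    def update(self, actions):
        pass  # apply joint actions to the global state (domain-specific)

class Gym:
    """Mediates between the Env (global state) and Agents (local observations)."""
    def __init__(self, env):
        self.env = env
        self.views = {}  # agent class -> View instance

    def register_view(self, agent_cls, view):
        self.views[agent_cls] = view

    def step(self, agents):
        actions = {}
        for agent in agents:
            view = self.views[type(agent)]
            obs = view.observe(self.env.state)   # global -> local transformation
            actions[agent.id] = agent.act(obs)   # agent never touches global state
        self.env.update(actions)
```

Registering a View per Agent class keeps the Env agnostic to who is observing it, while each agent only ever receives the observation it is "allowed" to see.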
Real Life
- Was thinking a little deeper about the nature of reality and what it means to physically be in an environment as an agent. There's the whole discussion about how an agent is merely a decision process that receives all of its information from the environment, but it doesn't really feel anything for itself. It's not really there, inside of that world, absorbing the available information through its senses. It's all facilitated; everything it feels is delivered entirely by the environment, outside of its own mind. This got me thinking more deeply about the real world and how exactly we feel:
- Nerves are an extension of the brain
- What does it mean to feel, and what is the right terminology? Does feeling have to occur in reality, requiring real physical presence? How can something be felt digitally? The facilitation of global-to-local state happens naturally in the real world, but must be done manually in a digital one
- The difference between sensation and feeling the presence of your body
- It's a matter of interpretation, i.e. what the brain does with those signals; that's the explanation for #3. Can those signals be replicated? Digitally, would that just mean delivering more info to the "brain"? Perhaps so, and it's really a question of what that brain does with the signals that constitutes "feeling"
- Phantom limbs
- Thousand Brains Theory
Implementing constraints
- Using a Rule object (?) to enforce logical restrictions on entity states
- Also, are entities part of the state? As in, you only need to change the entity object that is already embedded in the state to update it? I think this should be fine, so long as the updates are enforced properly (rough sketch below)
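As a rough sketch of what a Rule object might look like (hypothetical names, not a settled API): a rule is just a named predicate over a proposed entity state, and an update only commits when every registered rule passes. This also shows entities living directly inside the state, updated in place.

```python
class Rule:
    """A named predicate that a proposed entity state must satisfy."""
    def __init__(self, name, predicate):
        self.name = name
        self.predicate = predicate

    def check(self, entity_state):
        return self.predicate(entity_state)

def apply_update(state, entity_id, proposed, rules):
    """Commit the proposed entity state only if every rule holds; return violations."""
    violations = [r.name for r in rules if not r.check(proposed)]
    if not violations:
        state['entities'][entity_id] = proposed  # entity lives inside the state itself
    return violations

# Example: an entity's health may never go negative
no_negative_health = Rule('non_negative_health', lambda e: e.get('health', 0) >= 0)
state = {'entities': {'pacman': {'health': 3}}}
apply_update(state, 'pacman', {'health': -1}, [no_negative_health])  # -> ['non_negative_health']
```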