Environment registry

More Formal Approach

After more thought and an attempt to actually implement the ideas above, the following describes a refined approach to solving the agent-entity problem across envs and gyms. Overall, we are embracing a few things:

  • Full initialization of local details, i.e. telling the env and gym about more parameters on initialization; this is heavily related to the next item
  • Three stages of “registry”:
    1. Engrained defaults: default values specific to the particular env, “base structure”
    2. Custom initialization: parametrized base structure components, those things that impact the form of the env but are left to the client to specify, “parametrizable structure”
    3. Dynamic registry: components that come after the object is already initialized, likely unknown beforehand (otherwise would have been taken care of during initialization)

Env

Initialization: The Env class takes an options object that carries many more incoming values from the user than before. The Env looks for the following variables in this options object:

  • Parametrizable structure params: any parameters that the specific env expects to use to create the underlying environment structure. For example, width, height, and node_list are such parameters in the Grid environment. Note that these can be optional as needed; in the example, width and height are required, while node_list is optional.
  • entity_space: a dictionary of name: <Entity class> pairs for “telling” the Env about what entities are to be used and allowed. This also puts a user friendly string name to the object for use by other components, or general interpretability. Maybe eventually could be a Space object?
  • action_space: Space object defining possible actions (right now this is pretty much just a list)

Additionally, there are internal environment defaults set within initialization.

Overall, the initialization structure of an environment has a few main logical categories:

  • Local structure: structure specific to the env i.e. new variables introduced to manage new components/logic
    • Local default structure: default values engrained in the environment. These are fixed components of the underlying Env (think the number of properties and their names on a standard Monopoly board)
    • Local parametrized structure: structure local to the env that can be dictated by the client. These parameters can be required or optional (think again from the Monopoly perspective: we’d likely require the client provide the number of players, while making it optional that they specify the amount of starting cash to override a default).
  • Global structure: structure maintained by the specific env but common to all envs
    • Global default structure: default global attributes with values set according to the nature of the env. For example, in Monopoly this might be having an action space (something all envs must have) that only allows a single jump at each time step (or perhaps a jump of between 2-12 spaces, seeing as a tick would likely move the player according to their roll; or it could take a single jump, ignoring all other actions until the jump sequence is complete).
    • Global parametrized structure: global attributes that the client has some control over.
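
A minimal sketch of how the four categories above might show up in an `__init__`, using the Monopoly example. Everything here is an assumption for illustration: the `MonopolyEnv` class, the `num_players` and `starting_cash` option names, and the choice to represent the action space as a plain list.

```python
class MonopolyEnv:
    """Hypothetical env, used only to illustrate the four structure categories."""

    # Local default structure: fixed by the env, not configurable by the client.
    BOARD_SIZE = 40

    # Global default structure: every env has an action space; this env fixes
    # its own default (a jump of 2-12 spaces, matching a two-dice roll).
    DEFAULT_ACTION_SPACE = list(range(2, 13))

    def __init__(self, options):
        # Local parametrized structure (required): fail loudly if it is missing.
        if 'num_players' not in options:
            raise ValueError("'num_players' is required")
        self.num_players = options['num_players']

        # Local parametrized structure (optional): the client may override a default.
        self.starting_cash = options.get('starting_cash', 1500)

        # Global parametrized structure: common to all envs, controlled by the client.
        self.entity_space = dict(options.get('entity_space', {}))

        # Global default structure: enforced by this env, client input is not consulted.
        self.action_space = list(self.DEFAULT_ACTION_SPACE)


env = MonopolyEnv({'num_players': 4})
```

Making the required/optional split explicit in code (a check for required keys, `dict.get` with a default for optional ones) is what lets each env enforce its own rules.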

Here I’ve debated allowing a “creation” section inside of initialization, but I think it makes more sense to defer that entirely to the create() method, even for the client. This is the method where entities are actually created by the env, for the env. It uses only those entities registered on initialization. It takes a dict of <class name>: {<options>} pairs, where each options dict includes:

```
options = {
    'params': {},    # dictionary of class initialization parameters
    'count': <int>,  # number of <class name> objects with these parameters to create
    'group': <str>,  # internal group ID to assign this/these entities to
}
```

This is a fairly general and powerful creation scheme that allows the env to control the entity creation process internally, using recognized entity classes, while taking some of the load off the client by accepting things like count.
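
A rough sketch of what create() could look like under this scheme. The `SketchEnv` and `Cell` classes, the integer eid counter, and the choice to return the new eids are all assumptions made for the example, not settled parts of the design.

```python
from collections import defaultdict
from itertools import count


class Cell:
    """Hypothetical entity class for the example."""

    def __init__(self, x=0, y=0):
        self.x, self.y = x, y


class SketchEnv:
    def __init__(self, entity_space):
        self.entity_space = dict(entity_space)  # name -> entity class
        self.entities = {}                      # eid -> entity instance
        self.groups = defaultdict(list)         # group ID -> list of eids
        self._next_eid = count()                # assumed: eids are sequential ints

    def create(self, spec):
        """Create entities from a dict of <class name>: {<options>} pairs."""
        created = []
        for name, options in spec.items():
            # Only classes registered on initialization are recognized.
            if name not in self.entity_space:
                raise KeyError(f"unregistered entity class: {name!r}")
            cls = self.entity_space[name]
            params = options.get('params', {})
            for _ in range(options.get('count', 1)):
                eid = next(self._next_eid)
                self.entities[eid] = cls(**params)
                if 'group' in options:
                    self.groups[options['group']].append(eid)
                created.append(eid)
        return created


env = SketchEnv({'cell': Cell})
eids = env.create({'cell': {'params': {'x': 1}, 'count': 3, 'group': 'walls'}})
```

Rejecting unregistered names inside create() is what keeps class registration tied to initialization, while entity creation itself stays available at any time.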

Registry pattern: Note that these options and methods embrace the three registry stages in different ways:

  • Class specification:
    1. Engrained defaults: Env holds a base set of name: class pairs for default recognized entities.
    2. Parametrizable structure: takes the entity_space param on initialization, allowing the user to define their own entity class mapping. This essentially gets merged in with the default map.
    3. Dynamic classes: this does not exist by design. As of right now, I don’t intend (at least explicitly through the API) to allow dynamic entity class registration after the env has been initialized. If I encounter a situation where I really need to tell the Env about a new class of entity on the fly, I will reconsider.
  • Entity creation:
    1. Engrained default: there are base entities created in the Env on initialization. This may not be super common, but for envs like the Grid it is necessary. This is really just some built-in logic on initialization that calls create() in some way or another.
    2. Parametrizable structure: here things actually break down a little; I’ve dropped the original plan to accept a “creation dict” on initialization. Since the create() method is built for this, it makes more sense to let the client call that method immediately after initialization. So this pretty much merges with “dynamic registration”, since both stages use the same method and differ only in timing. It is perhaps worth rethinking.

Selections: selections are collections of entities. This makes it easy to recover all entities of a certain class, or those assigned to a custom selection. This is especially important when it comes to registering agents in the Gym, where you don’t want to have to mess with all the details of finding entities and creating agents for those entities manually. The “levels” of selection are:

  • eid: a reference to a single entity, the lowest level of selection, allows you to get any specific entity directly
  • group: a collection of one or more entities, can be useful when you want to associate many similar entities but there is no nice way to do so with one of the other selection types
  • class: identifies all entities of a certain type. Like eids, these are built into the nature of entities and do not require manual creation.

eids and entity class indexes are available by default, managed internally by the environment as entities are created through the create() method. Groups, however, are custom collections of entities that the user must specify, as seen in the options dict to create() above. These can be created by clients during their own entity registration, or internally by default for convenient use. The two main default groups are “all” and “default”, where “all” simply references all entities, and “default” contains those entities created by the environment on initialization (e.g. the cells in a Grid env).
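
To make the three selection levels concrete, here is a small sketch of the bookkeeping involved. The `Selections` class and its `add`/`select` methods are hypothetical names for illustration; in the design above this logic would live inside the env and be driven by create().

```python
from collections import defaultdict


class Selections:
    """Hypothetical bookkeeping for eid, class, and group selections."""

    def __init__(self):
        self.by_eid = {}                   # eid -> entity
        self.by_class = defaultdict(list)  # class name -> list of eids
        self.by_group = defaultdict(list)  # group ID -> list of eids

    def add(self, eid, entity, class_name, group=None):
        # eid and class indexes are maintained automatically for every entity.
        self.by_eid[eid] = entity
        self.by_class[class_name].append(eid)
        # The "all" group simply references every entity.
        self.by_group['all'].append(eid)
        # Any other group is opt-in, e.g. "default" for env-created entities.
        if group is not None:
            self.by_group[group].append(eid)

    def select(self, level, key):
        """Return the entities matching `key` at level 'eid', 'class', or 'group'."""
        if level == 'eid':
            return [self.by_eid[key]]
        index = {'class': self.by_class, 'group': self.by_group}[level]
        return [self.by_eid[eid] for eid in index.get(key, [])]


sel = Selections()
sel.add(0, 'cell-a', 'cell', group='default')
sel.add(1, 'cell-b', 'cell', group='default')
sel.add(2, 'robot', 'agent', group='team')
cells = sel.select('class', 'cell')
```

With this in place, registering agents for every entity in a group reduces to a single select call rather than a manual search.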

Indexes: indexes are an important part of efficiently managing entities within the env. Right now I intend to use them in an ad hoc way, but I could see them easily being formalized in a way similar to Views. Essentially you could tell the Env about entity attributes to track for a certain class, and create() would apply a function to the entity parameters/attributes to decide where to store that entity.
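
One way the formalized version might look, as a sketch only: an index is a key function plus a mapping from key to eids, and create() would call `add` for each new entity of the tracked class. The `AttributeIndex` name, its methods, and the dict-based entities are assumptions for the example.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical index: buckets eids by a key computed from each entity."""

    def __init__(self, key_fn):
        self.key_fn = key_fn              # function applied to each entity
        self.buckets = defaultdict(list)  # key -> list of eids

    def add(self, eid, entity):
        self.buckets[self.key_fn(entity)].append(eid)

    def lookup(self, key):
        # Return a copy so callers cannot mutate the index by accident.
        return list(self.buckets.get(key, []))


# Track grid position for cell-like entities (plain dicts stand in for entities).
by_position = AttributeIndex(lambda e: (e['x'], e['y']))
by_position.add(0, {'x': 0, 'y': 0})
by_position.add(1, {'x': 1, 'y': 0})
by_position.add(2, {'x': 1, 'y': 0})
occupants = by_position.lookup((1, 0))
```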

Base: The base env takes care of all parameters shared by any inheriting envs. This includes parameters like entity spaces which, while specific to the inheriting env, can be handled commonly by the base. Any default entities specific to a child env are to be added by the child class itself. Note that I briefly considered the idea of jamming everything into an options class to be parsed exclusively by the Base env, i.e. allowing even child-env-specific variables to be processed using a basic loop and setattr. While this is convenient, it is confusing (not explicit) and makes it non-trivial for me to enforce certain Env-specific variables as required or optional.

So the Base class is responsible for taking an options object and extracting from it an action_space and entity_space. These two are required parameters of all environments, and it makes sense to pass these up to the base class for initialization. Note that child classes will only make the call up to super after mixing this parametric structure in with the inheriting env defaults in a way consistent with that class’ restrictions. For example, inside of the inheriting Env __init__, we define a default entity_space and action_space (if there is one). We will then mix these defaults in with the entity space or action space coming in from the client. In the case of the entity space, it’s common to merge the two, while in the case of the action space, we may either allow the user’s input to override the default, or enforce the default entirely and discard the client’s input. Note that this decision is left to the inheriting env, and thus we do not include any “merging” or “replacement” logic of existing and input parameters inside of the base environment class.
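
The division of labour described above can be sketched as follows. The class names, the Grid action names, and the specific policy choices (merge the entity space, let the client override the action space) are assumptions picked for illustration; another env could just as well enforce its default action space and discard the client's.

```python
class BaseEnv:
    def __init__(self, options):
        # Both are required of every env; a KeyError here means the child
        # did not supply them. No merge or replacement logic lives in the base.
        self.action_space = options['action_space']
        self.entity_space = options['entity_space']


class Cell:
    """Hypothetical default entity class for the Grid."""


class GridEnv(BaseEnv):
    DEFAULT_ENTITY_SPACE = {'cell': Cell}
    DEFAULT_ACTION_SPACE = ['up', 'down', 'left', 'right']  # assumed action names

    def __init__(self, options):
        # Child-specific parametrized structure is handled explicitly here.
        self.width = options['width']               # required
        self.height = options['height']             # required
        self.node_list = options.get('node_list')   # optional

        # Entity space: merge client entries into the env defaults.
        entity_space = {**self.DEFAULT_ENTITY_SPACE,
                        **options.get('entity_space', {})}

        # Action space: this env lets client input override the default.
        action_space = list(options.get('action_space', self.DEFAULT_ACTION_SPACE))

        # Only now make the call up to the base class.
        super().__init__({'entity_space': entity_space,
                          'action_space': action_space})


class Robot:
    """Hypothetical client-defined entity class."""


env = GridEnv({'width': 3, 'height': 2, 'entity_space': {'robot': Robot}})
```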